The requests library for web scraping
How requests works
- requests is built on top of urllib3
- requests exposes the same API on Python 2 and Python 3, though current releases support Python 3 only
- requests is simple and easy to use
- requests automatically decompresses gzip-compressed page content
What requests is used for
Ways to handle encoding in requests
response.content.decode() # with no argument, decodes as UTF-8 by default
response.content.decode('gbk')
response.encoding = 'utf-8'
response.text
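Because response.content is an ordinary bytes object, the decode calls above behave exactly like bytes.decode. The sketch below illustrates this without any network access. It assumes a hypothetical GBK-encoded body standing in for response.content, and the helper name decode_body and its fallback order are illustrative only, not part of requests.

```python
# Hypothetical page body: bytes such as a GBK-encoded site might return
# in response.content.
body = "百度一下".encode("gbk")


def decode_body(raw, encodings=("utf-8", "gbk")):
    """Try each candidate encoding in order; return (text, encoding_used).

    UTF-8 goes first because it rejects most non-UTF-8 input, while GBK
    accepts almost any byte sequence and would hide a wrong guess.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


text, used = decode_body(body)
print(used, text)
```

With a real response, the same helper could be called as decode_body(response.content). When the site's encoding is already known, calling response.content.decode('gbk') directly is simpler.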
Difference between response.text and response.content
- response.text
  - Type: str
  - Set the encoding used for decoding: response.encoding = 'utf-8'
- response.content
  - Type: bytes
  - Decode explicitly: response.content.decode('utf-8')
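The difference can be checked offline with a hand-built Response object. This is only a sketch: assigning the private _content attribute is a testing shortcut rather than normal usage, and the GBK sample body is made up for illustration.

```python
import requests

# Build a Response by hand so the sketch runs without network access.
# Assumption: setting the private _content attribute stands in for a
# body that would normally arrive from the server.
response = requests.Response()
response._content = "中文网页".encode("gbk")  # hypothetical GBK-encoded body

raw = response.content      # bytes, exactly as received
response.encoding = "gbk"   # tells requests how to decode .text
text = response.text        # str, decoded using response.encoding

print(type(raw).__name__, type(text).__name__)
print(text == raw.decode("gbk"))
```

In other words, response.text is response.content decoded with whatever response.encoding is set to, so both routes give the same string when the encoding is correct.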
Sending a simple request
response = requests.get(url)
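Commonly used response attributes:
response.text # page source as a str
response.content # page source as raw bytes
response.status_code # HTTP status code
response.request.headers # headers sent with the request
response.headers # headers returned by the server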
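The attributes above could be gathered in one place roughly as follows. The function names summarize and fetch_summary, the chosen header keys, and the 10-second timeout are assumptions made for this sketch, not something the notes prescribe.

```python
import requests


def summarize(response):
    """Collect a few commonly inspected attributes into a plain dict."""
    return {
        "status_code": response.status_code,
        # Response headers, as returned by the server.
        "content_type": response.headers.get("Content-Type"),
        # Request headers, as actually sent by requests.
        "user_agent": response.request.headers.get("User-Agent"),
        "body_length": len(response.content),
    }


def fetch_summary(url, timeout=10):
    """Send a GET request and summarize the response (needs network access)."""
    response = requests.get(url, timeout=timeout)
    return summarize(response)
```

Calling fetch_summary('https://www.baidu.com') performs a real request. Header lookups are case-insensitive, so 'content-type' would work as well.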
Downloading an image
import requests
response = requests.get('https://www.baidu.com/img/bd_logo1.png?where=super')
with open('baidu.png', 'wb') as f:
    f.write(response.content)
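A slightly more defensive variant of the download is sketched below. It assumes you want a failed request to raise an error instead of silently saving an error page as the image. The function name save_image and the timeout value are illustrative.

```python
import requests


def save_image(url, path, timeout=10):
    """Download url and write the raw bytes to path; return the byte count."""
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses before touching the file.
    response.raise_for_status()
    with open(path, "wb") as f:
        # response.content is bytes, which matches the binary "wb" mode.
        return f.write(response.content)
```

For example, save_image(url, 'baidu.png') with the logo URL from the notes would write the file and return its size in bytes.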