Learning the Python requests library

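Preface: Whether we are writing penetration-testing tools or crawlers that download files, we will almost always end up using the requests library. It is fair to say that requests is a library every penetration tester should know, so this section walks through how to use it.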

1.1 Basic usage

import requests
response = requests.get('http://www.mrbird.love/')
print(response.status_code)
response.encoding = 'utf-8'
print(response.text)

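Once requests is imported, the library is ready to use.

requests.get('http://www.mrbird.love/') sends a GET request to the target site and returns a Response object representing the server's response.

response.status_code is the HTTP status code of the response.

response.text is the response body decoded into a string.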

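If the returned content contains Chinese and shows up as garbled characters, tell requests which character encoding to use when decoding the body. Set it before reading response.text:

response.encoding = 'utf-8'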
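
As a rough illustration of why the encoding matters, the sketch below decodes the same UTF-8 bytes twice using only the standard library. The sample string is made up for this example. ISO-8859-1 is chosen as the wrong encoding because, as far as I know, requests falls back to it for text responses whose headers do not name a charset.

```python
# Hypothetical response body: Chinese text encoded as UTF-8 bytes.
raw = "你好，世界".encode("utf-8")

# Decoding with the wrong charset does not fail, it silently produces mojibake.
garbled = raw.decode("iso-8859-1")

# Decoding with the charset the server actually used recovers the text.
correct = raw.decode("utf-8")

print(garbled)
print(correct)
```

Setting response.encoding = 'utf-8' has the same effect on response.text as choosing the second decode call here.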

To save the response body to a file:

import requests
response = requests.get('http://www.mrbird.love/')
response.encoding = 'utf-8'
content = response.text
with open('./test.html', 'w', encoding='utf-8') as f:
    f.write(content)

Because the file is opened in a with statement, it is closed automatically once the write finishes.
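
The closing behaviour can be checked without any network access. This is a minimal sketch: the content string stands in for response.text, and the temporary directory is only there so the example does not touch the working directory.

```python
import os
import tempfile

# Hypothetical content standing in for response.text.
content = "<html><body>测试</body></html>"

# Write into a temporary directory instead of ./test.html.
path = os.path.join(tempfile.mkdtemp(), "test.html")
with open(path, "w", encoding="utf-8") as f:
    f.write(content)

# The file object still exists here, but it has already been closed.
print(f.closed)

# Reading the file back with the same encoding returns the original text.
with open(path, encoding="utf-8") as f:
    saved = f.read()
```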

1.2 Using an API endpoint

import requests
word = input("Enter the word: ")
data = {
    'from': 'zh',
    'to': 'en',
    'query': word,
    'transtype': 'realtime',
    'simple_means_flag': 3,
    'sign': '304656.17697',
    'token': 'aba4817c108ad47ef420ecd41ba79548',
    'domain': 'common',
    'ts': 1709359251108
}
response = requests.post("https://fanyi.baidu.com/v2transapi?from=zh&to=en", data=data)
json_data = response.json()
print(json_data)

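This request no longer works because of Baidu's anti-crawling measures, but it is still useful as a learning example. The response body is JSON in which Chinese characters are written as Unicode escape sequences, so we call response.json() to parse it into Python objects with the text decoded.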
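
Since the endpoint itself can no longer be called, here is an offline sketch of the two ideas in this section, using only the standard library. urlencode shows roughly what a form body built from a dict looks like (requests form-encodes a dict passed as data= in a similar way), and json.loads performs the same kind of parsing that response.json() does. The field subset and the JSON payload shape are invented for illustration and are not Baidu's real response format.

```python
import json
from urllib.parse import urlencode

# A reduced, hypothetical version of the form fields.
form = {"from": "zh", "to": "en", "query": "你好"}

# Form-encode the dict; non-ASCII values are percent-encoded as UTF-8.
body = urlencode(form)
print(body)

# A made-up JSON body with Chinese written as \uXXXX escape sequences.
raw_json = '{"result": {"src": "\\u4f60\\u597d", "dst": "hello"}}'

# Parsing turns the escapes back into real characters.
parsed = json.loads(raw_json)
print(parsed["result"]["src"], "->", parsed["result"]["dst"])
```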

1.3 Downloading an image through an API

import requests
url = "https://api.vvhan.com/api/acgimg"
response = requests.get(url)
# print(response.text)
with open("./pic.jpg", "wb") as f:
    f.write(response.content)

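Here we use .content rather than .text. .content returns the raw bytes of the response body, which suits non-text data. .text decodes the body into a string using the encoding derived from the response headers, which suits human-readable text.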

'wb' opens the file for writing in binary mode, which is what image data needs.
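
To see why binary data should not go through text decoding, the sketch below pushes a few bytes resembling the start of a JPEG file through a decode and re-encode round trip. The four-byte sample and the choice of UTF-8 are assumptions made for the example. The point is only that bytes which are not valid text get replaced during decoding and cannot be recovered.

```python
# The first bytes of a typical JPEG file (used here as sample binary data).
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0])

# Treating the bytes as text: invalid sequences become U+FFFD replacement characters.
as_text = jpeg_header.decode("utf-8", errors="replace")

# Encoding the text again does not give the original bytes back.
round_trip = as_text.encode("utf-8")

print(jpeg_header)
print(round_trip)
print(round_trip == jpeg_header)
```

Writing response.content in 'wb' mode avoids this loss because the bytes are never decoded.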

1.4 Downloading a video through an API

import requests
from bs4 import BeautifulSoup
url = "http://tucdn.wpon.cn/api-girl/index.php"
response = requests.get(url)
text = response.text
soup = BeautifulSoup(text, "html.parser")
src = "http:"+soup.find("video").get("src")
content = requests.get(src)
with open("1.mp4", "wb") as f:
    f.write(content.content)

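The API returns a video tag.

We use BeautifulSoup to read the src attribute of that tag, prepend the scheme to get the full video URL, then request that URL and save the result.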
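
Prepending "http:" by hand works only while the src value starts with "//". A slightly more forgiving option is urljoin from the standard library, sketched below. The three src values are hypothetical and are not taken from the real API response.

```python
from urllib.parse import urljoin

# The page the video tag was found on.
page_url = "http://tucdn.wpon.cn/api-girl/index.php"

# Hypothetical src values in the three forms a page might use.
protocol_relative = "//example.com/videos/1.mp4"
absolute = "https://example.com/videos/2.mp4"
relative = "videos/3.mp4"

# urljoin resolves each form against the page URL.
url_a = urljoin(page_url, protocol_relative)
url_b = urljoin(page_url, absolute)
url_c = urljoin(page_url, relative)

print(url_a)
print(url_b)
print(url_c)
```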

1.5 Using XPath

import os
import requests
from lxml import etree
url = 'https://www.qqtn.com/tp/dmtp_1.html'
res = requests.get(url)
res.encoding = 'gbk'  # decode the page as GBK
tree = etree.HTML(res.text)
lis = tree.xpath('/html/body/div[5]/div[1]/ul/li')
os.makedirs('./pic', exist_ok=True)  # open() fails if the folder is missing
for li in lis:
    name = li.xpath('./a/img/@alt')
    src = li.xpath('./a/img/@src')
    with open(f'./pic/{name[0]}.png', 'wb') as f:
        f.write(requests.get(src[0]).content)

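etree.HTML(res.text) parses the HTML and returns the root element of the resulting lxml tree. tree.xpath(...) then returns a list of every element matching the XPath expression, so lis holds the matching li elements. For each of them, ./a/img/@alt and ./a/img/@src return the matching attribute values as lists, so we take the first item of each and download the image.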
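
The same XPath calls can be tried offline on a small inline document. The HTML below is invented for the example and is much simpler than the real page; it only mirrors the li > a > img nesting that the expressions rely on.

```python
from lxml import etree

# Hypothetical HTML with the same li > a > img nesting as the target page.
html = """
<html><body><ul>
  <li><a href="/1"><img alt="cat" src="https://example.com/cat.png"></a></li>
  <li><a href="/2"><img alt="dog" src="https://example.com/dog.png"></a></li>
</ul></body></html>
"""

# etree.HTML returns the root element, not a list.
tree = etree.HTML(html)
print(tree.tag)

# xpath() on an element returns a list of matches.
items = tree.xpath("//ul/li")

# Attribute queries also return lists, so take the first entry of each.
pairs = [(li.xpath("./a/img/@alt")[0], li.xpath("./a/img/@src")[0]) for li in items]
print(pairs)
```

A relative expression such as //ul/li tends to survive small layout changes better than an absolute path like /html/body/div[5]/div[1]/ul/li, though either works while the page structure stays the same.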

Posted 2024-03-02 22:00 by 折翼的小鸟先生
