Python requests Module: Example Code

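requests is a simple, elegant HTTP library for Python, used to send HTTP requests and extract the needed information from the responses. The request URL and its parameters can usually be found in the browser's developer tools (F12), under the Network tab with the Fetch/XHR filter selected. This article mainly collects example code for the requests module; for an introduction, see the RUNOOB tutorial "Python requests 模块" and the "Quickstart" page of the Python requests documentation. The examples follow.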
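Before the examples, it may help to see how requests turns a `params` dict into the query string that appears under the Network tab. The following is a minimal sketch, not part of the original examples; the endpoint `https://example.com/search` and the parameter names are placeholders chosen for illustration. It builds a request without sending it, so it runs offline.

```python
import requests
from urllib.parse import urlsplit, parse_qs

# Hypothetical endpoint and parameters, used only to show URL construction.
url = 'https://example.com/search'
params = {'query': 'python requests', 'page': 1}

# Build the request without sending it; prepare() encodes params into the URL.
prepared = requests.Request('GET', url, params=params).prepare()
print('Method:', prepared.method)
print('Final URL:', prepared.url)

# Decode the query string again to confirm what the server would receive.
decoded = parse_qs(urlsplit(prepared.url).query)
print('Decoded query:', decoded)
```

The same `prepared.url` value is what `r.url` reports in the examples below after a request has actually been sent.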

 

01. Sogou search results

import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36' }
url = 'https://www.sogou.com/web'
kw = input('Enter a keyword:')
params = {'query': kw}

r = requests.get(url=url, headers=headers, params=params)
page_text = r.text
with open('sogou.html', 'w', encoding='utf-8') as fp:
    fp.write(page_text)
print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response status: ', r.status_code)
print('Over')

The output is shown in the figure below.
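The example above prints the status code but does not act on it. A slightly more defensive variant, sketched here under the assumption that a 10-second timeout is acceptable, wraps the GET in a helper (`fetch_text` is a hypothetical name) that raises on HTTP errors. The second half demonstrates `raise_for_status()` offline on a hand-built `Response` object rather than on a real server reply.

```python
import requests


def fetch_text(url, params=None, headers=None, timeout=10):
    """GET a page and return its text, raising on HTTP errors or timeouts."""
    r = requests.get(url, params=params, headers=headers, timeout=timeout)
    r.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
    return r.text


# Offline illustration: a hand-built Response with an error status code.
resp = requests.Response()
resp.status_code = 404
try:
    resp.raise_for_status()
    failed = False
except requests.HTTPError:
    failed = True
print('HTTP error raised:', failed)
```

In the Sogou example, `fetch_text(url, params=params, headers=headers)` could stand in for the `requests.get` call and the `r.text` line.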

02. Baidu Translate

import requests
import json

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'}
kw = input('Enter a keyword:')
data = {'kw': kw}
url = 'https://fanyi.baidu.com/sug'

r = requests.post(url=url, headers=headers, data=data)
json_data = r.json()
with open('baidu-fanyi.json', 'w', encoding='utf-8') as fp:
    json.dump(json_data, fp=fp, ensure_ascii=False)

print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response json data: ', json_data)
print('Over')

The output is shown in the figure below.

03. Douban movie rankings

import requests
import json

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'}
params = {
    # base params
    'interval_id': '100:90',
    'action': '',
    # other params
    'type': '24',   # movie type
    'start': '0',   # start index
    'limit': '5',   # maximum number of movies returned
}
url = 'https://movie.douban.com/j/chart/top_list'

r = requests.get(url=url, headers=headers, params=params)
json_data = r.json()
with open('douban-movie-toplist.json', 'w', encoding='utf-8') as fp:
    json.dump(json_data, fp=fp, ensure_ascii=False)
print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response json data: ', json_data)
print('Over')

The output is shown in the figure below.
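The `start` and `limit` parameters above suggest a way to page through the ranking. The sketch below assumes that `start` is an item offset, so page n begins at n × limit; the source comments only call it a "start index", and the helper name `page_params` is hypothetical. The requests are built but not sent.

```python
import requests

URL = 'https://movie.douban.com/j/chart/top_list'


def page_params(page, page_size=5, movie_type='24'):
    """Return query params for a zero-based page index (assumed offset paging)."""
    return {
        'interval_id': '100:90',
        'action': '',
        'type': movie_type,
        'start': str(page * page_size),  # assumption: start is an item offset
        'limit': str(page_size),
    }


# Build (but do not send) the requests for the first three pages.
for page in range(3):
    prepared = requests.Request('GET', URL, params=page_params(page)).prepare()
    print(prepared.url)
```

To fetch the pages for real, pass `params=page_params(page)` to `requests.get` together with the `headers` dict from the example above.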

04. KFC store information

import requests
import json

cityname, kw = '北京', '中关村'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'}
data = {
    'cname': cityname, 
    'pid': '',
    'keyword': kw,   
    'pageIndex': '1',
    'pageSize': '10',
}
url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'

r = requests.post(url=url, headers=headers, data=data)
json_data = r.json()
with open('KFC-storelist.json', 'w', encoding='utf-8') as fp:
    json.dump(json_data, fp=fp, ensure_ascii=False)
print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response json data: ', json_data)
print('Over')

The output is shown in the figure below.

05. Sina and Tencent real-time stock data

import requests

stocklist = ['sh600000','sz000001']
keystr = ','.join(stocklist)

# Get sina stock spot data
print('=' * 30, 'sina', '='*30)
headers = {'referer': 'https://finance.sina.com.cn'}
url = 'https://hq.sinajs.cn/list=%s' % keystr
r = requests.get(url=url, headers=headers)
page_text = r.text
print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response text data: ') 
print(page_text)

# Get tencent stock spot data
print('=' * 30, 'tencent', '='*30)
url = 'https://qt.gtimg.cn/q=%s' % keystr 
r = requests.get(url=url)
page_text = r.text
print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response text data: ')
print(page_text)
print('Over')

The output is shown in the figure below.
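Both endpoints return plain text rather than JSON, so the text has to be split by hand. The sketch below assumes each line is a JavaScript-style assignment of the form `name="field,field,...";` (with `~` as the separator for Tencent); that layout is an assumption to verify against the text printed above. The sample string uses placeholder values, not real quotes, and `parse_quotes` is a hypothetical helper.

```python
import re

# Assumed shape: one `<name>="<fields>";` assignment per line.
ASSIGNMENT = re.compile(r'(\w+)="([^"]*)"')


def parse_quotes(page_text, sep=','):
    """Map each variable name to its list of fields, split on `sep`."""
    return {name: body.split(sep) for name, body in ASSIGNMENT.findall(page_text)}


# Hypothetical sample in the assumed format (placeholder values only).
sample = (
    'var hq_str_sh600000="NAME,1.00,2.00";\n'
    'var hq_str_sz000001="OTHER,3.00,4.00";\n'
)
quotes = parse_quotes(sample)
for name, fields in quotes.items():
    print(name, fields)
```

For the Tencent response, the same helper could be called as `parse_quotes(page_text, sep='~')` if the assumed separator holds.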

06. Eastmoney stock popularity ranking (top 100)

import requests

payload = {
    'appId': 'appId01',
    'globalId': '786e4c21-70dc-435a-93bb-38',
    'marketType': '',
    'pageNo': 1,
    'pageSize': 100,
}
url = 'https://emappdata.eastmoney.com/stockrank/getAllCurrentList'

r = requests.post(url, json=payload)
json_data = r.json()
print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response json data: ', json_data)
print('Over')

The output is shown in the figure below.
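This example passes the payload with `json=`, whereas examples 02 and 04 use `data=`. The sketch below shows the difference by preparing both kinds of request without sending them; the endpoint `https://example.com/api` is a placeholder and the payload is a shortened version of the one above.

```python
import json

import requests

url = 'https://example.com/api'  # hypothetical endpoint, never contacted
payload = {'pageNo': 1, 'pageSize': 100}

# data= sends a form-encoded body; json= serializes the dict as JSON.
form_req = requests.Request('POST', url, data=payload).prepare()
json_req = requests.Request('POST', url, json=payload).prepare()

print('data= ->', form_req.headers['Content-Type'], '|', form_req.body)
print('json= ->', json_req.headers['Content-Type'], '|', json_req.body)

# The JSON body round-trips back to the original dict.
decoded = json.loads(json_req.body)
```

Which form a server expects can usually be read from the request's Content-Type header in the browser's Network tab.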

07. Xueqiu SPSIOP stock price

import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'}
params = {'symbol': '.SPSIOP', 'detail': 'extend'}
url = 'https://stock.xueqiu.com/v5/stock/quote.json'

# 1. Create Session instance to get cookie automatically
session = requests.Session()
# 2. Get xueqiu.com cookie
session.get('https://xueqiu.com', headers=headers)
# 3. Get request with the cookie
r = session.get(url, headers=headers, params=params)
json_data = r.json()
print('Request URL: ', r.url)
print('Request Type: ', r.request)
print('Response json data: ', json_data)
print('Over')

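Note: Xueqiu requires a cookie when these pages are requested; without one it returns the error message '遇到错误,请刷新页面或者重新登录帐号后再试' ("An error occurred; please refresh the page or log in again and retry"). The code therefore uses a requests.Session object: it first visits the Xueqiu home page (https://xueqiu.com) to obtain the cookie, which the session stores automatically, and then requests the target URL (https://stock.xueqiu.com/v5/stock/quote.json) to get the required data.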

The output is shown in the figure below.
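The cookie handling described in the note can be observed without touching the network. In this sketch a cookie is placed in the session by hand, standing in for the one a site would set on the first visit; the cookie name, value, and domain are all hypothetical. Preparing a later request through the same session shows the cookie being attached automatically.

```python
import requests

session = requests.Session()

# Stand-in for a cookie set by a site's home page
# (hypothetical name, value, and domain).
session.cookies.set('demo_token', 'abc123', domain='example.com', path='/')

# prepare_request() merges session state, including cookies, into the request.
req = requests.Request('GET', 'https://example.com/data')
prepared = session.prepare_request(req)
print('Cookie header:', prepared.headers.get('Cookie'))

# A plain Request prepared outside the session carries no cookie.
bare = req.prepare()
print('Without session:', bare.headers.get('Cookie'))
```

This is why the second `session.get` call in the example succeeds where a standalone `requests.get` to the same URL would not carry the cookie.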

Supplement 1 (updated 2023-06-24)

The Jupyter notebook source for this article can be downloaded from: https://github.com/klchang/python-requests-examples

 


 

posted @ 2023-03-30 07:27  klchang