13、爬虫之协程

1 协程

首先我们需要知道的是requests是同步的方法。而我们若想使用协程，写的方法都尽量不是使用同步的方法。
因些我们，选择使用一个新的模块库`aiohttp

1.1 安装

pip install aiohttp

1.2 快速开始

import aiohttp
import asyncio

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('http://python.org') as response:

            print("Status:", response.status)
            print("Content-type:", response.headers['content-type'])

            html = await response.text()
            print("Body:", html[:15], "...")

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

1.3 clientSession的使用

header设置

headers={"Authorization": "Basic bG9naW46cGFzcw=="}
async with aiohttp.ClientSession(headers=headers) as session:
    async with session.get("http://httpbin.org/headers") as r:
        json_body = await r.json()
        assert json_body['headers']['Authorization'] == \
            'Basic bG9naW46cGFzcw=='

Cookie的使用

url = 'http://httpbin.org/cookies'
cookies = {'cookies_are': 'working'}
async with ClientSession(cookies=cookies) as session:
    async with session.get(url) as resp:
        assert await resp.json() == {
           "cookies": {"cookies_are": "working"}}

Proxy的使用

async with aiohttp.ClientSession() as session:
    async with session.get("http://python.org",
                           proxy="http://proxy.com") as resp:
        print(resp.status)

1.4 Response的使用

属性
- status 状态码
- url url地址
- cookies cookie内容
- headers 响应头
方法
- read() 读bytes类型
- text() 读文本内容

posted @ 2022-02-26 19:39 齐天_大圣阅读(134) 评论(0) 收藏举报

刷新页面返回顶部

13、爬虫之协程

1 协程

1.1 安装

1.2 快速开始

1.3 clientSession的使用

1.4 Response的使用

公告