Crawling a website with Python's requests library

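Use the get() function of the requests library to access the website below 20 times. Print the response status and the text content, then compute the length of the page content returned by the text attribute and by the content attribute. (Note that text and content are attributes of the response, not methods.)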

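import requests

def gethtml():
    url = "https://www.sogou.com/"
    try:
        r = requests.get(url, timeout=30)  # wait at most 30 seconds for the server to respond
        r.raise_for_status()  # raise requests.HTTPError if the status is 4xx or 5xx
        r.encoding = 'utf-8'  # decode r.text as UTF-8
        print('status = {}\n'.format(r.status_code))  # integer status code; 200 means success, 404 means not found
        # print('status OK') if r.status_code == 200 else print('status abnormal')
        print('text content:\n', r.text)
        print('\ntext length: {}\ncontent length: {}\n'.format(len(r.text), len(r.content)))
    except requests.RequestException as e:
        # catch only requests errors, and print the message instead of returning a value nobody reads
        print('Error!', e)

for i in range(20):
    gethtml()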
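
The two lengths printed above usually differ, because r.text is a decoded str (its length counts characters) while r.content is the raw bytes (its length counts bytes). The sketch below shows the same effect without any network access. The sample string is made up for illustration and is not the real sogou.com response; it simply assumes a UTF-8 encoded page that contains Chinese characters.

```python
# Illustrative sample only; not taken from the real sogou.com response.
sample_text = "<title>搜狗搜索</title>"

# r.content is the raw response body as bytes;
# r.text is those bytes decoded to str using r.encoding.
content = sample_text.encode("utf-8")  # stands in for r.content
text = content.decode("utf-8")         # stands in for r.text with r.encoding = 'utf-8'

char_count = len(text)     # number of Unicode characters
byte_count = len(content)  # number of bytes

print("text length:", char_count)
print("content length:", byte_count)
```

Each Chinese character here takes 3 bytes in UTF-8 but counts as a single character, so the content length is larger than the text length. For a page that contains only ASCII characters the two numbers would be equal.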

Posted 2020-12-14 14:14 by 英魂