Archive: 02 2021
Summary: This is a post that shares meme images; I scraped the images out of it.
import requests
from lxml import etree
import os
url = 'https://www.zhihu.com/question/329525297/answer/1449023611'
headers = {'Use
Summary: Explicit wait
import time
from selenium import webdriver
# Raw string so the backslashes in the Windows path are not read as escapes
driver = webdriver.Chrome(executable_path=r'C:\Program Files\Google\Chrome\Application\chromedriver.exe')
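An explicit wait polls for a condition until it holds or a deadline passes, instead of sleeping for a fixed time. Selenium offers this through `WebDriverWait(driver, timeout).until(condition)`. The sketch below is not Selenium code: it is a small pure-Python stand-in that shows the same polling idea without needing a browser. The names `wait_until` and `element_is_ready` are hypothetical and made up for this illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose element only shows up on the third poll.
attempts = {"count": 0}


def element_is_ready():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


found = wait_until(element_is_ready, timeout=2.0, poll_interval=0.01)
```

The wait returns as soon as the condition is met, so a fast page costs almost nothing, while a slow page is given up on only after the full timeout.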
Summary:
import time
from selenium import webdriver
# Raw string so the backslashes in the Windows path are not read as escapes
driver = webdriver.Chrome(executable_path=r'C:\Program Files\Google\Chrome\Application\chromedriver.exe')
# dri
Summary: Scraping post titles and links from Tieba
import requests
from lxml import etree
class Tieba(object):
    def __init__(self, name):
        self.url = "https://tieba.baidu.com/f?kw={}&ie=utf-8&pn=
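Summary: Today was frustrating: I could not figure out how to install the lxml package. The basics matter, and so does practice. I am off to install the package first.
from lxml import etree
# text = '''<div><div>'''
# html = etree.HTML(text)
# ret_list = html.xpath("XPath syntax rule string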
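To make the "XPath rule string" idea concrete, here is a minimal sketch. It deliberately uses the standard library's `xml.etree.ElementTree` rather than lxml, so it runs without installing anything; ElementTree supports only a limited subset of XPath, whereas lxml's `xpath()` accepts full XPath 1.0 expressions such as `//a/@href`. The markup below is hypothetical and made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup made up for this example.
text = """
<div>
  <ul>
    <li class="item"><a href="link1.html">first</a></li>
    <li class="item"><a href="link2.html">second</a></li>
  </ul>
</div>
"""

root = ET.fromstring(text)

# ".//a" selects every <a> element anywhere below the root.
links = root.findall(".//a")
titles = [a.text for a in links]
hrefs = [a.get("href") for a in links]

# "[@class='item']" keeps only elements carrying that attribute value.
items = root.findall(".//li[@class='item']")
```

Once lxml is installed, the same selections can be written against `etree.HTML(text)`, which also tolerates the unclosed tags found in real pages.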
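Summary:
# jsonpath
'''
jsonpath extracts data in bulk from a Python dict by key
File -> Settings -> Project -> Project Interpreter -> + -> search
from jsonpath import jsonpath
ret = jsonpath(a, 'jsonpath syntax rule string')
$ root node
. child node
.. any inner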
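The `..` operator is the one that does the bulk extraction: it looks for a key at any depth. The sketch below is a hand-written approximation of what a query like `$..price` asks for, written in plain Python so it does not depend on the jsonpath package. It is not the library's implementation; the function name `find_all` and the sample dict `a` are hypothetical.

```python
def find_all(data, key):
    """Collect every value stored under `key`, at any nesting depth."""
    found = []
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                found.append(v)
            # Keep descending: the value itself may contain the key again.
            found.extend(find_all(v, key))
    elif isinstance(data, list):
        for item in data:
            found.extend(find_all(item, key))
    return found


# Hypothetical nested data made up for this example.
a = {
    "store": {
        "book": [
            {"title": "A", "price": 10},
            {"title": "B", "price": 20},
        ],
        "bicycle": {"price": 100},
    }
}

prices = find_all(a, "price")
titles = find_all(a, "title")
```

With the real package, the equivalent call would follow the form shown in the summary, `jsonpath(a, '$..price')`.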
Summary:
import requests
import re
def login():
    # session
    session = requests.session()
    # headers
    session.headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0;
Summary: Using the timeout parameter
import requests
url = 'https://twitter.com'
response = requests.get(url, timeout=3)
Proxy IP
url = 'https://www.baidu.com'
# response = request
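Summary:
'''
Common response object attributes and methods
response.url  URL of the response
response.status_code  status code of the response
response.request.headers  headers of the request that produced the response
response.headers  response headers
response.request.cookies  the corresponding request's cooki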