Archive: September 2018

Scraping Baidu Wenku courseware with Python

Summary: Libraries used: re, selenium, requests. Source code:
from selenium import webdriver
import re
import requests
def open_img(items):
    for item in items:
        item = re.sub('&','&',it…

posted @ 2018-09-17 15:41 vlj Views (504) Comments (0) Recommendations (0)

Steps and tools for a Python crawler

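Summary:
# Four steps
1. Inspect the source format of the content to crawl. The content can be URLs (links), text, images, or video.
2. Request the page source (this may require setting up a proxy, rate limiting, or cookies).
3. Match the target content with regular expressions.
4. Save the data (file operations).
# Two basic tools (libraries)
1. urllib
2. requests
# Using the requests library…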

posted @ 2018-09-03 19:37 vlj Views (649) Comments (0) Recommendations (0)
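
The four steps listed in the summary above can be sketched roughly as follows. This is only a minimal illustration using the standard-library urllib, not the post's own code (the full post is not shown here). The function names fetch, extract_links, and save_lines, the LINK_RE pattern, and the default delay are all hypothetical choices made for this example. It assumes the content of interest is the links on a page.

```python
import re
import time
import urllib.request
from pathlib import Path

# Step 1 (done by hand): inspect the page source to see how the target
# content is marked up. Here we assume links of the form <a href="...">.
# LINK_RE is a hypothetical pattern chosen for this sketch.
LINK_RE = re.compile(r'<a\s[^>]*href="([^"]+)"', re.IGNORECASE)


def fetch(url, cookie=None, proxy=None, delay=1.0):
    """Step 2: request the page source, optionally with a cookie and proxy.

    The fixed sleep before each request is a crude form of rate limiting.
    """
    headers = {"User-Agent": "Mozilla/5.0"}
    if cookie:
        headers["Cookie"] = cookie
    handlers = []
    if proxy:
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener = urllib.request.build_opener(*handlers)
    request = urllib.request.Request(url, headers=headers)
    time.sleep(delay)
    with opener.open(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(html):
    """Step 3: match the target content with a regular expression."""
    return LINK_RE.findall(html)


def save_lines(lines, path):
    """Step 4: save the extracted data to a text file, one item per line."""
    Path(path).write_text("\n".join(lines), encoding="utf-8")
```

A typical call sequence would be save_lines(extract_links(fetch(url)), "links.txt"), where url is whatever page you inspected in step 1. The same structure carries over to the requests library by swapping the body of fetch.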
