Post category - Web crawlers

Learning and practicing web crawlers
Summary: 1. Install Scrapy. 2. Open CMD and run: scrapy. It displays:
Scrapy 1.8.0 - no active project
Usage:
  scrapy <command> [options] [args]
Available commands:
  bench         Run quick benchmark
...
posted @ 2020-02-05 06:45 myrj Views (507) Comments (0) Recommendations (0)
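Summary: Chinese-language help. Change into the working folder, then:
1. scrapy startproject mingzi  # create a crawler project
2. scrapy genspider -t crawl ygdy8 ygdy8.com  # create a specific spider: ygdy8 is the spider name, ygdy8.com is the allowed domain, so crawling stays within that domain
3. scrapy cra ...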
posted @ 2020-02-01 15:41 myrj Views (239) Comments (0) Recommendations (0)
Summary:
import random
import requests
def get_htmla(url):
    aui=0
    while aui==0:
        try:
            header={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537. ...
posted @ 2020-01-30 20:45 myrj Views (794) Comments (0) Recommendations (0)
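
The get_htmla excerpt above is cut off, but it appears to loop until a request succeeds while sending a browser-like User-Agent header. Below is a minimal sketch of that idea, not the author's actual code. The USER_AGENTS pool, the function name get_html, and the parameters get, max_tries, and delay are all assumptions made for illustration. Unlike the original while loop, this version gives up after a fixed number of attempts.

```python
import random
import time

# Hypothetical pool of User-Agent strings; the original list is not shown in the excerpt.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/79.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0",
]


def get_html(url, get=None, max_tries=3, delay=1.0):
    """Return the page text, retrying with a fresh random User-Agent on failure."""
    if get is None:
        import requests  # third-party; imported lazily so the sketch loads without it
        get = requests.get
    last_error = None
    for attempt in range(max_tries):
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        try:
            response = get(url, headers=headers, timeout=10)
            response.raise_for_status()  # treat HTTP error codes as failures too
            return response.text
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < max_tries - 1:
                time.sleep(delay)  # brief pause before the next attempt
    raise RuntimeError(f"giving up on {url} after {max_tries} tries") from last_error
```

The get parameter exists only so the retry logic can be exercised without network access; in normal use it can be left out and requests.get is used.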
Summary:
import requests
import re
txt='<a href="https://www.vgirls.com/13404.html" class="list-title text-md h-2x" target="_blank">想把夏日的阳光寄给冬日的你</a>'
urla=re.fi ...
posted @ 2020-01-30 19:56 myrj Views (2036) Comments (0) Recommendations (0)
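
The excerpt above stops at urla=re.fi, which is presumably the start of a re.findall call on the sample anchor tag. The following sketch shows one way to pull the link URL and link text out of that string. The pattern and the names LINK_RE and links are assumptions, since the original expression is not visible.

```python
import re

# Sample anchor tag taken from the excerpt.
txt = ('<a href="https://www.vgirls.com/13404.html" '
       'class="list-title text-md h-2x" target="_blank">想把夏日的阳光寄给冬日的你</a>')

# Assumed pattern: capture the href value and the link text of each <a> tag.
LINK_RE = re.compile(r'<a\s+href="([^"]+)"[^>]*>([^<]*)</a>')

# With two capture groups, findall returns a list of (url, title) tuples.
links = LINK_RE.findall(txt)
for url, title in links:
    print(url, title)
```

A regular expression is adequate for a single, predictable tag like this one; for whole pages an HTML parser is usually more robust.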