Summary:
# -*- coding: utf-8 -*-
import scrapy
from huyaAll1.items import Huyaall1Item
class HuyaSpider(scrapy.Spider):
    name = 'huya'
    # allowed_domains = ['www.xx ...
posted @ 2020-03-07 01:30
干it的小张
Summary:
# -*- coding: utf-8 -*-
# Define here the models for your spider middleware
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/spider-mi ...
posted @ 2020-03-07 01:27
干it的小张
Summary:
movie.py (spider)
# -*- coding: utf-8 -*-
import scrapy
from moviePro1.items import Moviepro1Item
class MovieSpider(scrapy.Spider):
    name = 'movie'
    # allowed_dom ...
posted @ 2020-03-07 01:25
干it的小张
Summary:
Create the project: scrapy startproject wangyi
Create the spider: scrapy genspider wangyi www.xxx.com (creates the spider file)
Run: scrapy crawl spiderName
wangyi.py (spider)
# -*- coding: utf-8 -*-
impo ...
posted @ 2020-03-07 01:22
干it的小张
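Summary:
- Persistent storage via pipelines:
  - Parse the data (spider class)
  - Wrap the parsed data in an item object (spider class)
  - Submit the item to the pipeline: yield item (spider class)
  - In the pipeline class's process_item, receive the item object and persist it in whatever form is needed (pipeline class)
  - Enable the pipeline in the settings file
- Details:
  - ...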
posted @ 2020-03-07 01:15
干it的小张
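The pipeline persistence steps in the last summary (parse, wrap in an item, yield it, receive it in process_item, persist) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's code: it deliberately does not import Scrapy so that it runs on its own, plain dicts stand in for item objects (Scrapy accepts dicts as items), and the names JsonLinesPipeline, parse, run_demo, the items.jl path, and the sample rows are all hypothetical.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    """Hypothetical pipeline class: writes each item as one JSON line."""

    def __init__(self, path="items.jl"):
        # The default path is an assumption made for this sketch.
        self.path = path
        self.fp = None

    def open_spider(self, spider):
        # Scrapy calls this hook once when the spider opens,
        # so the file is opened a single time rather than per item.
        self.fp = open(self.path, "w", encoding="utf-8")

    def process_item(self, item, spider):
        # Scrapy calls this for every item the spider yields.
        self.fp.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        # Returning the item passes it on to the next enabled pipeline.
        return item

    def close_spider(self, spider):
        # Scrapy calls this hook once when the spider closes.
        self.fp.close()


def parse(rows):
    """Stand-in for a spider's parse(): wrap parsed fields and yield them."""
    for title, author in rows:
        yield {"title": title, "author": author}


def run_demo():
    """Drive the pipeline by hand, in the order Scrapy would call the hooks."""
    path = os.path.join(tempfile.mkdtemp(), "items.jl")
    pipeline = JsonLinesPipeline(path)
    pipeline.open_spider(spider=None)
    for item in parse([("room A", "alice"), ("room B", "bob")]):
        pipeline.process_item(item, spider=None)
    pipeline.close_spider(spider=None)
    with open(path, encoding="utf-8") as fp:
        return [json.loads(line) for line in fp]


if __name__ == "__main__":
    print(run_demo())
```

In a real project a class like this would normally live in pipelines.py and be switched on through the ITEM_PIPELINES setting in settings.py, for example {"myproject.pipelines.JsonLinesPipeline": 300}, where myproject is a placeholder project name and the number sets the order in which enabled pipelines run.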
