scrapy - Post Category - CHVV

Summary: settings.py. Project repository: https://github.com/CH-chen/jdbook

posted @ 2019-02-12 14:31 CHVV Views(497) Comments(0) Recommendations(0)

Crawling Chouti

Summary:
# -*- coding: utf-8 -*-
import scrapy
from scrapy.http.cookies import CookieJar
from scrapy.http import Request
class ChoutiSpider(scrapy.Spider):
    name = 'chouti'
    allowed_domains = ['chouti.c...

posted @ 2019-02-09 11:49 CHVV Views(203) Comments(0) Recommendations(0)

scrapy project renrencookie

Summary: settings. Project repository: https://github.com/CH-chen/renrencookie

posted @ 2019-01-29 06:28 CHVV Views(209) Comments(0) Recommendations(0)

scrapy project suningbook

Summary: pipelines.py, settings. Project repository: https://github.com/CH-chen/suningbook

posted @ 2019-01-29 06:24 CHVV Views(225) Comments(0) Recommendations(0)

scrapy project 4

Summary: pipelines.py, settings. Project repository: https://github.com/CH-chen/sun0769

posted @ 2019-01-29 06:16 CHVV Views(241) Comments(0) Recommendations(0)

scrapy project 3

Summary: pipelines.py, items.py, settings.py. Project repository: https://github.com/CH-chen/tencent

posted @ 2019-01-29 06:06 CHVV Views(239) Comments(0) Recommendations(0)

scrapy project 2

Summary: pipelines.py, settings.py, and points to note.

posted @ 2019-01-29 05:57 CHVV Views(219) Comments(0) Recommendations(0)

scrapy project 1

Summary: pipelines.py, settings.py.

posted @ 2019-01-29 05:49 CHVV Views(232) Comments(0) Recommendations(0)

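scrapy workflow

Summary: scrapy commands:
scrapy startproject xx    (xx: project directory) creates the project directory
cd xx    enters the directory
scrapy genspider chouti chouti.com    (chouti: spider name; chouti.com: start URL)
Then write the spider and start the crawl:
scrapy crawl chouti --nolog    (chouti: spider name; --nolog: do not show the default log) # n...

posted @ 2019-01-29 05:42 CHVV Views(175) Comments(0) Recommendations(0)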
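
The command sequence in the summary above can also be driven from a Python script. The following is only a minimal sketch, not the post's own code: the helper names `workflow_commands` and `run_workflow` are hypothetical, and it assumes the `scrapy` executable is on the PATH. By default it only prints the commands (a dry run), because the generated spider normally has to be edited between `genspider` and `crawl`.

```python
import shlex
import subprocess


def workflow_commands(project, spider, domain, nolog=True):
    """Return (argv, working_directory) pairs for the three scrapy steps.

    Hypothetical helper: it only assembles the commands from the summary.
    """
    crawl = ["scrapy", "crawl", spider]
    if nolog:
        crawl.append("--nolog")  # suppress scrapy's default log output
    return [
        # Step 1: create the project directory (run from the current directory).
        (["scrapy", "startproject", project], "."),
        # Step 2: generate a spider skeleton (run inside the project directory).
        (["scrapy", "genspider", spider, domain], project),
        # Step 3: run the spider (run inside the project directory).
        (crawl, project),
    ]


def run_workflow(project, spider, domain, nolog=True, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for argv, cwd in workflow_commands(project, spider, domain, nolog):
        print(f"(in {cwd}) $ {shlex.join(argv)}")
        if not dry_run:
            # check=True stops the sequence if a step fails.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Names taken from the summary: project "xx", spider "chouti".
    run_workflow("xx", "chouti", "chouti.com")
```

Passing `dry_run=False` would actually execute the three commands, which then crawls with the unedited spider template; in practice the first two steps are run once and the spider is written before the crawl step.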

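Using xpath

Summary:
//a[@class="n"]/@href    gets the URL of the next page
//a[text()="下一页>"]    locates an element by its text
//div[@class="indent"]/div/table    gets all tables, selecting one level at a time
//div[@class="indent"]//table    gets all tables at any depth under the div
//div[@c

posted @ 2019-01-21 14:25 CHVV Views(366) Comments(0) Recommendations(0)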
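
To make the difference between `/div/table` and `//table` in the summary above concrete, here is a small sketch. It is an illustration under assumptions, not the post's code: the sample page is hypothetical, and it uses the standard library's `xml.etree.ElementTree` rather than scrapy. ElementTree implements only a subset of XPath, so paths start with `.//`, the `/@href` step is replaced by `.get("href")`, and `text()="..."` is written as `.="..."`. In a scrapy spider, `response.xpath()` should accept the original expressions as written.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from any real site).
PAGE = """
<html><body>
  <div class="indent">
    <div><table id="t1"><tr><td>first</td></tr></table></div>
    <div><table id="t2"><tr><td><table id="nested"/></td></tr></table></div>
  </div>
  <a class="n" href="/page/2">下一页&gt;</a>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# Counterpart of //a[@class="n"]/@href: find the link, then read the attribute.
next_link = root.find('.//a[@class="n"]')
print(next_link.get("href"))

# Counterpart of //a[text()="下一页>"]: match the link by its full text.
next_by_text = root.find('.//a[.="下一页>"]')
print(next_by_text.get("href"))

# Level by level: only tables that are direct children of a div
# directly under div.indent.
direct_tables = root.findall('.//div[@class="indent"]/div/table')
print([t.get("id") for t in direct_tables])

# Any depth: every table below div.indent, including nested ones.
all_tables = root.findall('.//div[@class="indent"]//table')
print([t.get("id") for t in all_tables])
```

The level-by-level path skips the nested table because that table sits inside a `td`, not directly inside a `div`; the `//table` form picks it up as well.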
