CrawlSpider

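Replace the predefined rules variable with two rules: one for horizontal crawling (following the "next page" links) and one for vertical crawling (following the links to individual item pages).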

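rules = (
    # Horizontal: follow pagination links; with no callback, links are followed by default
    Rule(LinkExtractor(restrict_xpaths='//*[contains(@class,"next")]')),
    # Vertical: open each item link and pass the response to parse_item
    Rule(LinkExtractor(restrict_xpaths='//*[@itemprop="url"]'),
         callback='parse_item'),
)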
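To make the horizontal/vertical split concrete, here is a minimal plain-Python sketch of what the two rules amount to. It does not use Scrapy and is not how Scrapy schedules requests (Scrapy is asynchronous and does not guarantee this order); it is only a conceptual model. The `PAGES` and `ITEMS` dictionaries, their URLs and titles, and the `crawl` helper are all made up for illustration; only the name `parse_item` comes from the rules above.

```python
# Toy in-memory "site": each index page has one next-page link and some item links.
# All URLs and titles are hypothetical.
PAGES = {
    "/index?page=1": {"next": "/index?page=2", "items": ["/item/a", "/item/b"]},
    "/index?page=2": {"next": None, "items": ["/item/c"]},
}
ITEMS = {"/item/a": "A", "/item/b": "B", "/item/c": "C"}


def parse_item(url):
    """Stand-in for the spider's parse_item callback."""
    return {"url": url, "title": ITEMS[url]}


def crawl(start_url):
    """Follow 'next' links horizontally and item links vertically."""
    results = []
    seen = set()
    url = start_url
    while url is not None and url not in seen:
        seen.add(url)
        page = PAGES[url]
        # Vertical: visit each item page and hand it to the callback.
        for item_url in page["items"]:
            results.append(parse_item(item_url))
        # Horizontal: move on to the next index page; no callback is involved.
        url = page["next"]
    return results


scraped = crawl("/index?page=1")
print([item["title"] for item in scraped])  # prints ['A', 'B', 'C']
```

The first rule corresponds to the `url = page["next"]` step, and the second rule corresponds to the loop that calls `parse_item` on every item link.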

 

posted @ 2017-12-13 09:49  不可叽叽歪歪