Scrapy: Using Python's Scrapy framework to crawl two URLs and download their page content, by Jason niu

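import scrapy


class DmozSpider(scrapy.Spider):
    # Spider name, used on the command line: scrapy crawl dmoz
    name = "dmoz"
    # Must match the domain of the start URLs below
    allowed_domains = ["dmoztools.net"]
    start_urls = [
        "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
        "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
    ]

    def parse(self, response):
        # Name the file after the last path segment ("Resources" or "Books")
        filename = response.url.split("/")[-2]
        # Save the raw page body as bytes
        with open(filename, "wb") as f:
            f.write(response.body)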
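
The output filename comes from response.url.split("/")[-2]. A minimal sketch of that step on its own, without Scrapy, may make it easier to see which names the two pages are saved under. The helper name filename_from_url is hypothetical and only used for illustration; the expression inside it is the one from parse().

```python
def filename_from_url(url):
    # Hypothetical helper wrapping the expression used in parse().
    # With a trailing slash, the last element of the split is an empty
    # string, so index -2 is the final path segment.
    return url.split("/")[-2]


urls = [
    "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
    "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
]

for url in urls:
    print(filename_from_url(url))  # prints "Resources", then "Books"
```

Note that this relies on the trailing slash: for a URL without one, index -2 would select the second-to-last path segment instead of the last.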

  

posted @ 2018-03-17 22:53  一个处女座的程序猿