scrapy item pipeline

item pipeline

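process_item(self, item, spider)  # every pipeline must define this method
The actual item-handling logic goes inside this method. It should return the item so that later pipelines can keep processing it, or raise DropItem to discard it.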
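As a rough illustration of what goes inside process_item, here is a minimal sketch. The class name TitleCleanupPipeline and the "title" field are hypothetical, and the item is assumed to be dict-like:

```python
class TitleCleanupPipeline:
    """Hypothetical pipeline: trims whitespace from an assumed 'title' field."""

    def process_item(self, item, spider):
        # Only touch the field when it is present and is a string.
        title = item.get("title")
        if isinstance(title, str):
            item["title"] = title.strip()
        # Return the item so the next pipeline in the chain receives it.
        return item
```

Scrapy only calls pipelines listed in the ITEM_PIPELINES setting, so a class like this would still need to be registered there.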

Other optional methods can also be added:

open_spider(self, spider)  This method is called when the spider is opened.
close_spider(self, spider) This method is called when the spider is closed.
from_crawler(cls, crawler)

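open_spider(self, spider): called when the spider is opened (before crawling starts). It is typically used for initialization work before crawling, such as opening a database connection.
close_spider(self, spider): called when the spider is closed (after crawling finishes). It is typically used for cleanup work after crawling, such as closing a database connection.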
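The open/close pairing can be sketched with a file instead of a database connection. The class name JsonLinesWriterPipeline and the default path items.jl are assumptions chosen for illustration, and items are assumed to be convertible with dict():

```python
import json


class JsonLinesWriterPipeline:
    """Hypothetical pipeline: writes each item as one JSON line."""

    def __init__(self, path="items.jl"):
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Initialization before crawling: acquire the resource once.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Cleanup after crawling: release the resource.
        self.file.close()

    def process_item(self, item, spider):
        line = json.dumps(dict(item), ensure_ascii=False)
        self.file.write(line + "\n")
        return item
```

Opening the file once in open_spider, rather than on every process_item call, is the same reasoning that applies to a database connection.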

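from_crawler(cls, crawler): a class method that returns an ItemPipeline object, that is, an instance of the pipeline class. If this method is defined, Scrapy creates the pipeline object through it. Typically, the method reads the project settings via crawler.settings and builds the object from those values.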

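    @classmethod
    def from_crawler(cls, crawler):
        # Read the FILE_NAME value from the project settings
        file_name = crawler.settings.get('FILE_NAME')
        # file_name = scrapy.conf.settings['FILE_NAME']  # legacy way to read settings; scrapy.conf is deprecated and unavailable in recent Scrapy releases
        return cls(file_name)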
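For context, the fragment above can be placed in a complete class roughly as follows. This is only a sketch: the class name FileWriterPipeline and the fallback value items.txt are hypothetical, and the SimpleNamespace object is a stand-in for the crawler that Scrapy would normally pass in.

```python
from types import SimpleNamespace


class FileWriterPipeline:
    """Hypothetical pipeline configured through the FILE_NAME setting."""

    def __init__(self, file_name):
        self.file_name = file_name

    @classmethod
    def from_crawler(cls, crawler):
        # Fall back to an assumed default when FILE_NAME is not set.
        file_name = crawler.settings.get("FILE_NAME", "items.txt")
        return cls(file_name)

    def process_item(self, item, spider):
        return item


# Stand-in crawler for trying the class outside of Scrapy; a plain dict
# offers the same .get(name, default) call used above.
fake_crawler = SimpleNamespace(settings={"FILE_NAME": "out.txt"})
pipeline = FileWriterPipeline.from_crawler(fake_crawler)
```

Because __init__ takes file_name as a required argument, from_crawler is what lets Scrapy construct the pipeline without knowing about that argument.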

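Author: 喵帕斯0_0. Link: https://www.jianshu.com/p/256bc96c9b6d. Source: Jianshu (简书). Copyright belongs to the author; for any form of reproduction, please contact the author for authorization and cite the source.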

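A problem I ran into: the log showed the enabled pipelines as an empty list []. The pipeline name was defined correctly, but I was using FilesPipeline while setting IMAGES_STORE. The two do not match (FilesPipeline reads FILES_STORE; IMAGES_STORE belongs to ImagesPipeline), so FilesPipeline was never enabled.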
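A settings.py sketch of the matching configuration, assuming the built-in FilesPipeline is the one in use. The priority 1 and the storage path downloads/files are arbitrary example values:

```python
# settings.py (sketch)

ITEM_PIPELINES = {
    # Built-in files pipeline; the number is its order in the chain.
    "scrapy.pipelines.files.FilesPipeline": 1,
}

# FilesPipeline needs FILES_STORE; without it the pipeline is not enabled.
FILES_STORE = "downloads/files"

# IMAGES_STORE would only be needed with scrapy.pipelines.images.ImagesPipeline.
```

With the store setting matching the pipeline class, the pipeline should then appear in the enabled pipelines list in the startup log.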

posted @ 2019-03-05 21:05 oooooolr