scrapy

Scrapy: Collected Issues

How to add a proxy in Scrapy

  1. Add a new downloader middleware in middlewares.py, with the following code:
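import random

# Assumes the project package is ggzy_deal (as in the DOWNLOADER_MIDDLEWARES paths) and IPOOL is defined in its settings.py
from ggzy_deal.settings import IPOOL

class MyProxySpiderMiddleware(object):
    def process_request(self, request, spider):
        """Set a proxy before the request is sent."""
        # Pick a random "host:port" entry from the pool
        proxy = random.choice(IPOOL)
        request.meta['proxy'] = 'http://' + proxy
        return None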
  2. Configure the proxy pool in settings.py:
IPOOL = [
    '223.223.23.216:8085',
    '111.3.118.247:30001',
    '112.14.47.6:52024',
    '118.163.120.181:58837',
    '223.82.60.202:8060',
    '61.216.156.222:60808',
    '223.82.60.202:8060',
    '122.9.101.6:8888',
    '47.106.105.236:80',
    '121.13.252.62:41564',
    '118.163.120.181:58837',
]
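  3. Enable the middleware in settings.py and give it a higher priority than the other middlewares (that is, a lower number than theirs; for process_request, the lower the number, the earlier the middleware runs):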
DOWNLOADER_MIDDLEWARES = {
    'ggzy_deal.middlewares.GgzyDealDownloaderMiddleware': 543,
    'ggzy_deal.middlewares.MyProxySpiderMiddleware': 125
}
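
The three steps above can also be combined so that the middleware reads the pool from the Scrapy settings instead of referring to IPOOL directly. The following is only a minimal sketch under stated assumptions: the class name SettingsProxyMiddleware is hypothetical (it is not part of the original code), and it relies on Scrapy calling the from_crawler class method when building a middleware and on crawler.settings.getlist returning the IPOOL setting as a list.

```python
import random


class SettingsProxyMiddleware:
    """Hypothetical variant of the proxy middleware that reads IPOOL from settings."""

    def __init__(self, proxies):
        # Keep a private copy of the "host:port" entries
        self.proxies = list(proxies)

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls from_crawler (when defined) to construct the middleware;
        # getlist returns the IPOOL setting from settings.py as a list
        return cls(crawler.settings.getlist('IPOOL'))

    def process_request(self, request, spider):
        # Only set a proxy when the pool is not empty
        if self.proxies:
            request.meta['proxy'] = 'http://' + random.choice(self.proxies)
        # Returning None lets Scrapy continue processing the request
        return None
```

If this variant is used, it would be registered in DOWNLOADER_MIDDLEWARES in place of MyProxySpiderMiddleware, again with a number lower than the other middlewares. Reading the pool through the settings object keeps middlewares.py independent of the project package name.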
posted @ 2022-10-14 22:07  Hiraly