Summary: from boss.items import BossItem class ZhiPinSpider(CrawlSpider): name='Zhipin' allowed_domains=['zhipin.com'] start_urls=['https://www.zhipin.com/c1000 Read more
posted @ 2020-06-24 14:06 马蹄哒哒 Views (398) Comments (1) Recommendations (0)
Summary: from scrapy.exporters import JsonLinesItemExporter class BossPipleline(object): def __init__(self): self.fp=open('jobs.json','wb') self.exporter=JsonL Read more
posted @ 2020-06-24 13:50 马蹄哒哒 Views (259) Comments (0) Recommendations (0)
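The excerpt above is cut off before the exporter is set up, so here is only a rough sketch of the same idea: a pipeline that writes each scraped item to a file as one JSON object per line. To keep it runnable without Scrapy installed, it uses the standard `json` module in place of `JsonLinesItemExporter`. The class name `JsonLinesPipeline`, the `path` parameter, and the sample items are assumptions for illustration, not taken from the original post.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    """Hypothetical pipeline: one JSON object per line, standard library only."""

    def __init__(self, path='jobs.json'):
        self.path = path
        self.fp = None

    def open_spider(self, spider):
        # Scrapy calls this hook once when the spider starts.
        self.fp = open(self.path, 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # ensure_ascii=False keeps Chinese text readable in the output file.
        line = json.dumps(dict(item), ensure_ascii=False)
        self.fp.write(line + '\n')
        return item

    def close_spider(self, spider):
        # Scrapy calls this hook once when the spider finishes.
        self.fp.close()


# Drive the pipeline by hand with made-up items, writing to a temp directory.
out_path = os.path.join(tempfile.mkdtemp(), 'jobs.json')
pipeline = JsonLinesPipeline(out_path)
pipeline.open_spider(spider=None)
pipeline.process_item({'title': 'Python developer', 'city': 'Beijing'}, spider=None)
pipeline.process_item({'title': 'Data engineer', 'city': 'Shanghai'}, spider=None)
pipeline.close_spider(spider=None)

with open(out_path, encoding='utf-8') as fp:
    rows = [json.loads(line) for line in fp]
print(rows)
```

In a real project the class would be registered under `ITEM_PIPELINES` in `settings.py`, and Scrapy would call the three hooks itself.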
Summary: import random import base64 # Method 1 # Set a proxy IP class IpProxyDownLoadMiddleWares(object): Proxys=['178.44.170.152:8080','110.44.113.182:8080','209.126.124.73 Read more
posted @ 2020-06-24 12:45 马蹄哒哒 Views (329) Comments (0) Recommendations (0)
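The proxy excerpt above stops partway through the address list, so the following is only a minimal sketch of how such a downloader middleware usually works: pick a random proxy for each request and store it in `request.meta['proxy']`, which Scrapy's built-in `HttpProxyMiddleware` then uses. The class name `RandomProxyMiddleware`, the `PROXIES` attribute, and the `http://` scheme are assumptions. Only the two addresses fully visible in the excerpt are listed, and a `SimpleNamespace` stands in for a Scrapy request so the snippet runs without Scrapy installed.

```python
import random
from types import SimpleNamespace


class RandomProxyMiddleware:
    """Hypothetical downloader middleware: choose one proxy per request."""

    # Only the two addresses that appear in full in the excerpt.
    PROXIES = ['178.44.170.152:8080', '110.44.113.182:8080']

    def process_request(self, request, spider):
        # Scrapy reads the proxy to use from request.meta['proxy'].
        proxy = random.choice(self.PROXIES)
        request.meta['proxy'] = 'http://' + proxy
        # Returning None tells Scrapy to keep processing this request.
        return None


# Stand-in for scrapy.Request: any object with a `meta` dict works here.
fake_request = SimpleNamespace(meta={})
RandomProxyMiddleware().process_request(fake_request, spider=None)
print(fake_request.meta['proxy'])
```

To use something like this in a project, the class would be listed under `DOWNLOADER_MIDDLEWARES` in `settings.py`. Public proxy addresses like these tend to go stale quickly, so the list normally needs refreshing.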