安装

具体请自行百度

依赖库

 

网上说pip安装会内分泌失调,我试了下还行吧,不过也遇到几个问题

解决方法

pip install -I cryptography

 

解决方法

pip install -U pyopenssl

安装成功

 

 

离线下载地址  https://pypi.org/project/Scrapy/#files

 

实战入门

import scrapy

class MovieItem(scrapy.Item):
    # define the fields for your item here like:
    name = scrapy.Field()

class MeijuSpider(scrapy.Spider):
    name = "meiju"
    allowed_domains = ["meijutt.com"]
    start_urls = ['http://www.meijutt.com/new100.html']

    def parse(self, response):
        movies = response.xpath('//ul[@class="top-list  fn-clear"]/li')
        for each_movie in movies:
            item = MovieItem()
            item['name'] = each_movie.xpath('./h5/a/@title').extract()[0]
            yield item

命令行运行

scrapy runspider test.py -o test1.json

自动生成 test.json 文件,并存入爬取内容。

 

这是最简单的代码和运行方式。