Getting started with the Scrapy framework

1. Create a Scrapy project:
In the terminal, run: scrapy startproject <project name>
 
2. Create a spider .py file in the spiders folder:
scrapy genspider baidu http://www.baidu.com
 
3. In settings.py, turn off robots.txt checking:
ROBOTSTXT_OBEY = False
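
What this setting switches off can be illustrated without Scrapy. The sketch below uses the standard library's urllib.robotparser as a stand-in for Scrapy's own robots.txt handling, which is not shown here; the robots.txt content and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for illustration
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() takes the file as a list of lines, so no network request is made
parser.parse(ROBOTS_TXT.splitlines())

# A crawler that obeys robots.txt asks this question before each request
allowed = parser.can_fetch("*", "http://example.com/index.html")
blocked = parser.can_fetch("*", "http://example.com/private/data.html")
print(allowed, blocked)
```

With ROBOTSTXT_OBEY = True, Scrapy skips requests that the site's robots.txt disallows; with False, that check is not performed.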
 
4. Run the spider:
scrapy crawl <spider name>

For example:
scrapy crawl baidu

 

5. Scrape car names and prices from Autohome (汽车之家):

import scrapy


class CarSpider(scrapy.Spider):
    name = 'car'
    # allowed_domains takes bare domain names, not full URLs
    allowed_domains = ['car.autohome.com.cn']
    start_urls = ['https://car.autohome.com.cn/price/brand-15.html']

    def parse(self, response):
        print("汽车之家")
        names = response.xpath('//div[@class="main-title"]/a/text()')
        prices = response.xpath('//span[@class="lever-price red"]/span/text()')
        # zip pairs each name with its price and stops at the shorter list
        for name, price in zip(names, prices):
            # extract() turns a Selector object into its string value
            print(name.extract())
            print(price.extract())
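
Printing is enough for a first run, but the same pairing can also be collected as data. The sketch below is a plain-Python stand-in, not part of the original spider: pair_cars is a hypothetical helper, and the sample names and prices are made up. It relies on zip, so a missing price shortens the result instead of raising IndexError.

```python
def pair_cars(names, prices):
    """Yield one dict per car, pairing each name with the price at the same position."""
    # zip stops at the shorter list, so unequal lengths never raise IndexError
    for name, price in zip(names, prices):
        yield {"name": name.strip(), "price": price.strip()}


# Made-up sample values standing in for the strings extracted by the XPath queries
sample_names = [" Car A ", "Car B", "Car C"]
sample_prices = ["10.00-15.00万", " 20.00-25.00万 "]

cars = list(pair_cars(sample_names, sample_prices))
for car in cars:
    print(car["name"], car["price"])
```

Inside parse, the two lists would come from calling .extract() on the name and price selector lists. Yielding the dicts rather than printing them lets a command such as scrapy crawl car -o cars.json write the results to a file.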

6. scrapy shell:

In the PyCharm terminal, run:    scrapy shell www.baidu.com

Get the value attribute of the 百度一下 search button:

response.xpath('//input[@id="su"]/@value').extract_first()
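
To see what this XPath selects without opening a network connection, the lookup can be imitated on a small fragment. This is only a sketch: it uses the standard library's xml.etree.ElementTree in place of Scrapy's selectors, the markup is a hypothetical, simplified stand-in for the real page, and button_value is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the part of the page that holds the button
HTML = (
    '<form>'
    '<input type="text" id="kw" value=""/>'
    '<input type="submit" id="su" value="百度一下"/>'
    '</form>'
)


def button_value(markup, element_id):
    """Return the value attribute of the <input> with the given id, or None."""
    root = ET.fromstring(markup)
    # ElementTree supports a small XPath subset, including [@attr='value'] filters
    node = root.find(f".//input[@id='{element_id}']")
    return None if node is None else node.get("value")


print(button_value(HTML, "su"))
```

Like extract_first(), the helper returns None when nothing matches.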

 

 

 

 

posted @ 2023-10-04 00:53  sgj191024