Getting Started with the Scrapy Framework
1. Create a Scrapy project:
In a terminal, run: scrapy startproject <project_name>
2. Create a spider file in the spiders folder. From inside the project directory, run:
scrapy genspider baidu http://www.baidu.com
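A spider's allowed_domains attribute is meant to hold bare host names such as www.baidu.com rather than full URLs. Below is a minimal sketch, using only the standard library, of deriving the host name from a URL. The helper name host_of is hypothetical and not part of Scrapy:

```python
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Return the host name of a URL, suitable for allowed_domains."""
    # urlparse only fills in the network location when '//' is present,
    # so add a default scheme to bare inputs like 'www.baidu.com'.
    if "//" not in url:
        url = "http://" + url
    return urlparse(url).hostname or ""


print(host_of("http://www.baidu.com"))
print(host_of("https://car.autohome.com.cn/price/brand-15.html"))
```

The result can be placed in the list assigned to allowed_domains, while the full URL stays in start_urls.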
3. In settings.py, tell Scrapy not to obey robots.txt:
ROBOTSTXT_OBEY = False
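Alongside ROBOTSTXT_OBEY, a few other built-in Scrapy settings are often adjusted in settings.py when a site is slow or rejects the default client. The fragment below is only a sketch. The specific values are assumptions chosen for illustration, not recommendations from these notes:

```python
# settings.py (fragment)

# Do not fetch or honor robots.txt before crawling
ROBOTSTXT_OBEY = False

# Assumed value: wait roughly 1 second between requests to the same site
DOWNLOAD_DELAY = 1

# Assumed value: a browser-like User-Agent string sent with every request
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
```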
4. Run the spider:
scrapy crawl <spider_name>
For example: scrapy crawl baidu
5. Scrape car names and prices from Autohome (汽车之家):
import scrapy


class CarSpider(scrapy.Spider):
    name = 'car'
    # allowed_domains takes bare domain names, not full URLs
    allowed_domains = ['car.autohome.com.cn']
    start_urls = ['https://car.autohome.com.cn/price/brand-15.html']

    def parse(self, response):
        print("汽车之家")
        # Each XPath query returns a list of selectors for the matching text nodes
        names = response.xpath('//div[@class="main-title"]/a/text()')
        prices = response.xpath('//span[@class="lever-price red"]/span/text()')
        # zip stops at the shorter list, so a missing price cannot raise IndexError
        for name, price in zip(names, prices):
            # Extract the string value from each selector object
            print(name.extract())
            print(price.extract())
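Printing inside parse is convenient for a first test, but a Scrapy callback can also yield dicts as items, so the results can be exported (for example with scrapy crawl car -o cars.json). Below is a minimal sketch of the pairing step as a plain function. build_items is a hypothetical helper, and the sample strings are made up, not data from the site:

```python
def build_items(names, prices):
    """Pair each car name with its price and yield one dict per car."""
    # zip stops at the shorter list, so unmatched entries are dropped
    for name, price in zip(names, prices):
        yield {"name": name.strip(), "price": price.strip()}


# Made-up sample values standing in for the strings the XPath queries return
sample_names = [" Car A ", "Car B"]
sample_prices = ["10.00-15.00万", " 20.00-25.00万 ", "unmatched"]

for item in build_items(sample_names, sample_prices):
    print(item)
```

Inside the spider, parse could pass in the lists of strings obtained by calling .getall() (or .extract()) on the two XPath results and yield each dict that comes back.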
scrapy shell:
In the PyCharm terminal, run: scrapy shell www.baidu.com
Get the value attribute of the "百度一下" (Baidu search) button:
response.xpath('//input[@id="su"]/@value').extract_first()
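To see what this kind of query selects without opening a network connection, the sketch below runs an equivalent lookup with the standard library's xml.etree.ElementTree instead of Scrapy's selectors. The HTML fragment is hand-written for illustration and is an assumption, not Baidu's actual markup. ElementTree supports only a subset of XPath, so the attribute is read with .get() rather than a trailing /@value:

```python
import xml.etree.ElementTree as ET

# Hand-written, well-formed fragment (an assumption, not the real page)
SAMPLE = """
<html>
  <body>
    <form>
      <input id="kw" name="wd" value="" />
      <input id="su" type="submit" value="百度一下" />
    </form>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE)

# Find the first <input> anywhere in the tree whose id attribute is "su"
button = root.find(".//input[@id='su']")

# Mirror extract_first(): give None when nothing matches
value = button.get("value") if button is not None else None
print(value)
```

In the real shell session, Scrapy's own selectors handle ordinary, non-well-formed HTML, so no such cleanup of the page is needed there.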