Case Study: Chouti New Hot List (抽屉新热榜) with XPath

URL: https://dig.chouti.com/

XPath code:

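import requests
from lxml import etree

# Send a browser-like User-Agent with the request
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
url = 'https://dig.chouti.com/'

resp = requests.get(url, headers=headers, timeout=10)
resp.raise_for_status()  # stop early on an HTTP error status
resp.encoding = 'UTF-8'
html_tree = etree.HTML(resp.text)

# Each child div of this container is treated as one entry on the hot list
data = html_tree.xpath('//div[@class="main"]/div[2]/div[1]/div')
for item in data:
    # xpath() always returns a list, so check for matches before indexing
    source = item.xpath('.//div[@class="link-detail"]/div/a/span/text()')
    href = item.xpath('.//div[@class="link-detail"]/a/@href')
    content = item.xpath('.//div[@class="link-detail"]/a/text()')
    if not (source and href and content):
        continue  # skip entries missing any of the three fields
    print("Source ==>", source[0])
    print("URL ==>", href[0])
    print("Content ==>", content[0])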
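
The XPath expressions in the script above can be tried out without touching the network. The sketch below runs the same expressions against a small inline HTML sample. The sample markup is an assumption: a simplified stand-in that matches the structure those expressions expect, not a copy of the real page. The names `SAMPLE_HTML`, `first`, and `parse_items` are hypothetical helpers introduced only for this illustration.

```python
from lxml import etree

# Hypothetical markup: a simplified stand-in for the structure the XPath
# expressions assume. The real page may be laid out differently.
SAMPLE_HTML = """
<div class="main">
  <div>header</div>
  <div>
    <div>
      <div>
        <div class="link-detail">
          <a href="https://example.com/a">First story</a>
          <div><a><span>example.com</span></a></div>
        </div>
      </div>
      <div>
        <div class="link-detail">
          <a href="https://example.com/b">Second story</a>
        </div>
      </div>
    </div>
  </div>
</div>
"""


def first(node, expr, default=None):
    """Return the first XPath match with whitespace stripped, or default if nothing matches."""
    matches = node.xpath(expr)
    return matches[0].strip() if matches else default


def parse_items(html):
    """Extract source, url, and content from each entry under the main container."""
    tree = etree.HTML(html)
    rows = []
    for item in tree.xpath('//div[@class="main"]/div[2]/div[1]/div'):
        rows.append({
            "source": first(item, './/div[@class="link-detail"]/div/a/span/text()'),
            "url": first(item, './/div[@class="link-detail"]/a/@href'),
            "content": first(item, './/div[@class="link-detail"]/a/text()'),
        })
    return rows


rows = parse_items(SAMPLE_HTML)
for row in rows:
    print(row)
```

The leading `.` in each per-item expression keeps the search relative to the current entry. The second sample entry has no source node, so `first` returns `None` for it instead of raising `IndexError`, which is what indexing an empty result list with `[0]` would do.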


posted @ 2023-01-06 10:41  屠魔的少年