Case study: Chouti New Hot List (抽屉新热榜) with XPath

URL: https://dig.chouti.com/

XPath code:

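import requests
from lxml import etree

# Browser-like User-Agent so the site serves the normal HTML page
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
url = 'https://dig.chouti.com/'

resp = requests.get(url, headers=headers, timeout=10)
resp.raise_for_status()  # stop early on HTTP errors
resp.encoding = 'UTF-8'
html_tree = etree.HTML(resp.text)
# Each div selected here is treated as one entry in the hot list
data = html_tree.xpath('//div[@class="main"]/div[2]/div[1]/div')
for item in data:
    # xpath() always returns a list; an empty list means the field is missing
    source = item.xpath('.//div[@class="link-detail"]/div/a/span/text()')
    href = item.xpath('.//div[@class="link-detail"]/a/@href')
    title = item.xpath('.//div[@class="link-detail"]/a/text()')
    if not (source and href and title):
        continue  # skip entries that do not match the expected layout
    print("Source ==>", source[0])
    print("URL ==>", href[0])
    print("Content ==>", title[0])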
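
Since xpath() returns a list, indexing it with [0] raises an IndexError whenever an expression matches nothing, for example if the page layout changes. One way to guard against that is a small helper like the one below. This is only a sketch: the name first and its default parameter are hypothetical and not part of lxml, and the sample lists stand in for real xpath() results.

```python
def first(results, default=None):
    """Return the first XPath match, or `default` when the list is empty."""
    if not results:
        return default
    value = results[0]
    # text() and @attr matches are strings; trim surrounding whitespace
    return value.strip() if isinstance(value, str) else value


# Stand-ins for what item.xpath(...) might return for one entry
print(first(["  example.com \n"]))
print(first([], default="N/A"))
```

Inside the loop this could be used as first(item.xpath('.//div[@class="link-detail"]/a/@href'), default=""), so a missing field yields a placeholder rather than stopping the whole crawl.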

Posted @ 2023-01-06 10:41 by 屠魔的少年