Using the bs4 and XPath tools

The bs4 module
1. Install the module

pip install -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com bs4

2. Initialize a BeautifulSoup object

bs4.BeautifulSoup(html,"html.parser")
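As a minimal sketch of the initialization step, the snippet below builds a BeautifulSoup object from a small HTML string. The HTML content and the variable names are made up for illustration; "html.parser" is the parser bundled with Python, so nothing beyond bs4 needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = "<html><body><h1 id='title'>Hello</h1><a href='/next'>Next</a></body></html>"

# Parse the string with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# The soup object gives attribute-style access to the first tag of a given name.
print(type(soup).__name__)
print(soup.h1.text)
```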

bs4 methods
find(self, name=None, attrs={}, recursive=True, string=None, **kwargs): returns the first matching result, or None if nothing matches
find_all(self, name=None, attrs={}, recursive=True, string=None, limit=None, **kwargs): returns all matching results

Parameter	Description
name	Tag name
attrs	Attribute filter as a dict: {"attribute": "value"}

.text returns the text content of a tag
.get("href") returns the value of the given attribute
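The methods above can be sketched together as follows. The HTML fragment, class names, and variable names are hypothetical and exist only to show how find, find_all, .text, and .get behave.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = """
<ul>
  <li class="item"><a href="/a">First</a></li>
  <li class="item"><a href="/b">Second</a></li>
  <li class="other"><a href="/c">Third</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# find returns only the first matching tag.
first_link = soup.find("a")
first_text = first_link.text          # text inside the tag
first_href = first_link.get("href")   # value of the href attribute

# find_all returns every match; attrs filters by attribute value.
items = soup.find_all("li", attrs={"class": "item"})
hrefs = [li.find("a").get("href") for li in items]

# limit caps the number of results returned by find_all.
limited = soup.find_all("a", limit=2)

# find returns None when nothing matches, so check before using the result.
missing = soup.find("table")

print(first_text, first_href, hrefs, len(limited), missing)
```

Because find can return None, it is usually safer to test the result before reading .text or calling .get on it.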
The xpath module
Install and import the module

pip install -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com lxml
from lxml import etree

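Methods
etree.HTML(text, parser=None, base_url=None) parses an HTML string and returns the root element; call .xpath(expression) on that element to run a query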
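A minimal sketch of the usual lxml workflow, assuming the input is an HTML string: parse it with etree.HTML, then call .xpath() on the returned element. The HTML string and variable names here are made up for illustration.

```python
from lxml import etree

# Hypothetical HTML string used only for illustration.
text = "<div><p>hello</p><p>world</p></div>"

# etree.HTML parses the string and returns the root element.
# The HTML parser adds the missing <html> and <body> wrappers.
root = etree.HTML(text)

# xpath() is called on the parsed element and returns a list of matches.
paragraphs = root.xpath("//p/text()")

print(root.tag)
print(paragraphs)
```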

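Syntax	Description	Notes
Leading /	Start from the root node
/	Direct child node
text()	Get the text content
//	Descendant nodes at any depth
*	Wildcard (matches any element)
[@attribute="value"]	Filter by attribute value
@attribute	Get the attribute value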
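Each piece of syntax in the table can be tried out with a short script. This is only a sketch: the HTML document, ids, and variable names are hypothetical.

```python
from lxml import etree

# Hypothetical HTML used only for illustration.
html = """
<html><body>
  <div id="menu">
    <a href="/a" class="link">First</a>
    <p><a href="/b" class="link">Second</a></p>
  </div>
  <div id="footer"><a href="/c">Third</a></div>
</body></html>
"""
tree = etree.HTML(html)

# A leading / starts at the root; each further / steps to a direct child.
# [@id="menu"] keeps only the div whose id attribute equals "menu".
direct = tree.xpath('/html/body/div[@id="menu"]/a/text()')

# // matches descendants at any depth, so the nested link is found too.
nested = tree.xpath('//div[@id="menu"]//a/text()')

# @href extracts the attribute value instead of the element.
hrefs = tree.xpath('//a/@href')

# * is a wildcard that matches any element.
children = tree.xpath('//div[@id="menu"]/*')
tags = [el.tag for el in children]

print(direct, nested, hrefs, tags)
```

Note that xpath() always returns a list here, even when only one node matches, so a single value has to be taken by index.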

Exercises

Posted on 2024-10-21 10:18 by 珂k
