16 - Introduction to XPath parsing (part 01)

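# XPath is a language for locating content in an XML document.
# HTML is not strictly a subset of XML, but a parsed HTML page has the same tree structure, so XPath works on it too.
# Install the lxml module:   pip install lxml
# Parsing with XPath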

from lxml import etree

xml = """
<book>
    <id>1</id>
    <name>野花遍地香</name>
    <price>1.23</price>
    <nick>臭豆腐</nick>
    <author>
        <nick id="10086">周大强</nick>
        <nick id="10010">周芷若</nick>
        <nick class="joy">周杰伦</nick>
        <nick class="jolin">蔡依林</nick>
        <div>
            <nick>惹了</nick>
        </div>
    </author>
    
    <partner>
        <nick id="ppc">胖胖陈</nick>
        <nick id="ppbc">胖胖不陈</nick>
        
        <span>
            <nick id="ppc">胖胖陈111</nick>
            <nick id="ppbc">胖胖不陈111</nick>
        </span>
        
        <div>
            <nick id="ppc">胖胖陈222</nick>
            <nick id="ppbc">胖胖不陈222</nick>
            
            <div>
                <nick id="ppc">胖胖陈333</nick>
                <nick id="ppbc">胖胖不陈333</nick>
            </div>
        </div>
    </partner>
</book>
"""

tree = etree.XML(xml)
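# result = tree.xpath("/book/name")  # / separates levels of the hierarchy; the leading / starts at the document root
# result = tree.xpath("/book/name/text()")  # text() returns the text inside the node
# result = tree.xpath("/book/author/nick/text()")  # direct nick children of author only
# result = tree.xpath("/book/author//nick/text()")  # // selects descendants at any depth (children, grandchildren, and so on)
# result = tree.xpath("/book/partner/*/nick/text()")  # * is a wildcard that matches any element node at that level
result = tree.xpath("/book//nick/text()")  # every nick anywhere under book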
print(result)
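
To see how `/`, `//` and `*` differ, it can help to run the expressions side by side. The sketch below is not part of the original script: it uses a trimmed copy of the sample XML (same tag names, fewer nodes), and the `queries` and `results` names are made up for illustration.

```python
from lxml import etree

# Trimmed copy of the sample XML above (an assumption for brevity).
xml = """
<book>
    <name>野花遍地香</name>
    <nick>臭豆腐</nick>
    <author>
        <nick id="10086">周大强</nick>
        <nick class="joy">周杰伦</nick>
        <div>
            <nick>惹了</nick>
        </div>
    </author>
    <partner>
        <nick id="ppc">胖胖陈</nick>
        <span>
            <nick id="ppc">胖胖陈111</nick>
        </span>
        <div>
            <nick id="ppc">胖胖陈222</nick>
            <div>
                <nick id="ppc">胖胖陈333</nick>
            </div>
        </div>
    </partner>
</book>
"""

tree = etree.XML(xml)

# Hypothetical labels mapped to the XPath expressions being compared.
queries = {
    "child": "/book/author/nick/text()",         # direct children of author
    "descendant": "/book/author//nick/text()",   # any depth below author
    "wildcard": "/book/partner/*/nick/text()",   # exactly one level between partner and nick
    "all": "/book//nick/text()",                 # every nick under book
}

results = {label: tree.xpath(expr) for label, expr in queries.items()}

for label, texts in results.items():
    print(label, texts)
```

With this trimmed input, `child` misses 惹了 because it sits inside a `div`, while `descendant` picks it up. `wildcard` returns only 胖胖陈111 and 胖胖陈222: the `*` stands for exactly one intermediate element, so neither the direct child 胖胖陈 nor the doubly nested 胖胖陈333 matches.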
posted @ 2021-12-12 18:30 不是孩子了