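The xml module
Introduction to XML
XML stands for eXtensible Markup Language.
XML is designed to transport and store data.
HTML is designed to display data.
XML is a set of rules for defining semantic markup. The markup divides a document into parts and labels each part.
It is also a meta-markup language: a syntax for defining other domain-specific, semantic, structured markup languages.
In short, XML is a common format for exchanging data between different languages or programs. For example: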
    <collection shelf="New Arrivals">
        <movie title="Enemy Behind">
            <type>War, Thriller</type>
            <format>DVD</format>
            <year>2003</year>
            <rating>PG</rating>
            <stars>10</stars>
            <description>Talk about a US-Japan war</description>
        </movie>
        <movie title="Transformers">
            <type>Anime, Science Fiction</type>
            <format>DVD</format>
            <year>1989</year>
            <rating>R</rating>
            <stars>8</stars>
            <description>A schientific fiction</description>
        </movie>
        <movie title="Trigun">
            <type>Anime, Action</type>
            <format>DVD</format>
            <episodes>4</episodes>
            <rating>PG</rating>
            <stars>10</stars>
            <description>Vash the Stampede!</description>
        </movie>
        <movie title="Ishtar">
            <type>Comedy</type>
            <format>VHS</format>
            <rating>PG</rating>
            <stars>2</stars>
            <description>Viewable boredom</description>
        </movie>
    </collection>
1. Parsing XML
1) Use ElementTree.XML to parse a string into an XML object
    [root@js-93 xml_test]# cat ex03.py
    #!/usr/bin/env python3
    #coding:utf-8

    import xml.etree.ElementTree as ET

    # Open the file and read the XML content
    with open('test.xml', 'r') as f:
        info_xml = f.read()

    # Parse the string into an Element object; root refers to the root node of the XML document
    root = ET.XML(info_xml)
2) Use ElementTree.parse to parse a file directly into an XML object
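The two entry points return different types, which is easy to miss: ET.XML (like ET.fromstring) returns the root Element, while ET.parse returns an ElementTree whose root is obtained with getroot(). The sketch below illustrates this with a small hypothetical document written to a temporary file. The SAMPLE string and the sample.xml file name are assumptions made for illustration and are not part of the original test.xml.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document, much smaller than test.xml
SAMPLE = '<collection shelf="New Arrivals"><movie title="Trigun"/></collection>'

# ET.XML (like ET.fromstring) takes a string and returns the root Element
root_from_string = ET.XML(SAMPLE)

# ET.parse takes a file name or file object and returns an ElementTree
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write(SAMPLE)
    tree = ET.parse(path)

# The ElementTree wraps the document; getroot() gives the same kind of
# Element that ET.XML returned above
root_from_file = tree.getroot()

print(root_from_string.tag, root_from_string.attrib)
print(root_from_file.tag, root_from_file.attrib)
```

Holding on to the ElementTree is convenient when the document will be written back with tree.write, as the later examples do.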
3. Operations
Example 1: iterate over the whole document, then over one specific node
    [root@js-93 xml_test]# cat ex04.py
    #!/usr/bin/env python3
    #coding:utf-8
    import xml.etree.ElementTree as ET

    # Parse the XML file directly
    tree = ET.parse("test.xml")
    # Get the root node of the XML document
    root = tree.getroot()
    print('--1--'+root.tag)    # print the root node's tag

    # Walk the XML document
    print('--2--')
    for line in root:    # iterate over the second level (children of the root)
        print(line.tag,line.attrib)    # tag: the node's tag name; attrib: the node's attributes
        for i in line:
            print(i.tag,i.text)    # text: the node's content; prints tag name and content of each third-level node
    # Iterate over the year nodes
    print('--3--')
    for node in root.iter('year'):    # visit only the nodes named year
        print(node.tag,node.text)
    [root@js-93 xml_test]# python3 ex04.py
    --1--collection
    --2--
    movie {'title': 'Enemy Behind'}
    type War, Thriller
    format DVD
    year 2003
    rating PG
    stars 10
    description Talk about a US-Japan war
    movie {'title': 'Transformers'}
    type Anime, Science Fiction
    format DVD
    year 1989
    rating R
    stars 8
    description A schientific fiction
    movie {'title': 'Trigun'}
    type Anime, Action
    format DVD
    episodes 4
    rating PG
    stars 10
    description Vash the Stampede!
    movie {'title': 'Ishtar'}
    type Comedy
    format VHS
    rating PG
    stars 2
    description Viewable boredom
    --3--
    year 2003
    year 1989
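Besides iter, ElementTree offers find, findall and findtext, which accept a limited XPath-like path syntax. The following is a minimal sketch of targeted lookups on a trimmed-down, hypothetical version of the movie collection. The SAMPLE string is an assumption for illustration, not the original test.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down version of the movie collection
SAMPLE = """
<collection shelf="New Arrivals">
  <movie title="Enemy Behind"><year>2003</year><stars>10</stars></movie>
  <movie title="Trigun"><episodes>4</episodes><stars>10</stars></movie>
  <movie title="Ishtar"><stars>2</stars></movie>
</collection>
""".strip()

root = ET.fromstring(SAMPLE)

# findall: every direct child named "movie"
titles = [movie.get("title") for movie in root.findall("movie")]

# find with an attribute predicate: the first movie whose title matches
trigun = root.find("movie[@title='Trigun']")

# findtext: text of the first matching child, or the default when it is missing
years = {
    movie.get("title"): movie.findtext("year", default="unknown")
    for movie in root.findall("movie")
}

# Child predicate: only movies that contain a <year> element
with_year = [movie.get("title") for movie in root.findall("movie[year]")]

print(titles)
print(trigun.findtext("episodes"))
print(years)
print(with_year)
```

Unlike iter, which searches all descendants, find and findall with a bare tag name look only at direct children of the node they are called on.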
Example 2: modify and delete nodes
    [root@js-93 xml_test]# cat ex05.py
    #!/usr/bin/env python3
    #coding:utf-8
    import xml.etree.ElementTree as ET

    tree = ET.parse('test.xml')
    root = tree.getroot()

    # Modify
    # iter: search the descendants of the current node for every node with
    # the given name and return an iterator (usable in a for loop)
    for node in root.iter('year'):
        new_year = int(node.text) + 1
        node.text = str(new_year)
        node.set("updated","yes")    # set: set an attribute on the current node
        #del node.attrib['updated']    # an attribute can be deleted like this

    tree.write("test.xml")

    # Delete
    for line in root.findall('movie'):
        stars = int(line.find('stars').text)
        if stars < 8:
            root.remove(line)
    tree.write('update.xml')
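The same modify and delete steps can be tried in memory, without overwriting test.xml, by parsing a string and serializing with ET.tostring. This is a sketch on a hypothetical two-movie document (SAMPLE is an assumption for illustration). It also shows attrib.pop as one way to drop an attribute without raising when the attribute is absent.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-movie document used only for this illustration
SAMPLE = (
    '<collection>'
    '<movie title="Enemy Behind"><year>2003</year><stars>10</stars></movie>'
    '<movie title="Ishtar"><stars>2</stars></movie>'
    '</collection>'
)
root = ET.fromstring(SAMPLE)

# Modify: increment every <year> and mark it as updated
for node in root.iter("year"):
    node.text = str(int(node.text) + 1)
    node.set("updated", "yes")

# Remove the attribute again; pop with a default does not raise if it is absent
for node in root.iter("year"):
    node.attrib.pop("updated", None)

# Delete: findall returns a list, so removing children of root while
# looping over that list does not skip any element
for movie in root.findall("movie"):
    if int(movie.findtext("stars", default="0")) < 8:
        root.remove(movie)

# Serialize to a str instead of writing a file
result = ET.tostring(root, encoding="unicode")
print(result)
```

Looping over root directly (for movie in root) while calling root.remove can skip siblings, which is why the deletion loop iterates over the list returned by findall.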
4. Creating an XML document
Method 1: create each node with ET.Element and attach it with append

    [root@js-93 xml_test]# cat ex06.py
    #!/usr/bin/env python3
    #coding:utf-8
    from xml.etree import ElementTree as ET

    root = ET.Element('famliy')
    son1 = ET.Element('son',{'name':'儿1'})
    son2 = ET.Element('son',{'name':'儿2'})

    grandson1 = ET.Element('grandson',{'name':'儿11'})
    grandson2 = ET.Element('grandson',{'name':'儿12'})

    son1.append(grandson1)
    son1.append(grandson2)

    root.append(son1)
    root.append(son2)

    tree = ET.ElementTree(root)
    tree.write('family.xml',encoding='utf-8',short_empty_elements=False)
Method 2: create each node with makeelement and attach it with append

    [root@js-93 xml_test]# cat ex07.py
    #!/usr/bin/env python3
    #coding:utf-8
    from xml.etree import ElementTree as ET

    root = ET.Element('famliy')
    son1 = root.makeelement('son',{'name':'儿1'})    # makeelement: create a new node (not yet attached)
    son2 = root.makeelement('son',{'name':'儿2'})

    grandson1 = son1.makeelement('grandson',{'name':'儿11'})
    grandson2 = son1.makeelement('grandson',{'name':'儿12'})

    son1.append(grandson1)    # append: add a child node to the current node
    son1.append(grandson2)

    root.append(son1)
    root.append(son2)

    tree = ET.ElementTree(root)
    tree.write('family.xml',encoding='utf-8',short_empty_elements=False)
Output of methods 1 and 2:
    [root@js-93 xml_test]# cat family.xml
    <famliy><son name="儿1"><grandson name="儿11"></grandson><grandson name="儿12"></grandson></son><son name="儿2"></son></famliy>
Method 3: create and attach each node in one step with ET.SubElement

    [root@js-93 xml_test]# cat ex08.py
    #!/usr/bin/env python3
    #coding:utf-8
    from xml.etree import ElementTree as ET

    root = ET.Element('famliy')

    # SubElement creates the node and appends it to the given parent
    son1 = ET.SubElement(root,"son",attrib={"name":'儿1'})
    son2 = ET.SubElement(root,"son",attrib={"name":'儿2'})

    grandson1 = ET.SubElement(son1,"age",attrib={'name':'儿12'})
    grandson1.text = '孙子'

    et = ET.ElementTree(root)
    et.write("family_new.xml",encoding="utf-8",xml_declaration=True,short_empty_elements=False)
Output of method 3:

    [root@js-93 xml_test]# cat family_new.xml
    <?xml version='1.0' encoding='utf-8'?>
    <famliy><son name="儿1"><age name="儿12">孙子</age></son><son name="儿2"></son></famliy>
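The write calls above pass xml_declaration and short_empty_elements, and their effect is easiest to see side by side. The sketch below writes the same small tree twice into an in-memory buffer. The serialize helper and the family/son1 names are hypothetical, introduced only for this illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical minimal tree with one empty element
root = ET.Element("family")
ET.SubElement(root, "son", {"name": "son1"})


def serialize(short_empty_elements, xml_declaration):
    """Write the tree to an in-memory buffer and return the result as str."""
    buffer = io.BytesIO()
    ET.ElementTree(root).write(
        buffer,
        encoding="utf-8",
        xml_declaration=xml_declaration,
        short_empty_elements=short_empty_elements,
    )
    return buffer.getvalue().decode("utf-8")


# Declaration line plus explicit <son ...></son> closing tags
expanded = serialize(short_empty_elements=False, xml_declaration=True)

# No declaration, and empty elements collapsed to the <son ... /> form
collapsed = serialize(short_empty_elements=True, xml_declaration=False)

print(expanded)
print(collapsed)
```

Both forms are equivalent XML. The expanded form can be preferable when a downstream tool does not handle self-closing tags well.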