[Python] Quickly Parsing Database View XML Configuration to Extract Field Descriptions
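In my current project, the database developers gave me XML view files that contain the table information. That information is buried in a large amount of UI configuration and is hard to read, so I decided to write a simple Python program to parse the XML, convert the field information I needed into CSV format, and import it into Excel (this took about 2 hours). A few technical takeaways: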
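- When parsing with minidom in Python, the XML file should be UTF-8 encoded, otherwise parsing can fail. The underlying expat parser natively supports only a few encodings (UTF-8, UTF-16, ISO-8859-1 and ASCII), so files in other encodings should be converted before parsing;
- To read the text of a DOM element node in Python, use firstChild.nodeValue. The element's own nodeValue is None, because the text is stored in a child text node;
- The string values returned by the parser are Unicode strings, so file I/O on them should go through codecs with an explicit encoding such as UTF-8;
- While writing the program I gradually moved from a line-by-line script to an OOP style. This takes more steps, but makes debugging easier;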
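The first two points can be illustrated with a minimal sketch. It assumes Python 3 and a made-up sample document encoded in GBK; the element names mirror the ones used in this post, while the helpers `to_utf8_xml` and `text_of` are hypothetical names introduced only for this example.

```python
import xml.dom.minidom

# Hypothetical sample; the content is invented for illustration.
SAMPLE = '''<?xml version="1.0" encoding="GBK"?>
<BizObject>
  <EName>V_ORDER</EName>
  <CName>订单视图</CName>
  <Description></Description>
</BizObject>'''


def to_utf8_xml(raw, source_encoding):
    """Decode raw bytes, then re-encode them as UTF-8 with a matching declaration.

    Assumes the declaration spells the encoding exactly as source_encoding.
    """
    text = raw.decode(source_encoding)
    text = text.replace('encoding="%s"' % source_encoding,
                        'encoding="UTF-8"', 1)
    return text.encode("utf-8")


def text_of(element, tag_name, default=""):
    """Return the text of the first <tag_name> descendant.

    Falls back to default when the tag is missing or has no text child.
    """
    nodes = element.getElementsByTagName(tag_name)
    if not nodes or nodes[0].firstChild is None:
        return default
    return nodes[0].firstChild.nodeValue


raw_gbk = SAMPLE.encode("gbk")  # simulate a file that is not UTF-8
doc = xml.dom.minidom.parseString(to_utf8_xml(raw_gbk, "GBK"))
root = doc.documentElement
ename = root.getElementsByTagName("EName")[0]

print(ename.nodeValue)             # None
print(ename.firstChild.nodeValue)  # V_ORDER
print(text_of(root, "Description", default="(none)"))  # (none)
```

The guard in `text_of` matters for empty elements such as `<Description></Description>`: their `firstChild` is `None`, so calling `firstChild.nodeValue` on them directly would raise an `AttributeError`.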
The reference code is as follows:
import codecs
import os
import xml.dom.minidom


class dbviewxmladapter:
    """Extract field descriptions from database view XML files as CSV lines."""

    def __init__(self):
        self._version = "0.1"
        self._path = "e:\\Temp\\Work"
        self._files = []
        self._lines = []

    def setPath(self, path):
        self._path = path

    def addFile(self, filename):
        self._files.append(filename)

    def getNodeValue(self, element, tagName):
        # The element's own nodeValue is None; the text lives in its first child.
        # Raises if the tag is missing or the element is empty.
        return element.getElementsByTagName(tagName)[0].firstChild.nodeValue

    def getSubNodeValue(self, element, tagName):
        # Database attributes are nested under <BizObjPropertyDBInfo>.
        subNode = element.getElementsByTagName('BizObjPropertyDBInfo')[0]
        return subNode.getElementsByTagName(tagName)[0].firstChild.nodeValue

    def parseXml(self):
        try:
            for name in self._files:
                filename = os.path.join(self._path, name)
                print(filename)
                # Passing the path lets minidom open and close the file itself.
                doc = xml.dom.minidom.parse(filename)
                # First line per file: view name, empty columns, Chinese name.
                bizObject = doc.getElementsByTagName('BizObject')[0]
                viewEName = self.getNodeValue(bizObject, 'EName')
                viewCName = self.getNodeValue(bizObject, 'CName')
                line = viewEName + ', , , , , , ' + viewCName
                self._lines.append(line)
                # One line per field (BizObjProperty element).
                items = doc.getElementsByTagName('BizObjProperty')
                for item in items:
                    EName = self.getNodeValue(item, 'EName')
                    CName = self.getNodeValue(item, 'CName')
                    Description = self.getNodeValue(item, 'Description')
                    Type = self.getSubNodeValue(item, 'Type')
                    Length = self.getSubNodeValue(item, 'Length')
                    Size = self.getSubNodeValue(item, 'Size')
                    IsPK = self.getSubNodeValue(item, 'IsPK') == '1'
                    IsNullable = self.getSubNodeValue(item, 'IsNullable') == '1'
                    line = ','.join([EName, Type, Length, Size, str(IsPK),
                                     str(IsNullable), CName + ':' + Description])
                    self._lines.append(line)
        finally:
            print("over")

    def printLines(self):
        for line in self._lines:
            print(line)

    def writeToCSVFile(self, outfilename):
        filename = os.path.join(self._path, outfilename)
        # codecs.open encodes the Unicode lines as UTF-8 on write;
        # the with block closes the file even if a write fails.
        with codecs.open(filename, 'w', 'utf-8') as f:
            for line in self._lines:
                f.write(line + '\n')
# TestSuite Scripts
aObject = dbviewxmladapter()
for i in range(5):
    filename = str(i+1) + ".xml"
    aObject.addFile(filename)
#aObject.addFile("5.xml")
aObject.parseXml()
#aObject.printLines()
aObject.writeToCSVFile("all.csv")
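One caveat with joining fields by hand is that a description containing a comma shifts the columns once the file is opened in Excel. As a possible alternative (not what the code above does), the standard `csv` module can handle the quoting. The sketch below assumes Python 3; the function names `rows_to_csv_text` and `write_csv` and the sample row are hypothetical.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize rows to CSV text, quoting fields that contain commas or quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def write_csv(path, rows):
    """Write rows to path as UTF-8 with a BOM.

    The BOM ("utf-8-sig") is an assumption made here because it usually
    helps Excel recognize the file as UTF-8.
    """
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(rows_to_csv_text(rows))


# Hypothetical field row: the last column contains a comma.
sample_rows = [["ORDER_ID", "int", "4", "0", "True", "False", "订单号:主键, 自增"]]
print(rows_to_csv_text(sample_rows), end="")
# ORDER_ID,int,4,0,True,False,"订单号:主键, 自增"
```

With this approach, `parseXml` would collect lists of field values instead of pre-joined strings, and the writer would decide where quoting is needed.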
