
A detailed guide to parsing XML files in Python

Editor: Python

1. How do you parse a simple XML file?

A practical case:

XML is a very common markup language. It provides a uniform way to describe an application's structured data:

<?xml version="1.0" encoding="utf-8" ?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
</data>

How can Python parse an XML file like this?

Solution:

Use xml.etree.ElementTree from the standard library. Its parse function parses an XML file and returns an ElementTree object.
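As a minimal sketch of the two common entry points: parse takes a filename or a file object, while fromstring takes XML already held in memory and returns the root element directly. The XML_BYTES constant and the two helper function names below are made up for illustration, and the data is a shortened copy of the sample above rather than the demo.xml file itself.

```python
import io
from xml.etree.ElementTree import parse, fromstring

# Hypothetical in-memory document (a shortened copy of the sample, not demo.xml itself)
XML_BYTES = b"""<?xml version="1.0" encoding="utf-8"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
    </country>
</data>
"""


def root_tag_from_file_object(xml_bytes):
    # parse() accepts a filename or a file object and returns an ElementTree
    tree = parse(io.BytesIO(xml_bytes))
    return tree.getroot().tag


def root_tag_from_string(xml_bytes):
    # fromstring() parses in-memory XML and returns the root Element directly
    return fromstring(xml_bytes).tag


print(root_tag_from_file_object(XML_BYTES))
print(root_tag_from_string(XML_BYTES))
```

Both calls end up at the same root element; the only difference is whether you hold an ElementTree wrapper or the root Element itself.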

2. Code demonstration

(1) Parsing an XML file with parse

from xml.etree.ElementTree import parse
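# The first argument is the input source (a filename or a file object); parse returns an ElementTree object
# Open in binary mode so the parser applies the encoding declared in the XML file
with open('demo.xml', 'rb') as f:
    et = parse(f)
# Get the root node from the element tree (ElementTree)
root = et.getroot()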
print(root)
# Show the tag name
print(root.tag)
# Show the attributes (a dict)
print(root.attrib)
# Show the text with surrounding whitespace stripped (text is None when the element has no text)
print((root.text or '').strip())
# Traverse the element tree
# Get the node's child elements; getchildren() was deprecated and removed in Python 3.9
children = list(root)
print(children)
# Get the name attribute of each child element
for child in root:
    print(child.get('name'))
# Given a plain tag name, find, findall and iterfind search only the direct
# children of the current element; they do not find grandchildren.
# Find a child element by tag; find always returns the first match (or None)
print(root.find('country'))
# findall returns a list of all matching elements
print(root.findall('country'))
# If you want an iterable rather than a list, iterfind returns a generator
print(root.iterfind('country'))
for e in root.iterfind('country'):
    print(e.get('name'))
# iter() can find rank tags at any level
# With no argument, it yields the current node and every element below it
print(list(root.iter()))
# Recursively search for elements with the tag rank
print(list(root.iter('rank')))
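The difference between the direct-child searches and the recursive iter can be sketched with an in-memory copy of the sample data. This is only an illustration: XML_TEXT and the variable names are assumptions, and fromstring is used in place of parse so the example does not depend on a demo.xml file on disk.

```python
from xml.etree.ElementTree import fromstring

# In-memory copy of the sample data (gdppc omitted for brevity)
XML_TEXT = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
</data>"""

root = fromstring(XML_TEXT)

# rank is a grandchild of root, so a plain tag name does not reach it
direct = root.find('rank')
# Spelling out the path from root does reach it
nested = root.find('country/rank')
# iter() walks the whole subtree below root
recursive = list(root.iter('rank'))
# Collect an attribute from every neighbor element, at any depth
neighbor_names = [n.get('name') for n in root.iter('neighbor')]

print(direct)
print(nested.text)
print(len(recursive))
print(neighbor_names)
```

Because find returns None when nothing matches, it is worth checking the result before reading attributes such as .text from it.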

(2) Advanced findall searches

from xml.etree.ElementTree import parse
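# The first argument is the input source (a filename or a file object); parse returns an ElementTree object
# Open in binary mode so the parser applies the encoding declared in the XML file
with open('demo.xml', 'rb') as f:
    et = parse(f)
# Get the root node from the element tree (ElementTree)
root = et.getroot()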
# * matches any child element, so country/* finds all of root's grandchildren under country
print(root.findall('country/*'))
# Find descendants at any level: . is the current node, .. is the parent node
print(root.findall('.//rank'))
# Parents of every rank element
print(root.findall('.//rank/..'))
# @ selects elements that have a given attribute: [@attrib]
print(root.findall('country[@name]'))
# Require the attribute to have a specific value: [@attrib='value']
print(root.findall('country[@name="Singapore"]'))
# Require the element to contain a given child element: [tag]
print(root.findall('country[rank]'))
# Require the child element's text to equal a specific value: [tag='text']
print(root.findall('country[rank="5"]'))
# Select by position among siblings matching the path: [position], counting from 1
print(root.findall('country[1]'))
print(root.findall('country[2]'))
# last() selects the last one
print(root.findall('country[last()]'))
# Select the second to last
print(root.findall('country[last()-1]'))
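The sample file shown earlier contains a single country, so queries such as country[2] or country[@name="Singapore"] return an empty list against it. The sketch below runs the same path expressions against a hypothetical three-country data set; the extra countries, their values, and the names helper are assumptions made purely for illustration.

```python
from xml.etree.ElementTree import fromstring

# Hypothetical data: the sample extended with two more countries (illustrative values)
XML_TEXT = """<data>
    <country name="Liechtenstein"><rank>2</rank><year>2008</year></country>
    <country name="Singapore"><rank>5</rank><year>2011</year></country>
    <country name="Panama"><rank>69</rank><year>2011</year></country>
</data>"""

root = fromstring(XML_TEXT)


def names(elements):
    """Return the name attribute of each element, for compact display."""
    return [e.get('name') for e in elements]


# Attribute and child-text predicates
by_attr = names(root.findall('country[@name="Singapore"]'))
by_child_text = names(root.findall('country[rank="5"]'))
# Position predicates count from 1; last() is the final match
first = names(root.findall('country[1]'))
last = names(root.findall('country[last()]'))
penultimate = names(root.findall('country[last()-1]'))
# .. steps up from each rank element to its parent
rank_parents = names(root.findall('.//rank/..'))
# * matches every child of each country
grandchild_tags = [e.tag for e in root.findall('country/*')]

print(by_attr, by_child_text)
print(first, last, penultimate)
print(rank_parents)
print(grandchild_tags)
```

Note that findall always returns a list, so a predicate that matches nothing gives an empty list and does not raise an error.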