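XML parsing in Python using ElementTree

I am using Python and ElementTree to parse an XML file. I want to build a list of dictionaries, one per CD, holding all of that CD's information. Later I can use this list to look things up, for example to show the titles of the CDs from the USA. The code below works, but it breaks easily if the YEAR tag is not the last tag of a CD. How can I rewrite this code so that the tags can appear in any order?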
from xml.etree.ElementTree import ElementTree
f = open("cd_catalog.xml")
tree = ElementTree()
tree.parse(f)
catalog = []
cd = {}
for node in tree.iter():
    if node.tag != "CD" and node.tag != "CATALOG":
        tagtext = (node.tag,node.text),
        cd.update(tagtext)
    if node.tag == "YEAR":
        catalog.append(cd)
        cd = {}
for cd in catalog:
    if cd["COUNTRY"] == "USA":
        print("The cd named {0} is from USA".format(cd["TITLE"]))
The XML file, with 2 entries:
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
</CATALOG>
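
One possible way to drop the dependency on YEAR being last is to loop over the CD elements themselves and build each dictionary from that element's children, so that the end of a record is given by the XML structure rather than by a particular tag. The following is only a sketch under some assumptions: the helper name parse_catalog and the inline SAMPLE_XML string are made up for illustration (the string stands in for cd_catalog.xml so the example runs on its own), and YEAR has been moved to the front of the first CD on purpose to show that the order no longer matters.

```python
import xml.etree.ElementTree as ET

# Assumption: an inline copy of the sample data, used instead of
# cd_catalog.xml so this sketch is self-contained. YEAR is deliberately
# placed first in the first CD to exercise a different tag order.
SAMPLE_XML = """<CATALOG>
  <CD>
    <YEAR>1985</YEAR>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
</CATALOG>"""


def parse_catalog(root):
    """Return one dict per <CD> element, mapping child tag to child text.

    Hypothetical helper: each <CD> element is handled as a unit, so the
    order of its child tags has no effect on the result.
    """
    return [{child.tag: child.text for child in cd} for cd in root.iter("CD")]


root = ET.fromstring(SAMPLE_XML)
catalog = parse_catalog(root)

for cd in catalog:
    # .get() avoids a KeyError if a CD happens to have no COUNTRY tag.
    if cd.get("COUNTRY") == "USA":
        print("The cd named {0} is from USA".format(cd["TITLE"]))
```

To read from the real file instead, root = ET.parse("cd_catalog.xml").getroot() should be a drop-in replacement for the ET.fromstring call. All values stay strings (for example "1985"), the same as in the original code.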