2017-07-31

Parsing name/value pairs in XML

I'm trying to extract account details from XML files provided by vendors.

One vendor provides an XML file like this:

<Accounts> 
    <Account> 
    <AccountNumber>1234567</AccountNumber> 
    <Balance>$200.00</Balance> 
    </Account> 
    <Account> 
    ... 
    </Account> 
</Accounts> 

I can parse this easily with Python:

import xml.etree.ElementTree as et

mytree = et.parse(xml_path)  # xml_path: path to the vendor's XML file
myroot = mytree.getroot()

for acc in myroot.findall('Account'):
    acctnum = acc.find('AccountNumber').text
    balance = acc.find('Balance').text
    print(acctnum, balance)

which outputs something like this:

1234567 $200.00

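However, another vendor provides an XML file that looks more like name/value pairs, and I'm not sure how to access that data easily. It doesn't work the same way as above: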

<Accounts> 
    <Account> 
    <field name='AccountNumber' value='1234567' /> 
    <field name='Balance' value='$200.00' /> 
    </Account> 
    <Account> 
    ... 
    </Account> 
</Accounts> 

So far I've got this, but I'd like to be able to get at the values individually:

mytree = et.parse(xml_path) 
myroot = mytree.getroot() 

for field in myroot.findall('Account'): 
    for line in field: 
        print(line.attrib)

which outputs something like:

{'name': 'AccountNumber', 'value': '1234567'} 
{'name': 'Balance', 'value': '$200.00'} 

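So my question is this: how can I access the values and assign them to variables (based on name) so that I can use them elsewhere in the script, as I did with acctnum and balance in the first example?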

Answers


Populate a new data structure (such as a dict) from each field as you iterate, rather than just discarding it:

for account in myroot.findall('Account'):
    account_d = {}  # fresh dict per account, so accounts don't overwrite each other
    for field in account:
        account_d[field.attrib['name']] = field.attrib['value']

    # for the first account, account_d should now be:
    # {'AccountNumber': '1234567', 'Balance': '$200.00'}

You can use a list of lists/tuples too:

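for account in myroot.findall('Account'):
    account_a = []  # fresh list per account
    for field in account:
        # append() takes a single argument, so pass the pair as a tuple
        account_a.append((field.attrib['name'], field.attrib['value']))

    # for the first account, account_a should now be:
    # [('AccountNumber', '1234567'), ('Balance', '$200.00')]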
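
If you want to keep every account around for later use rather than only the one from the current loop iteration, one option is to collect the per-account dicts into a list. This is only a sketch: `parse_accounts` and `SAMPLE` are names made up for the example, and the second account's values are invented.

```python
import xml.etree.ElementTree as et

# Sample data in the second vendor's layout; the second account is made up.
SAMPLE = """\
<Accounts>
    <Account>
        <field name='AccountNumber' value='1234567' />
        <field name='Balance' value='$200.00' />
    </Account>
    <Account>
        <field name='AccountNumber' value='7654321' />
        <field name='Balance' value='$50.00' />
    </Account>
</Accounts>"""


def parse_accounts(xml_text):
    """Return a list with one {name: value} dict per <Account>."""
    root = et.fromstring(xml_text)
    accounts = []
    for account in root.findall('Account'):
        # Build a fresh dict for every account so values never mix.
        accounts.append({f.get('name'): f.get('value')
                         for f in account.findall('field')})
    return accounts


for acct in parse_accounts(SAMPLE):
    print(acct['AccountNumber'], acct['Balance'])
```

Each dict can then be passed to the rest of the script and read by field name.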

ElementTree 1.3 has the ability to locate nodes with specific attributes:

from xml.etree import ElementTree as et 

data = '''\ 
<Accounts> 
    <Account> 
    <field name='AccountNumber' value='1234567' /> 
    <field name='Balance' value='$200.00' /> 
    </Account> 
    <Account> 
    <field name='AccountNumber' value='9999999' /> 
    <field name='Balance' value='$300.00' /> 
    </Account> 
</Accounts>''' 

tree = et.fromstring(data) 

for acc in tree.iterfind('Account'): 
    acctnum = acc.find("field[@name='AccountNumber']").attrib['value'] 
    balance = acc.find("field[@name='Balance']").attrib['value'] 
    print(acctnum, balance)

Output:

1234567 $200.00
9999999 $300.00
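
One thing to watch with this approach: `find()` returns `None` when no matching `<field>` exists, so `.attrib['value']` would raise an `AttributeError` on an incomplete account. A small wrapper can guard against that. This is a sketch; `field_value` is a hypothetical helper, not part of ElementTree, and it assumes field names contain no single quotes (they would break the predicate string).

```python
from xml.etree import ElementTree as et


def field_value(account, name, default=None):
    """Return the value attribute of the <field> with the given name."""
    # [@attr='value'] is part of ElementTree's limited XPath support.
    node = account.find("field[@name='{}']".format(name))
    if node is None:
        return default
    return node.get('value', default)


account = et.fromstring(
    "<Account><field name='AccountNumber' value='1234567' /></Account>")

print(field_value(account, 'AccountNumber'))           # field is present
print(field_value(account, 'Balance', default='n/a'))  # field is missing
```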

You can do this by collecting the field attributes of each Account element into a dictionary and then using the information in it as needed:

Sample input file accounts.xml:

<?xml version="1.0"?> 
<Accounts> 
    <Account> 
    <field name='AccountNumber' value='1234567' /> 
    <field name='Balance' value='$200.00' /> 
    </Account> 
    <Account> 
    <field name='AccountNumber' value='89' /> 
    <field name='Balance' value='$100.00' /> 
    </Account> 
</Accounts> 

Code:

import xml.etree.ElementTree as et 

xml_path = 'accounts.xml' 
mytree = et.parse(xml_path) 
myroot = mytree.getroot() 

for acct in myroot.findall('Account'): 
    info = {field.attrib['name']: field.attrib['value']
            for field in acct.findall('field')}
    acctnum, balance = info['AccountNumber'], info['Balance'] 
    print(acctnum, balance) 

Result:

1234567 $200.00 
89 $100.00
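
If you'd rather have attribute-style access than dict lookups when using the values elsewhere in the script, the same dictionary can feed a small record type. This is a sketch: the `Account` namedtuple and `to_account` are hypothetical names, and the balance parsing assumes values shaped like `$1,234.56`.

```python
import xml.etree.ElementTree as et
from collections import namedtuple
from decimal import Decimal

# Hypothetical record type; the field names are chosen for illustration.
Account = namedtuple('Account', ['number', 'balance'])


def to_account(element):
    """Build an Account record from one <Account> element."""
    info = {f.attrib['name']: f.attrib['value']
            for f in element.findall('field')}
    # Assumes balances look like '$1,234.56'; other formats need other parsing.
    amount = Decimal(info['Balance'].replace('$', '').replace(',', ''))
    return Account(number=info['AccountNumber'], balance=amount)


root = et.fromstring("""\
<Accounts>
    <Account>
        <field name='AccountNumber' value='1234567' />
        <field name='Balance' value='$200.00' />
    </Account>
    <Account>
        <field name='AccountNumber' value='89' />
        <field name='Balance' value='$100.00' />
    </Account>
</Accounts>""")

accounts = [to_account(a) for a in root.findall('Account')]
total = sum(a.balance for a in accounts)
print(accounts[0].number, accounts[0].balance, total)
```

Converting the balance to `Decimal` is optional; it only matters if the script needs to do arithmetic on the amounts.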

Question: How can I access the values and assign them to variables (based on name)?

Convert all accounts to a Dict[AccountNumber] of Dict[field].
The name attribute becomes the dict key:

# root is the parsed <Accounts> element, e.g. et.parse(xml_path).getroot()
Accounts = {}
for account in root.findall('Account'):
    fields = {}
    for field in account.findall('field'):
        fields[field.attrib['name']] = field.attrib['value']

    print('{a[AccountNumber]} {a[Balance]}'.format(a=fields)) 
    Accounts[fields['AccountNumber']] = fields 

print(Accounts) 

Output:

1234567 $200.00 
9999999 $300.00 
{'9999999': {'AccountNumber': '9999999', 'Balance': '$300.00'}, '1234567': {'AccountNumber': '1234567', 'Balance': '$200.00'}} 

Tested with Python: 3.4.2
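
Since the question involves two vendor layouts, the same dict-building idea can be stretched to cover both, so the rest of the script doesn't need to care which vendor a file came from. This is a rough sketch; `account_to_dict` is a hypothetical helper, and it assumes an Account contains either child elements with text or `<field>` name/value elements.

```python
import xml.etree.ElementTree as et


def account_to_dict(account):
    """Map one <Account> element to a {name: value} dict for either layout."""
    result = {}
    for child in account:
        if child.tag == 'field':
            # Second vendor: <field name='...' value='...' />
            result[child.get('name')] = child.get('value')
        else:
            # First vendor: <AccountNumber>1234567</AccountNumber>
            result[child.tag] = child.text
    return result


vendor_a = et.fromstring(
    "<Account><AccountNumber>1234567</AccountNumber>"
    "<Balance>$200.00</Balance></Account>")
vendor_b = et.fromstring(
    "<Account><field name='AccountNumber' value='1234567' />"
    "<field name='Balance' value='$200.00' /></Account>")

# Both layouts should normalize to the same dictionary.
print(account_to_dict(vendor_a) == account_to_dict(vendor_b))
```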