2013-02-08 60 views
5

Python: avoiding nested loops over arrays

I am using etree to recurse through an XML file.

import xml.etree.ElementTree as etree

tree = etree.parse('x.xml')
root = tree.getroot()
for child in root[0]:
    for child in child.getchildren():
        for child in child.getchildren():
            for child in child.getchildren():
                print(child.attrib)

What is the idiomatic way to avoid these nested for loops in Python?

getchildren() ⇒ list of Element instances
    Returns all subelements. The elements are returned in document order. 

Returns: 
A list of subelements. 

I have seen some posts on SO, such as Avoiding nested for loops, but they do not translate directly to my use case.

Thanks.

+1

`itertools.product` is a good way to avoid nested loops. Why doesn't that translate to your use case? – 2013-02-08 20:13:29

+0

Are you specifically after the attributes of the elements four children deep? – bogatron 2013-02-08 20:15:36

+0

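Sorry, I did not mean that itertools.product would not work for me, only that I could not translate that example to an array like mine. I have not done much Python, but I will try. – bsr 2013-02-08 20:30:21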

Answers

3

If you want to get the elements that are n levels deep in the tree and then iterate over them, you can do:

def childrenAtLevel(tree, n):
    # Iterating an Element yields its direct children;
    # getchildren() was removed in Python 3.9.
    if n == 1:
        for child in tree:
            yield child
    else:
        for child in tree:
            for e in childrenAtLevel(child, n - 1):
                yield e

Then, to get the elements four levels deep, you would simply say:

for e in childrenAtLevel(root, 4):
    print(e.attrib)  # do something with e
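
As a quick check of the level-based generator above, here is a minimal sketch on a small in-memory document. The `SAMPLE` XML, its `id` attributes, and the name `children_at_level` are made up for illustration; the function is only a restatement of `childrenAtLevel` using `yield from`. The last lines show an alternative that is not part of the answer above: ElementTree's limited XPath support, where each `*` step selects all children one level further down.

```python
import xml.etree.ElementTree as etree

# Hypothetical sample document: the <d> elements sit four levels below <root>.
SAMPLE = """
<root>
  <a>
    <b>
      <c>
        <d id="d1"/>
        <d id="d2"/>
      </c>
    </b>
    <b>
      <c>
        <d id="d3"/>
      </c>
    </b>
  </a>
</root>
"""


def children_at_level(node, n):
    """Yield every element exactly n levels below node (n >= 1)."""
    for child in node:  # iterating an Element yields its direct children
        if n == 1:
            yield child
        else:
            yield from children_at_level(child, n - 1)


root = etree.fromstring(SAMPLE.strip())

# Recursive generator: elements four levels below the root.
ids = [e.attrib["id"] for e in children_at_level(root, 4)]
print(ids)

# Alternative (an assumption about what fits your data): a path with four
# wildcard steps selects the same elements via findall().
path_ids = [e.attrib["id"] for e in root.findall("*/*/*/*")]
print(path_ids)
```

Both lists come out in document order. The path form is shorter when the depth is fixed and known up front; the generator is easier to use when `n` is computed at run time.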

Alternatively, if you want to get all the leaf nodes (i.e. nodes that have no children of their own), you can do:

def getLeafNodes(tree):
    if len(tree) == 0:
        # No children: this element is a leaf.
        yield tree
    else:
        # Iterating an Element yields its direct children.
        for child in tree:
            for leaf in getLeafNodes(child):
                yield leaf
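
To see what the leaf-node generator above returns when leaves sit at different depths, here is a small sketch. The `SAMPLE` document and the name `get_leaf_nodes` are hypothetical; the function restates `getLeafNodes` with `yield from`. For comparison it also filters `Element.iter()`, which walks the whole subtree in document order, so no recursion is needed. That second form is an alternative, not something the answer above proposes.

```python
import xml.etree.ElementTree as etree

# Hypothetical sample document with leaves at depths 2, 3 and 1.
SAMPLE = (
    "<root>"
    "<a id='a'><b id='b1'/><b id='b2'><c id='c1'/></b></a>"
    "<e id='e1'/>"
    "</root>"
)


def get_leaf_nodes(node):
    """Yield every element in the subtree of node that has no children."""
    if len(node) == 0:  # len() of an Element is its number of children
        yield node
    else:
        for child in node:
            yield from get_leaf_nodes(child)


root = etree.fromstring(SAMPLE)

# Recursive version.
recursive_leaves = [e.attrib["id"] for e in get_leaf_nodes(root)]
print(recursive_leaves)

# Non-recursive alternative: iter() visits root and all descendants,
# so keeping only the childless elements gives the same leaves.
iter_leaves = [e.attrib["id"] for e in root.iter() if len(e) == 0]
print(iter_leaves)
```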

2

itertools.chain.from_iterable flattens one level of nesting; you can use functools.reduce to apply it n times (see Compressing "n"-time object member call):

from itertools import chain 
from functools import reduce 

for child in reduce(lambda x, _: chain.from_iterable(x), range(3), root): 
    print(child.attrib) 

Note that getchildren is deprecated; iterating over a node directly yields its children.
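
The reduce/chain approach can be tried on a small in-memory document. This is only a sketch: the `SAMPLE` XML, its `id` attributes, and the helper name `descend` are assumptions made for illustration. It also checks that starting from `root[0]`, as the loops in the question do, gives the same elements as four explicit nested loops.

```python
import xml.etree.ElementTree as etree
from functools import reduce
from itertools import chain

# Hypothetical sample document: <c> is four levels below <root>, <d> is five.
SAMPLE = """
<root>
  <group>
    <a>
      <b>
        <c id="c1"><d id="d1"/></c>
        <c id="c2"><d id="d2"/><d id="d3"/></c>
      </b>
    </a>
  </group>
</root>
"""


def descend(elements, n):
    """Apply chain.from_iterable n times to an iterable of elements.

    Each application replaces the elements with their children, so the
    result is n levels deeper than what iterating `elements` yields.
    """
    return reduce(lambda level, _: chain.from_iterable(level), range(n), elements)


root = etree.fromstring(SAMPLE.strip())

# Iterating root yields depth 1; three flattening steps reach depth 4.
from_root = [e.attrib["id"] for e in descend(root, 3)]
print(from_root)

# Starting at root[0], as in the question, reaches one level further down.
from_first_child = [e.attrib["id"] for e in descend(root[0], 3)]
print(from_first_child)

# The explicit nested loops from the question, for comparison.
nested = []
for w in root[0]:
    for x in w:
        for y in x:
            for z in y:
                nested.append(z.attrib["id"])
print(nested)
```

Because `chain.from_iterable` is lazy, no intermediate lists are built; elements are produced only as the final loop consumes them.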