Python lxml (objectify): Checking whether a tag exists

I need to check whether a certain tag exists in an XML file.

For example, I would like to see whether a tag exists in this snippet:

<main> 
     <elem1/> 
     <elem2>Hi</elem2> 
     <elem3/> 
     ... 
</main>

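At the moment I am using an ugly hack based on error checking:

try:
    if root.elem1.tag:
        foo = root.elem1
except AttributeError:
    foo = "error finding elem1"

I would also like a custom string when the node cannot be found (i.e. "Unable to find -tagname-").

I have to check a long list of variables, and I don't want to repeat this code 100 times.

Any suggestions?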

Edit:

Here is a snippet of the actual XML file:

<main>
  <asset name="Virtual Dvaered Unpresence">
    <virtual/>
    <presence>
      <faction>Dvaered</faction>
      <value>-1000.000000</value>
      <range>0</range>
    </presence>
  </asset>
  <asset name="Virtual Empire Small">
    <virtual/>
    <presence>
      <faction>Empire</faction>
      <value>100.000000</value>
      <range>2</range>
    </presence>
  </asset>
</main>

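I want to check whether a tag exists and, if it does, get its contents.

Edit 2: OK, I would like to combine two of the answers, but I can only accept one. Sorry.

Edit 3: A related question about XPath: Python lxml (objectify): Xpath troubles

Source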

2011-03-22 Biosci3c

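Assuming you want the value of elem2, you can find it with XPath.

from io import StringIO
from lxml import etree
tree = etree.parse(StringIO(xmlString)).getroot()  # XML parser; HTMLParser would wrap <main> in <html><body>
youWantValue = tree.xpath('/main/elem2')[0].text

Source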

2011-03-22 04:17:47 Daniel

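What happens if the node doesn't exist? Does it raise an error, or return a blank value? – Biosci3c 2011-03-22 05:44:49

@Biosci3c That specific example raises an error, because the '[0]' tries to dereference the first value returned by the xpath call. If, on the other hand, you check whether the list is empty before dereferencing, you get a test that raises no error. By the way, I think this is the best-practice answer among those given. – 2011-03-22 11:45:45

OK, I like the XPath suggestion, so I will use it as well. By the way, I think you are missing a closing parenthesis at the end of the top line. – Biosci3c 2011-03-28 02:39:47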
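The empty-list check described in the comments above might look something like the following. This is only a sketch: `first_text` is a hypothetical helper name, not part of lxml, and the sample document is the snippet from the question.

```python
from lxml import etree

XML = "<main><elem1/><elem2>Hi</elem2><elem3/></main>"


def first_text(root, path, default=None):
    """Return the text of the first XPath match, or `default` if nothing matches."""
    matches = root.xpath(path)
    if not matches:
        # An empty list means the node does not exist, so skip the [0] lookup.
        return default
    return matches[0].text


root = etree.fromstring(XML)
print(first_text(root, "/main/elem2"))
print(first_text(root, "/main/missing", "Unable to find missing"))
```

Note that an element which exists but is empty, such as `<elem1/>`, has `text` equal to `None`, so a `None` result on its own does not prove the tag is absent; passing a distinct `default` avoids that ambiguity.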

If your files tend to be fairly short, you can iterate over all the children of <main> and look for tags that match a set of names:

import lxml.etree

tree = lxml.etree.fromstring(DATA)
NAMES = set(['elem1', 'elem3'])
for node in tree.iterchildren():
    if node.tag in NAMES:
        print('found', node.tag)

Or you can search for each name one at a time:

for tag in ('elem1', 'elem3'):
    if tree.find(tag) is not None:
        print('found', tag)

Source

2011-03-22 01:51:44 samplebias

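The files I am working with are quite long. I will mention that in the question. – Biosci3c 2011-03-22 01:58:49

Also, is the first line what sets up the search scope? – Biosci3c 2011-03-22 02:03:34

Edit: Updated answer for the sample file.

I assume you want to search each asset for certain tags. If so, the following works for me: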

import lxml.objectify

# Parse the file.
tree = lxml.objectify.parse('sample.xml')
root = tree.getroot()

# Which elements to find.
to_find = set(['presence/faction', 'presence/value', 'fake'])

# Go through each asset in the document.
for asset in root.findall('asset'):
    # Check for each element.
    for name in to_find:
        node = asset.find(name)
        if node is not None:
            print('Found %s, its value is %s' % (name, node))
        else:
            print('Unable to find %s' % name)

The output is:

Found presence/value, its value is -1000.0 
Found presence/faction, its value is Dvaered 
Unable to find fake 
Found presence/value, its value is 100.0 
Found presence/faction, its value is Empire 
Unable to find fake

Source

2011-03-22 02:03:35 Blair

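This looks like it will work well. I will try it when I get a chance. Just to clarify, are you using set() with a list as its argument? – Biosci3c 2011-03-22 05:41:40

Yes. The constructor takes an iterable that supplies the initial entries of the set. See the [documentation](http://docs.python.org/library/stdtypes.html#set) for details. – Blair 2011-03-22 23:01:19

OK, one question. How do I assign these to specific variables (i.e. var_fac = presence/faction, var_value = presence/value)? – Biosci3c 2011-03-27 22:24:02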
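One possible way to approach the question in the last comment is to keep the results in a dictionary keyed by the desired variable name, rather than creating separate variables. The sketch below is an assumption about what was wanted, not something from the answer itself: `PATHS` and `collect` are hypothetical names, and the XML is a trimmed copy of the sample file.

```python
import lxml.objectify

XML = """
<main>
  <asset name="Virtual Empire Small">
    <virtual/>
    <presence>
      <faction>Empire</faction>
      <value>100.000000</value>
      <range>2</range>
    </presence>
  </asset>
</main>
"""

# Hypothetical mapping from a result name to the path searched under each asset.
PATHS = {
    "var_fac": "presence/faction",
    "var_value": "presence/value",
    "var_fake": "fake",
}


def collect(asset, paths):
    """Look up each path under `asset`; a missing node yields a message string."""
    found = {}
    for key, path in paths.items():
        node = asset.find(path)
        found[key] = node if node is not None else "Unable to find %s" % path
    return found


root = lxml.objectify.fromstring(XML)
for asset in root.findall("asset"):
    values = collect(asset, PATHS)
    print(asset.get("name"), values["var_fac"], values["var_value"], values["var_fake"])
```

A dictionary keeps the list of names in one place, which suits the "long list of variables" mentioned in the question; individual values can still be pulled out with `values["var_fac"]` where a plain variable is needed.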

hasattr() works like this:

if hasattr(root, 'elem1'): 
    foo = root.elem1

Source

2012-01-09 09:08:02

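This is the answer I prefer. It is still ugly, but that is Python's fault, not the poster's. I just want to check whether a child exists, not fire up a full XPath processor. – odigity 2013-05-29 20:03:43

Note that internally hasattr works by calling getattr and catching the exception, so under the hood it is just as ugly (at least the last time I checked) :) – 2015-02-24 06:21:55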
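Since hasattr is built on getattr anyway, the three-argument form of getattr can fold the check and the fallback string into one call. The following is a minimal sketch of that idea, assuming the snippet from the question; `child_or_message` is a hypothetical helper name.

```python
from lxml import objectify

XML = "<main><elem1/><elem2>Hi</elem2><elem3/></main>"


def child_or_message(parent, tag):
    """Return the child named `tag`, or a "not found" string if it is absent."""
    # objectify raises AttributeError for a missing child, so the
    # three-argument form of getattr can supply the fallback value.
    return getattr(parent, tag, "Unable to find %s" % tag)


root = objectify.fromstring(XML)
for tag in ("elem1", "elem2", "missing"):
    print(tag, "->", child_or_message(root, tag))
```

One caveat: names such as `text` or `tag` are real attributes of every element, so attribute-style lookup is only reliable for child tags that do not collide with them; `parent.find(tag)` avoids that problem.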
