Error with a conditional in lxml etree

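I am trying to remove an entire <question> element when the text between its <answer> tags contains the number -66.

I get the following error: TypeError: argument of type 'NoneType' is not iterable, raised at the line: if element.tag == 'answer' and '-66' in element.text:

What is the problem? Any help?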

#!/usr/local/bin/python2.7 
# -*- coding: UTF-8 -*- 

from lxml import etree 

planhtmlclear_utf=u""" 
<questionaire> 
<question> 
<questiontext>What's up?</questiontext> 
<answer></answer> 
</question> 
<question> 
<questiontext>Cool?</questiontext> 
<answer>-66</answer> 
</question> 
</questionaire> 

""" 

html = etree.fromstring(planhtmlclear_utf) 
questions = html.xpath('/questionaire/question') 
for question in questions:
    for element in question.getchildren():
        if element.tag == 'answer' and '-66' in element.text:
            html.xpath('/questionaire')[0].remove(question)
print etree.tostring(html)

2011-10-08 Jurudocs

element.text is None on some iterations. The error says that Python cannot search through None for '-66', so check that element.text is not None first, like this:

html = etree.fromstring(planhtmlclear_utf) 
questions = html.xpath('/questionaire/question') 
for question in questions: 
    for element in question.getchildren(): 
        if element.tag == 'answer' and element.text and '-66' in element.text: 
            html.xpath('/questionaire')[0].remove(question) 
print etree.tostring(html)

The line in the XML on which it fails is <answer></answer>, where there is no text between the tags.
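A minimal sketch of why the check is needed. It uses the standard library's xml.etree.ElementTree so that it runs without lxml installed; lxml.etree reports the text of an empty element as None in the same way. The helper name contains_marker is made up for illustration and is not part of either library.

```python
import xml.etree.ElementTree as ET

# An element with no text between its tags reports text as None, not "".
empty = ET.fromstring("<answer></answer>")
filled = ET.fromstring("<answer>-66</answer>")


def contains_marker(element, marker="-66"):
    """Hypothetical helper: True if the element has text and marker occurs in it."""
    return element.text is not None and marker in element.text


try:
    "-66" in empty.text          # same as: "-66" in None
except TypeError as exc:
    error_message = str(exc)     # the membership test itself is what fails

print(empty.text)                # None
print(contains_marker(empty))    # False
print(contains_marker(filled))   # True
```

Because "and" short-circuits, the membership test is never evaluated when the text is None, which is the same idea as the "element.text and ..." guard in the corrected loop.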

Edit (for the second part of your question, about the tags being merged together):

You can use BeautifulSoup like this:

from lxml import etree 
import BeautifulSoup 

planhtmlclear_utf=u""" 
<questionaire> 
<question> 
<questiontext>What's up?</questiontext> 
<answer></answer> 
</question> 
<question> 
<questiontext>Cool?</questiontext> 
<answer>-66</answer> 
</question> 
</questionaire>""" 

html = etree.fromstring(planhtmlclear_utf) 
questions = html.xpath('/questionaire/question') 
for question in questions: 
    for element in question.getchildren(): 
        if element.tag == 'answer' and element.text and '-66' in element.text: 
            html.xpath('/questionaire')[0].remove(question) 

soup = BeautifulSoup.BeautifulStoneSoup(etree.tostring(html)) 
print soup.prettify()

Prints:

<questionaire> 
<question> 
    <questiontext> 
    What's up? 
    </questiontext> 
    <answer> 
    </answer> 
</question> 
</questionaire>

BeautifulSoup is a separate module that needs to be downloaded and installed first.

Or, to do the same thing more compactly:

from lxml import etree 
import BeautifulSoup  

# abbreviating to reduce answer length... 
planhtmlclear_utf=u"<questionaire>.........</questionaire>" 

html = etree.fromstring(planhtmlclear_utf) 
# Remove every question whose answer text is exactly "-66"
for question in html.xpath('/questionaire/question[answer/text()="-66"]'):
    question.getparent().remove(question)
print BeautifulSoup.BeautifulStoneSoup(etree.tostring(html)).prettify()

2011-10-08 13:37:16 chown

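Wow, that really helped! Thanks a lot! – Jurudocs

Anytime, @Jurudocs! Glad to help. – chown

Maybe you can help me one step further :-P Now I get the output: <questionaire> What's up ..... the answer tag is not shown completely... why? – Jurudocs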

As an alternative to checking whether element.text is None, you can refine the XPath:

questions = html.xpath('/questionaire/question[answer/text()="-66"]') 
for question in questions: 
    question.getparent().remove(question)

The brackets [...] mean "such that". So

question             # find all question elements
[                    # such that
    answer           # it has an answer subelement
    /text()          # whose text
    =                # equals
    "-66"            # "-66"
]
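The predicate can be tried on a shortened copy of the document from the question. This is only a sketch and assumes lxml is installed; the variable names are made up for illustration. Note that = is an exact comparison, whereas the original loop used a substring test ('-66' in element.text); the standard XPath 1.0 function contains() is the closer equivalent of that.

```python
from lxml import etree

xml = u"""<questionaire>
<question><questiontext>What's up?</questiontext><answer></answer></question>
<question><questiontext>Cool?</questiontext><answer>-66</answer></question>
</questionaire>"""

root = etree.fromstring(xml)

# Substring variant, closest to the original "'-66' in element.text" test.
substring_matches = root.xpath(
    '/questionaire/question[answer[contains(text(), "-66")]]')

# Exact-match variant. An empty <answer> has no text node, so it never
# matches and no explicit None check is needed.
matches = root.xpath('/questionaire/question[answer/text()="-66"]')
for question in matches:
    question.getparent().remove(question)

remaining = [q.findtext('questiontext')
             for q in root.xpath('/questionaire/question')]
print(remaining)
```

After the loop only the "What's up?" question is left, and its empty answer element is untouched.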

2011-10-08 14:04:28 unutbu

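This solved the problem, and it did not touch the other answer elements... with the example above my answer elements got cut... but I don't know why... anyway, this solution works! – Jurudocs

No, sorry... it is cutting the empty answer tags as well... why does this always happen? – Jurudocs

I'm not sure I understand the problem. Do you mean <answer></answer> is shortened to <answer/>? That's fine; they are equivalent. – unutbu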
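A small sketch to check that the two spellings of an empty element are equivalent, assuming lxml is installed. Both <answer></answer> and <answer/> parse to an element with no text and no children, and lxml writes such an element back out in the short form by default.

```python
from lxml import etree

short_form = etree.fromstring("<question><answer/></question>")
long_form = etree.fromstring("<question><answer></answer></question>")

# Both spellings parse to the same thing: an answer element
# with no text (None) and no children (length 0).
for root in (short_form, long_form):
    answer = root[0]
    print(answer.tag, answer.text, len(answer))

# Serializing the long form gives the short form back.
print(etree.tostring(long_form))
```

So the element is not being cut; only its serialized spelling changes.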
