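I need an lxml statement in Python to extract a meta tag value

I need help with this lxml statement: for the page at http://www.yfrog.com/9d1truj I want to extract the value of a meta tag (the image_src link) with lxml in Python.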

# This doesn't work!

# <link rel="image_src" href="http://img337.yfrog.com/img337/5023/1tru.jpg" />
def extract_imageurl(self, doc):
    try:
        self.url, = doc.xpath('//head//link[@rel="image_src"][1]/@href')
    except ValueError:
        self.url = "Error"

Thanks

The head section contains the http://www.etc../1tru.jpg link.


2011-01-20 user407601

In [32]: doc.xpath('//head/link[@rel="image_src"]/@href')[0] 
Out[32]: 'http://img337.yfrog.com/img337/5023/1tru.jpg'

Notice that xpath returns a list of nodes:

In [25]: doc.xpath('//head/link') 
Out[25]: [<Element link at 9c94c5c>, <Element link at 9c94b6c>]

Once you add [@rel="image_src"], the list contains only one node. You can pick that node out with [0] after calling xpath.

In [29]: doc.xpath('//head/link[@rel="image_src"]')[0] 
Out[29]: <Element link at 9c94c5c>
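To try the same behavior without fetching the page, here is a minimal sketch that parses an inline HTML string. The sample markup is an assumption made up for illustration; only the image_src link tag comes from the question.

```python
import lxml.html as lh

# Hypothetical stand-in for the real page; only the image_src <link>
# is taken from the question, the rest is invented for the example.
SAMPLE_HTML = """
<html>
  <head>
    <link rel="stylesheet" href="/style.css" />
    <link rel="image_src" href="http://img337.yfrog.com/img337/5023/1tru.jpg" />
  </head>
  <body></body>
</html>
"""

doc = lh.fromstring(SAMPLE_HTML)

# xpath always returns a list, even when only one node matches.
all_links = doc.xpath('//head/link')
image_links = doc.xpath('//head/link[@rel="image_src"]')
hrefs = doc.xpath('//head/link[@rel="image_src"]/@href')

print(len(all_links))    # every <link> in the head
print(len(image_links))  # only the image_src link
print(hrefs[0])          # the attribute value as a string
```

Selecting /@href in the expression gives a list of attribute strings rather than elements, so [0] yields the URL directly.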

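import lxml.html as lh
from urllib.request import urlopen  # on Python 2: import urllib2 and use urllib2.urlopen

url = r'http://www.yfrog.com/9d1truj'
# Parse the fetched page into an lxml tree
doc = lh.parse(urlopen(url))
# xpath returns a list; [0] takes the first matching href
link = doc.xpath('//head/link[@rel="image_src"]/@href')[0]
print(link)
# http://img337.yfrog.com/img337/5023/1tru.jpg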


2011-01-20 13:19:11 unutbu

I get this error: File "yfrogparser.py", line 101, in extract_imageurl: self.url, = doc.xpath('//head/link[@rel="image_src"]')[0] IndexError: list index out of range – user407601 2011-01-20 13:44:14
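An IndexError like this means xpath returned an empty list, so the document that was actually parsed had no matching link. Note also that [0] already yields a single item, so the trailing comma after self.url would try to unpack that item. One possible way to guard against the empty case is sketched below; first_image_src is a hypothetical helper name and the two HTML strings are made-up test inputs, not the real page.

```python
import lxml.html as lh


def first_image_src(doc, default=None):
    """Return the href of the first image_src link, or default if none matches."""
    hrefs = doc.xpath('//head/link[@rel="image_src"]/@href')
    # An empty list is falsy, so this avoids indexing into it.
    return hrefs[0] if hrefs else default


# Made-up documents: one with the link, one without.
with_link = lh.fromstring(
    '<html><head><link rel="image_src" href="http://example.com/a.jpg" />'
    '</head><body></body></html>'
)
without_link = lh.fromstring(
    '<html><head><title>no image</title></head><body></body></html>'
)

print(first_image_src(with_link))
print(first_image_src(without_link, default="Error"))
```

If the helper returns the default for the live page, it may be worth checking what HTML was actually downloaded before adjusting the XPath expression.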
