Extracting a value with XPath in Python

I have a simple Python script like this:

#!/usr/bin/python 
import requests 
from lxml import html 
response = requests.get('http://site.ir/') 
out=response.content 
tree = html.fromstring(open(out).read()) 
print [e.text_content() for e in tree.xpath('//div[class="group"]/div[class="groupinfo"]/a/text()')]

I used this XPath to get the text of the a tag (the page structure was shown in a screenshot), but the output is not what I expected.

UPDATE: I also get the following error:

Traceback (most recent call last): 
    File "p.py", line 7, in <module> 
    tree = html.fromstring(open(out).read()) 
IOError: [Errno 36] File name too long: '\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ....

2014-09-25 MLSC

You need to put @ in front of the attribute name to address an attribute in XPath:

//div[@class="group"]/div[@class="groupinfo"]/a/text()
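A minimal sketch of the difference, using a made-up HTML snippet (the markup and link text below are assumptions, not the real page). Without @, class is read as the name of a child element, so the predicate matches nothing:

```python
from lxml import html

# Hypothetical markup standing in for the real page.
SAMPLE = (
    '<html><body>'
    '<div class="group"><div class="groupinfo">'
    '<a href="/x">First link</a>'
    '</div></div>'
    '</body></html>'
)

tree = html.fromstring(SAMPLE)

# Without "@", `class` refers to a child element named <class>, so nothing matches.
without_at = tree.xpath('//div[class="group"]/div[class="groupinfo"]/a/text()')

# With "@", the predicate tests the class attribute.
with_at = tree.xpath('//div[@class="group"]/div[@class="groupinfo"]/a/text()')

print(without_at)
print(with_at)
```

The first query silently returns an empty list rather than raising an error, which is why the mistake is easy to miss.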

2014-09-25 12:02:42 har07

Thanks... but I get an error, please see the update – MLSC 2014-09-25 12:05:23

Even without the change from your answer I get this error... thanks... – MLSC 2014-09-25 12:07:38

I'm not a Python expert, but it seems you are treating the HTML content as a file name. Try passing the HTML directly: 'tree = html.fromstring(out)' – har07 2014-09-25 13:28:36
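Combining this suggestion with the corrected XPath, a sketch might look like the following. To stay runnable offline it parses a made-up byte string in place of response.content; the markup and the helper name extract_links are assumptions for illustration. Note that text() results are already strings, which have no text_content() method, so that call from the question is dropped here.

```python
from lxml import html

XPATH = '//div[@class="group"]/div[@class="groupinfo"]/a/text()'

def extract_links(page_bytes):
    """Return the stripped text of each <a> matched by XPATH (hypothetical helper)."""
    # html.fromstring takes the markup itself (str or bytes), not a file name,
    # so there is no open(...).read() around it.
    tree = html.fromstring(page_bytes)
    return [text.strip() for text in tree.xpath(XPATH)]

# Hypothetical stand-in for `response.content` from requests.get(...).
out = (
    b'<html><body>'
    b'<div class="group"><div class="groupinfo">'
    b'<a href="/a"> Alpha </a><a href="/b">Beta</a>'
    b'</div></div>'
    b'</body></html>'
)

links = extract_links(out)
print(links)
```

In the original script the same function could be called as extract_links(response.content).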
