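I have been given a link to an HTML page. How can I open it and get the content of a specific element using its absolute XPath? Extracting the content of an HTML page element with Python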
from lxml import html
import requests

# Fetch the river list page and parse it
page = requests.get('http://www.professorpaddle.com/rivers/riverlist.asp')
tree = html.fromstring(page.content)
table_data = []
temp_dict = {}
# Select every link with class "pathm"
temp = tree.xpath('//a[@class="pathm"]')
for i in temp:
    # The href is relative, so prepend the base URL
    link = i.attrib.get('href')
    link = "http://www.professorpaddle.com/rivers/" + link
    temp_dict['name'] = i.text
    temp_dict['link'] = link
    print(link)
    # Fetch the linked page and query it with the absolute XPath
    temp_page = requests.get(link)
    temp_tree = html.fromstring(temp_page.content)
    x = temp_tree.xpath('/html/body/element/table/tbody/tr[2]/td/table/tbody/tr/td/table[1]/tbody/tr[2]/td[3]/table/tbody/tr[3]/td[2]/font')
    print(x)
    break  # stop after the first link while testing
Have you tried anything? – Dekel
Yes, but how do I post my code? – FibonacciCoder
Check this: http://stackoverflow.com/editing-help – Dekel
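One common reason an absolute XPath like the one above comes back as an empty list is that it was copied from a browser's developer tools. Browsers add `tbody` elements to tables when they build the DOM, while lxml parses the raw source and does not add them. Whether that is what happens on the page in the question is an assumption; the sketch below only shows the effect on a small made-up HTML string (`SAMPLE` and both paths are hypothetical, not taken from the real site).

```python
from lxml import html

# Hypothetical page source: the markup itself contains no <tbody>.
SAMPLE = """
<html><body>
<table>
  <tr><td>Name</td><td>Class</td></tr>
  <tr><td>Example Creek</td><td><font>III</font></td></tr>
</table>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# Path as a browser's "Copy XPath" would typically report it (with tbody).
browser_path = '/html/body/table/tbody/tr[2]/td[2]/font'
# The same path with the tbody step removed, matching the raw source.
source_path = '/html/body/table/tr[2]/td[2]/font'

with_tbody = tree.xpath(browser_path)
without_tbody = tree.xpath(source_path)

print(with_tbody)                                    # no elements matched
print([el.text_content() for el in without_tbody])   # text of the matched <font>
```

If the same thing applies to the real page, removing the `tbody` steps from the copied path (or using a shorter relative path such as `//table//font`) would be the first thing to try. Printing `x` shows element objects, so call `.text_content()` on each match to get the text itself.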