2011-04-03 66 views

Answer

5

lxml is many times faster than BeautifulSoup, so you may want to use it instead.

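from lxml.html import parse

# Parse the page and take the root element of the resulting tree.
doc = parse('http://python.org').getroot()
for row in doc.cssselect('table > tr'):
    # td:nth-child(3) matches a <td> that is the third child of its row.
    for cell in row.cssselect('td:nth-child(3)'):
        print(cell.text_content())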

Alternatively, instead of explicit loops, use a list comprehension:

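# cssselect() already returns a list of elements, so no extra comprehension is needed.
rows = doc.cssselect('table > tr')
# Call cssselect() on each row (a list has no cssselect method) and flatten the results.
cells = [cell.text_content() for row in rows for cell in row.cssselect('td:nth-child(3)')]
print(cells)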
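
If you want to try the idea without hitting the network, here is a minimal sketch that parses an inline string instead of a URL. The `SAMPLE` markup and the `third_column` helper are made up for illustration and are not part of lxml. It uses XPath rather than `cssselect`, so it does not need the separate `cssselect` package. Note that `td[3]` selects the third `<td>` child of a row, which gives the same result as `td:nth-child(3)` only when the row contains nothing but `<td>` cells.

```python
from lxml import html

# Hypothetical sample markup standing in for a fetched page.
SAMPLE = """
<table>
  <tr><td>a1</td><td>b1</td><td>c1</td></tr>
  <tr><td>a2</td><td>b2</td><td>c2</td></tr>
  <tr><td>a3</td><td>b3</td></tr>
</table>
"""


def third_column(markup):
    """Return the text of the third <td> in every row that has one."""
    root = html.fromstring(markup)
    # .//tr finds rows at any depth, so an explicit <tbody> makes no difference.
    # td[3] picks the third <td> child; rows with fewer cells yield nothing.
    return [cell.text_content().strip() for cell in root.xpath('.//tr/td[3]')]


print(third_column(SAMPLE))
```

The last sample row has only two cells, so it contributes nothing to the result instead of raising an error.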