BeautifulSoup: how do I extract data from specific columns of an HTML table? My code extracts all columns

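I have an HTML table with one row and several columns. I want to extract the data from the column containing the text "Total" and from the column containing the value "93". I only want these two columns, but my code extracts the data from every column.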

For example, my current output is:

Total 
93 
93 
0 
0

My desired output is:

Total 93

My code is:

from bs4 import BeautifulSoup

def extract_total_from_report_htmltestrunner():
    filename = (
        r"C:\test_runners 2 edit project\selenium_regression_test\TestReport\ClearCore_Automated_GUI_Regression_TestReport.html")
    # Open the report and parse it; the file is closed when the block exits
    with open(filename, 'r') as html_report_part:
        soup = BeautifulSoup(html_report_part, "html.parser")
    tr_total_row = soup.find('tr', {'id': 'total_row'})
    tr_total_row.find(text=True, recursive=False)  # result is discarded, so this line has no effect
    print(tr_total_row.text)  # prints the text of every <td> in the row
    return tr_total_row.text

The HTML snippet is:

<table id='result_table'> 
    <tr id='total_row'> 
     <td>Total</td> 
     <td>93</td> 
     <td>93</td> 
     <td>0</td> 
     <td>0</td> 
     <td>&nbsp;</td> 
    </tr> 
</table>

How can I extract "Total" and "93" and print them on a single line?

Thanks, Riaz

Source

2016-10-28 Riaz Ladhani

You can use find_all() and slice the result:

" ".join(td.get_text(strip=True) for td in tr_total_row.find_all("td")[:2])
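For context, here is a minimal sketch of how that one-liner might fit into a complete function. It assumes the HTML snippet from the question is passed in as a string rather than read from the report file, and the function name extract_total is hypothetical, chosen only for illustration.

```python
from bs4 import BeautifulSoup

# HTML snippet copied from the question, inlined so the example runs standalone.
HTML = """
<table id='result_table'>
    <tr id='total_row'>
     <td>Total</td>
     <td>93</td>
     <td>93</td>
     <td>0</td>
     <td>0</td>
     <td>&nbsp;</td>
    </tr>
</table>
"""


def extract_total(html):
    """Return the text of the first two cells of the total row, space-separated.

    extract_total is a hypothetical helper name, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    tr_total_row = soup.find("tr", {"id": "total_row"})
    if tr_total_row is None:
        # No row with id="total_row" in this document
        return ""
    # find_all returns the cells in document order; keep only the first two
    cells = tr_total_row.find_all("td")[:2]
    return " ".join(td.get_text(strip=True) for td in cells)


print(extract_total(HTML))  # prints: Total 93
```

Note that the slice selects cells by position, not by content, so it relies on "Total" and the first "93" always being the first two cells of the row.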

Source

2016-10-28 13:38:45 alecxe

This is great. Thank you for your help.
