我有一个HTML文档这样的:如何使用Nokogiri解析此HTML?
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<title>Page Title</title>
<style type="text/css">
</style>
</head>
<body>
<div class="section">
<table>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
</table>
</div>
<div class="section">
<table>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
</table>
</div>
<div class="section">
<table>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</
td><td>test</td><td>test</td>
</tr>
</table>
</div>
</body>
</html>
我想在第一时间拿到的所有行头两个td
元素和 第三table
元素。如何得到这个结果?
注意,在连续两个td
元素有一定的关系,你不能把所有td
内容的相同方法。例如,如何连接连续两个td
元素的内容 ?