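I have a PHP DOM object (http://php.net/manual/en/class.domdocument.php). How can I use the PHP DOM object to extract some content?
Is it possible to display only the content of the second tr tag inside the third table tag?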
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML($html);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');
/*** get all rows from the table ***/
$rows = $tables->item(0)->getElementsByTagName('tr');
/*** loop over the table rows ***/
foreach ($rows as $row)
{
    /*** get each column by tag name ***/
    $cols = $row->getElementsByTagName('td');
    /*** echo the values ***/
    echo $cols->item(0)->nodeValue.'<br />';
    echo $cols->item(1)->nodeValue.'<br />';
    echo $cols->item(2)->nodeValue.'<br />';
    echo $cols->item(3)->nodeValue.'<br />';
    echo $cols->item(4)->nodeValue.'<br />';
    echo $cols->item(5)->nodeValue.'<br />';
    echo '<hr />';
}
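The loop above assumes every row has exactly six cells; `item(n)` returns null for a missing cell, so reading `nodeValue` on it fails. A minimal sketch of iterating over whatever cells exist, assuming a made-up two-row table in place of the real `$html` (the sample markup and the `$lines` array are illustrative only):

```php
<?php
// Hypothetical sample markup standing in for the real $html.
$html = '<table>'
      . '<tr><td>a1</td><td>a2</td></tr>'
      . '<tr><td>b1</td><td>b2</td><td>b3</td></tr>'
      . '</table>';

$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of suppressing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$lines = [];
$table = $dom->getElementsByTagName('table')->item(0);
if ($table !== null) {
    foreach ($table->getElementsByTagName('tr') as $row) {
        // Iterate over however many cells the row actually has.
        foreach ($row->getElementsByTagName('td') as $col) {
            $lines[] = trim($col->nodeValue);
        }
        $lines[] = '---'; // row separator
    }
}
echo implode("\n", $lines), "\n";
```

This way rows with fewer or more than six cells are handled without touching a null node.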
Edit:
I get a fatal error in the following code:
<?php
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML('content.html');
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);
$selected = $xpath->query('//table/tr/td[first()+1]');
echo $selected[0]->nodeValue;
?>
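Three things in this attempt look problematic: `loadHTML()` expects a string of markup, so passing `'content.html'` parses that literal text rather than the file (`loadHTMLFile()` reads a file); `first()` is not an XPath 1.0 function; and bracket access such as `$selected[0]` is what triggers the "Cannot use object of type DOMNodeList as array" error on older PHP versions. A hedged sketch of the same idea with a positional XPath expression, assuming made-up sample markup with three tables:

```php
<?php
// Hypothetical sample markup; in the question it would come from content.html.
$html = '<table><tr><td>t1</td></tr></table>'
      . '<table><tr><td>t2</td></tr></table>'
      . '<table>'
      . '<tr><td>r1c1</td><td>r1c2</td></tr>'
      . '<tr><td>r2c1</td><td>r2c2</td></tr>'
      . '</table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html); // for a file on disk, loadHTMLFile('content.html') would be used
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// XPath positions are 1-based: third table in the document,
// the second row inside it, then all of that row's cells.
$cells = $xpath->query('((//table)[3]//tr)[2]/td');

$values = [];
foreach ($cells as $cell) {
    $values[] = trim($cell->nodeValue);
}
// item() is the portable way to read a single node from a DOMNodeList.
$first = $cells->length > 0 ? $cells->item(0)->nodeValue : null;
echo implode(', ', $values), "\n";
```

The parentheses matter: `(//table)[3]` selects the third table in document order, whereas `//table[3]` would select tables that are the third table child of their parent.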
EDIT2 - the error was "Cannot use object of type DOMNodeList as array". Updated code:
<?php
$output = file_get_contents('test.php');
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML($output);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');//get all the tables
if ($tables->length > 2) { // check there are more than 2
    $thirdTable = $tables->item(2);
    $cols = $thirdTable->getElementsByTagName('td');
    /*** echo the values ***/
    echo $cols->item(0)->nodeValue.'<br />';
    echo $cols->item(1)->nodeValue.'<br />';
    echo $cols->item(2)->nodeValue.'<br />';
    echo $cols->item(3)->nodeValue.'<br />';
    echo $cols->item(4)->nodeValue.'<br />';
    echo $cols->item(5)->nodeValue.'<br />';
    echo '<hr />';
}
?>
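EDIT3 - This code displays only the content of the third table tag. But it should display only the content of the second tr tag within that third table.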
$html = file_get_contents('content.html');
/*** a new dom object ***/
$dom = new domDocument;
/*** load the html into the object ***/
@$dom->loadHTML($html);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');
/*** get all rows from the table ***/
$rows = $tables->item(2)->getElementsByTagName('tr')->item(1);
/*** loop over the table rows ***/
foreach ($rows as $row)
{
    /*** get each column by tag name ***/
    $cols = $row->getElementsByTagName('td');
    /*** echo the values ***/
    echo $cols->item(0)->nodeValue.'<br />';
    echo $cols->item(1)->nodeValue.'<br />';
    echo $cols->item(2)->nodeValue.'<br />';
    echo $cols->item(3)->nodeValue.'<br />';
    echo $cols->item(4)->nodeValue.'<br />';
    echo $cols->item(5)->nodeValue.'<br />';
    echo '<hr />';
}
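In this last attempt, `->item(1)` returns a single DOMElement (the second row) rather than a list of rows, so the `foreach` over `$rows` does not visit any rows. One possible way to get the cells of the second tr in the third table with plain DOM methods is sketched below; the sample markup and the `$values` array are assumptions for illustration, not the real content.html:

```php
<?php
// Hypothetical sample markup with three tables.
$html = '<table><tr><td>t1</td></tr></table>'
      . '<table><tr><td>t2</td></tr></table>'
      . '<table>'
      . '<tr><td>r1c1</td><td>r1c2</td></tr>'
      . '<tr><td>r2c1</td><td>r2c2</td></tr>'
      . '</table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$values = [];
// item() is 0-based: index 2 is the third table, index 1 is the second row.
$table = $dom->getElementsByTagName('table')->item(2);
$row   = $table !== null ? $table->getElementsByTagName('tr')->item(1) : null;

if ($row !== null) {
    // $row is one DOMElement, so loop over its cells, not over "rows".
    foreach ($row->getElementsByTagName('td') as $col) {
        $values[] = trim($col->nodeValue);
        echo $col->nodeValue, "<br />\n";
    }
    echo "<hr />\n";
}
```

The null checks keep the script from failing when the page has fewer than three tables or the third table has fewer than two rows.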
I have the HTML content in the $html variable. – user1273409 2012-03-16 07:24:56
The first error is caused by the []; use ->item(0) instead of the brackets. – artragis 2012-03-16 18:53:15