Extracting the contents of an HTML page in PHP

-2

Is there a way in PHP to extract the contents of an HTML page, starting at <body> and ending at </body>? It would be great if someone could post some sample code.

2012-01-16 bharathi

See one of the many existing questions on web scraping. – Dunhamzzz 2012-01-16 10:10:33

[How to parse and process HTML with PHP?](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-with-php) – CodeCaster 2012-01-16 10:12:38

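Try PHP Simple HTML DOM Parser:

include 'simple_html_dom.php'; // the library file; adjust the path as needed
$html = file_get_html('http://www.example.com/');
$body = $html->find('body', 0); // index 0 returns the first match instead of an array
echo $body->innertext; // markup between <body> and </body>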

2012-01-16 10:10:54 NAVEED

You should take a look at the DOMDocument reference.

This example reads an HTML file, creates a DOMDocument, and gets the body element:

libxml_use_internal_errors(true); 
$dom = new DOMDocument; 
$dom->loadHTMLFile('http://example.com'); 
libxml_use_internal_errors(false); 

$body = $dom->getElementsByTagName('body')->item(0); 

echo $body->textContent; // print all the text content in the body
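
Since textContent returns only the text, here is a minimal sketch of how the markup inside the body could be recovered with the same DOMDocument approach. The helper name bodyInnerHtml and the sample page are assumptions made for illustration, not part of the original answer; the input is loaded from a string so the snippet runs without network access.

```php
<?php
// Hypothetical helper (not from the original answer): returns the markup
// found inside the <body> element, or null if nothing could be parsed.
function bodyInnerHtml(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects empty input
    }

    $previous = libxml_use_internal_errors(true); // collect parse warnings quietly
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    if (!$loaded) {
        return null;
    }

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return null;
    }

    $inner = '';
    foreach ($body->childNodes as $child) {
        $inner .= $dom->saveHTML($child); // serialize each child back to HTML
    }
    return $inner;
}

// Sample input, assumed for illustration only.
$page = '<html><head><title>Demo</title></head>'
      . '<body class="home"><h1>Hello</h1><p>World</p></body></html>';
echo bodyInnerHtml($page), PHP_EOL;
```

The result is the parser's re-serialized markup, so it may differ slightly from the original bytes (for example, unclosed tags get closed). Pages containing non-ASCII text may also need an explicit charset declaration for loadHTML to decode them correctly.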

You should also check the following resources:

DOM API Documentation
XPATH language specification

2012-01-16 10:12:51 Cyclonecode

You could also try a non-DOM solution based on the strpos family of functions:

$html = file_get_contents($url); 
$html = substr($html,stripos($html,'<body>')+6); 
$html = substr($html,0,strripos($html,'</body>'));

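stripos is the case-insensitive version of strpos, and strripos is the case-insensitive version of strrpos, which finds the last ("rightmost") occurrence. Note that this snippet only works when the opening tag is literally <body> with no attributes, and it does not check whether either tag was found.

Hope this helps!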
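
As a variation on the same idea, the sketch below tolerates attributes on the opening tag (such as <body class="home">) and returns null when a tag is missing. The function name bodyBetweenTags is hypothetical, and this is still plain string matching: it can be fooled by "<body" appearing in a comment or script, or by a ">" inside an attribute value, so a DOM parser remains the safer choice for arbitrary pages.

```php
<?php
// Hypothetical helper: string-based extraction of everything between the
// opening <body ...> tag and the last </body> tag. Returns null if either
// tag cannot be located.
function bodyBetweenTags(string $html): ?string
{
    $open = stripos($html, '<body'); // case-insensitive search for the opening tag
    if ($open === false) {
        return null;
    }

    $start = strpos($html, '>', $open); // end of the opening tag, after any attributes
    if ($start === false) {
        return null;
    }
    $start++; // first character after '>'

    $end = strripos($html, '</body>'); // last closing tag, case-insensitive
    if ($end === false || $end < $start) {
        return null;
    }

    return substr($html, $start, $end - $start);
}

echo bodyBetweenTags('<html><BODY class="a"><p>Hi</p></BODY></html>'), PHP_EOL;
// prints: <p>Hi</p>
```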

2015-10-01 21:29:02
