Getting web page content does not work for some links
I am using this code to get the content of a URL:
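class MetaTagParser
{
    public $metadata = array();
    private $html;
    private $url;

    public function __construct($url)
    {
        $this->url = $url;
        $this->html = $this->file_get_contents_curl();
        $this->set_title();
        $this->set_meta_properties();
    }

    // Fetch the page body with cURL; curl_exec() returns false on failure.
    public function file_get_contents_curl()
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_URL, $this->url);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }

    // Store the text of the first <title> element, if the page has one.
    public function set_title()
    {
        if (!is_string($this->html) || $this->html === '') {
            return; // the fetch failed, so there is nothing to parse
        }
        $doc = new DOMDocument();
        @$doc->loadHTML($this->html);
        $nodes = $doc->getElementsByTagName('title');
        if ($nodes->length > 0) {
            $this->metadata['title'] = $nodes->item(0)->nodeValue;
        }
    }

    // set_meta_properties() and the rest of the class are not shown in the post.
}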
This class works for some web pages, but for some URLs, such as http://www.dnaindia.com/india/report_in-a-first-upa-govt-tweets-the-press_1745346, I get this error when I try to fetch the data: "Warning: get_meta_tags(http://www.dnaindia.com/india/report_in-a-first-upa-govt-tweets-the-press_1745346): failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in C:\xampp\htdocs\prac\index.php on line 52"
It is not working. Why is this happening?
The site doesn't like being scraped.
But when this link is posted on Facebook, Facebook easily extracts the content from the web page.... – Manish
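One possible explanation, offered as a guess rather than a confirmed diagnosis: the warning comes from get_meta_tags(), which makes its own HTTP request through PHP's stream wrapper and sends no User-Agent header unless the user_agent ini setting is set, while Facebook's crawler does identify itself with one. Some sites answer header-less requests with 403. Below is a minimal sketch, assuming that is what happens here. The function names fetch_html() and parse_metadata() and the User-Agent string are hypothetical, not part of the original class. It fetches the page once with cURL, sending a User-Agent, and then reads the title and meta tags from the HTML already in memory instead of calling get_meta_tags() on the URL.

```php
<?php
// Hypothetical helper: fetch a page with cURL while sending a User-Agent header.
// The default User-Agent string is an illustrative placeholder, not a required value.
function fetch_html(string $url, string $userAgent = 'Mozilla/5.0 (compatible; MetaTagParser/1.0)'): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    $data = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    // Treat transport errors and HTTP error statuses (such as 403) as failure.
    if ($data === false || $status >= 400) {
        return null;
    }
    return $data;
}

// Hypothetical helper: read <title> and <meta> tags from HTML that is already
// in memory, so no second HTTP request is made.
function parse_metadata(string $html): array
{
    $metadata = [];
    if ($html === '') {
        return $metadata; // nothing to parse
    }

    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings about imperfect real-world markup

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $metadata['title'] = trim($titles->item(0)->nodeValue);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Standard tags use name="..."; Open Graph tags use property="...".
        $key = $meta->getAttribute('name') ?: $meta->getAttribute('property');
        if ($key !== '' && $meta->hasAttribute('content')) {
            $metadata[strtolower($key)] = $meta->getAttribute('content');
        }
    }
    return $metadata;
}
```

To use it, call fetch_html() with the URL, check that the result is not null, and pass it to parse_metadata(). If the site still returns 403 with a User-Agent set, it is probably blocking on something else (IP address, cookies, or other headers), and this sketch will not help.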