2016-11-19 70 views

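I want to save a specific web page to my server. file_get_contents gave no result, so I ended up with cURL. After some research and testing, I still cannot get past the site's browser check with PHP cURL: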

$header=array('GET /1575051 HTTP/1.1', 
'Host: shakes.pro', 
'Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 
'Accept-Language:en-US,en;q=0.8', 
'Cache-Control:max-age=0', 
'Connection:keep-alive', 
'User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36', 
); 
set_time_limit(0); 
$fp = fopen ('./a.xml', 'w+'); 
$ch = curl_init('http://shakes.pro/'); 
$agent= 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36'; 

curl_setopt($ch, CURLOPT_HTTPHEADER,$header); 
curl_setopt($ch, CURLOPT_TIMEOUT, 50); 
curl_setopt($ch, CURLOPT_FILE, $fp); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true); 
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT,0); 
curl_setopt($ch, CURLOPT_HTTPHEADER,$header); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_VERBOSE, true); 
curl_setopt($ch, CURLOPT_USERAGENT, $agent); 

curl_exec($ch); 
curl_close($ch); 
fclose($fp); 

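This works with every other site I have checked, but not with the one I need :) I tested it on several different servers with the same result. Any ideas?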


Maybe it is because `set_time_limit` is set to 0, but I must say I am not familiar with PHP :) – georoot


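I think the site sends libcurl into a redirect loop. Try opening it in a browser; I did not get a successful response. If you want to save the page through CURLOPT_FILE, remove the CURLOPT_RETURNTRANSFER flag. Also try playing with set_time_limit. – Wizard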


That did not give any result. – SpinLD

Answers


If the target site cannot find its own cookie, it redirects you. Here is the code that works for me:

<?php 
$header=array(
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 
    'Accept-Language: en-US,en;q=0.8', 
    'Cache-Control: max-age=0', 
    'Connection: keep-alive', 
); 
$fp = fopen ('./a.xml', 'w+'); 
$ch = curl_init(); 
$agent= 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36'; 

curl_setopt($ch, CURLOPT_URL, 'http://shakes.pro/1575051'); 
curl_setopt($ch, CURLOPT_AUTOREFERER, false); 
curl_setopt($ch, CURLOPT_MAXREDIRS, 5); 
curl_setopt($ch, CURLOPT_HTTPHEADER,$header); 
curl_setopt($ch, CURLOPT_TIMEOUT, 50); 
curl_setopt($ch, CURLOPT_FILE, $fp); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 

// this line was added so that cookies are kept across the 307 redirect
curl_setopt($ch, CURLOPT_COOKIEJAR, './a.cookiejar'); 

// if this line is enabled, the page is returned in $result instead of being written to the file
//curl_setopt($ch, CURLOPT_RETURNTRANSFER,true); 

curl_setopt($ch, CURLOPT_CONNECTTIMEOUT,0); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false); 
curl_setopt($ch, CURLOPT_VERBOSE, true); 
curl_setopt($ch, CURLOPT_USERAGENT, $agent); 

$result = curl_exec($ch); 
curl_close($ch); 
fclose($fp); 
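
The same idea can be sketched with explicit error checking, so that a failed transfer or a redirect loop is reported instead of silently leaving an empty file. This is only a sketch: the helper names `build_curl_options` and `fetch_to_file` are hypothetical, and adding CURLOPT_COOKIEFILE (to reuse cookies from an earlier run) is an assumption that goes beyond the code above. Setting CURLOPT_COOKIEJAR is what switches on libcurl's cookie engine, so a cookie set by the redirect response is sent back on the follow-up request.

```php
<?php
// Hypothetical helper: builds the options used to stream a page into an open file.
function build_curl_options(string $url, $fileHandle, string $cookieJar): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_FILE           => $fileHandle, // write the body straight into the file
        CURLOPT_FOLLOWLOCATION => true,        // follow the redirect
        CURLOPT_MAXREDIRS      => 5,           // stop if the site keeps redirecting
        CURLOPT_COOKIEJAR      => $cookieJar,  // enables cookies; saved when the handle is closed
        CURLOPT_COOKIEFILE     => $cookieJar,  // assumption: reuse cookies from a previous run
        CURLOPT_TIMEOUT        => 50,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36',
        // CURLOPT_RETURNTRANSFER is deliberately left out, see the comments above.
    ];
}

// Hypothetical helper: downloads $url into $path and returns the final HTTP status code.
function fetch_to_file(string $url, string $path, string $cookieJar): int
{
    $fp = fopen($path, 'w');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }

    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $fp, $cookieJar));

    $ok     = curl_exec($ch);   // true on success, false on failure (no RETURNTRANSFER)
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        throw new RuntimeException("cURL failed: $error");
    }
    return $status;
}

// Usage (performs a real request, so it is left commented out):
// $status = fetch_to_file('http://shakes.pro/1575051', './a.xml', './a.cookiejar');
```

Checking the returned status code (for example, treating anything other than 200 as a failure) is left to the caller.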

Thanks Wizard, you saved my day! Works like a charm! – SpinLD
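
A note on the CURLOPT_RETURNTRANSFER line that is commented out in the answer: when that option is used on its own, without CURLOPT_FILE, curl_exec() returns the page body as a string (or false on failure), and the caller decides where to write it. A minimal sketch of that variant follows; the function names `build_string_options` and `fetch_as_string` are hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: options for getting the body back as a string.
function build_string_options(string $url, string $cookieJar): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,       // curl_exec() returns the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_COOKIEJAR      => $cookieJar, // keep cookies across the redirect
        CURLOPT_TIMEOUT        => 50,
    ];
}

// Hypothetical helper: returns the page body or throws if the transfer fails.
function fetch_as_string(string $url, string $cookieJar): string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_string_options($url, $cookieJar));

    $body  = curl_exec($ch);    // string on success, false on failure
    $error = curl_error($ch);
    curl_close($ch);

    if ($body === false) {
        throw new RuntimeException("cURL failed: $error");
    }
    return $body;
}

// Usage (performs a real request, so it is left commented out):
// file_put_contents('./a.xml', fetch_as_string('http://shakes.pro/1575051', './a.cookiejar'));
```

Mixing the two approaches is what the comments warn about: if CURLOPT_RETURNTRANSFER is set after CURLOPT_FILE, the body is returned from curl_exec() and the file stays empty.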