2012-01-01 71 views
0

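Using simple_html_dom and cURL, but not getting all of the page's content. How can I get it?

The problem is that I get some parts of the content but not the user reviews. In Firebug I can see the reviews, but when I check the page source, the HTML tags that should contain them are missing or different. Here is my code: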

<?php 
    //Headers 
    include('simple_html_dom.php'); 

function getPage($page, $redirect = 0, $cookie_file = '') 
{   
    $ch = curl_init(); 


    $headers = array("Content-type: application/json"); 
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); 

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); 
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0); // value 1 is no longer supported by libcurl; use 0 (off) or 2 (verify)
    curl_setopt($ch, CURLOPT_HEADER, 0); 

    if($redirect) 
    { 
     curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  
    } 

    curl_setopt($ch, CURLOPT_URL, $page); 

    if($cookie_file != '') { 
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file); 
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file); 
    } 

    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6'); 

    $return = curl_exec($ch); //Mozilla/4.0 (compatible;) 

    curl_close($ch); 

    return $return; 

}//EO Fn 

//Source 
$url = 'http://www.vitals.com/doctor/profile/1982660171/reviews/1982660171'; 


//Parsing ... 
$contents = getPage($url, 1, 'cookies.txt'); 

$html = str_get_html($contents); 
//Output 
echo $html->outertext; 
?> 

Can anyone help me? What should I do to get the whole page so that I can grab the reviews?

+2

I bet those reviews are added using Ajax. There is no easy workaround, plus you probably [shouldn't be scraping that site in the first place.](http://www.vitals.com/termsofuse) – 2012-01-01 17:01:33

Answers

0

They are simply stored as JSON in a <script> block toward the top of the page. Parse it out with a RegEx or Simple HTML DOM and run it through json_decode.

var json = {"provider":{"id":"1982660171","display_name":"Stephen R Guy, MD","last_name":"Guy","first_name":"Stephen","middle_name":"Russell","master_name":"Stephen_Guy","degree_types":"MD","familiar_name":"Stephen","years_experience":"27","birth_year":"1956","birth_month":"5","birth_day":"23","gender":"M","is_limited":"false","url_deep":"http:\/\/www.vitals.com\/doctor\/profile\/1982660171\/Stephen_Guy","url_public":"http:\/\/www.vitals.com\/doctors\/Dr_Stephen_Guy.html","status_code":"A","client_ids":"1","quality_indicator_set":[{"type":"quality-indicator\/consumer-feedback","count":"2","suboverall_set":[{"name_short":"Promptness","overall":"3"},{"name_short":"Courteous Staff","overall":"4"},{"name_short":"Bedside Manner","overall":"4"},{"name_short":"Spends Time with Me","overall":"4"},{"name_short":"Follow Up","overall":"4"}],"name":"Consumer Reviews","overall":"4.0","measure_set":[{"feedback_response_id":"1756185","input_source_ids":"{0}","date":"1301544000","value":"4","scale":{"best":"1","worst":"4"},"review":{"type":"review\/consumer","comment":"I will never birth with another dr. Granted that's not saying much as I don't like dr's but I actually find him as valuable as the midwives who I adore. I liked Horlacher but when Kitty left I followed the midwives and then followed again....Dr. Guy is GREAT. I honestly don't know who I'd rather support me at my birth; Margie and Lisa or Dr. Guy. ....I wonder if I can just get all of them.Guy's great. Know what you want. Tell him. Be strong and he'll support you.I give him 10 stars. Oh...my baby's 3 years old now. He's GREAT! 
","date":"1301544000"},"sub_measure":[{"name":"Waiting time during a visit","name_short":"Promptness","value":"3","scale":{"best":"4","worst":"1"}},{"name":"Courtesy and professionalism of office staff ","name_short":"Courteous Staff","value":"4","scale":{"best":"4","worst":"1"}},{"name":"Bedside manner (caring)","name_short":"Bedside Manner","value":"4","scale":{"best":"4","worst":"1"}},{"name":"Spending enough time with me","name_short":"Spends Time with Me","value":"4","scale":{"best":"4","worst":"1"}},{"name":"Following up as needed after my visit","name_short":"Follow Up","value":"4","scale":{"best":"4","worst":"1"}}]},{"feedback_response_id":"420734","input_source_ids":"{76}","link":"http:\/\/local.yahoo.com\/info-15826842-guy-stephen-r-md-university-women-s-health-center-dayton","date":"1142398800","value":"4","scale":{"best":"1","worst":"4"},"review":{"type":"review\/consumer","comment":"Excellent Doctor: I really like going to this office. They are truely down to earth people and talk my \"non-medical\" language. I have been using thier office since 1997 and they have seen me through 2 premature pregnancies!","date":"1142398800"}}],"wait_time":"50"}]}}; 

But again, make sure you have permission to do this...
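As a rough illustration of the RegEx route, here is a minimal sketch. The helper name `extractEmbeddedJson` and the shortened `$sample` markup are made up for this example; the key names follow the blob quoted above. It assumes the assignment has the form `var json = {...};` and that the sequence `};` does not occur inside any review text.

```php
<?php
// Hypothetical helper (not part of the original post): find the object
// literal assigned to `var json` in the page source and decode it.
function extractEmbeddedJson($pageSource)
{
    // Assumption: the assignment looks like `var json = {...};`.
    // The /s flag lets "." match newlines; the lazy match stops at the first "};".
    if (!preg_match('/var\s+json\s*=\s*(\{.*?\});/s', $pageSource, $m)) {
        return null;
    }
    // Raw line breaks inside a string value are not valid JSON, so flatten them.
    $json = preg_replace('/[\r\n]+/', ' ', $m[1]);
    return json_decode($json, true); // associative array, or null on failure
}

// Shortened stand-in for the real page, kept only to make the sketch runnable.
$sample = <<<'HTML'
<script type="text/javascript">
var json = {"provider":{"display_name":"Stephen R Guy, MD","quality_indicator_set":[{"measure_set":[{"value":"4","review":{"comment":"Excellent Doctor","date":"1142398800"}}]}]}};
</script>
HTML;

$data = extractEmbeddedJson($sample);
if ($data !== null) {
    foreach ($data['provider']['quality_indicator_set'] as $set) {
        if (!isset($set['measure_set'])) {
            continue; // not every indicator set is guaranteed to carry reviews
        }
        foreach ($set['measure_set'] as $measure) {
            if (isset($measure['review']['comment'])) {
                echo trim($measure['review']['comment']), "\n";
            }
        }
    }
}
```

With the question's code, the string returned by `getPage($url, 1, 'cookies.txt')` would be passed in place of `$sample`. If the site changes how it embeds the data, the pattern will need adjusting.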

+0

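I'm just learning scraping and am very new to this field. I can handle simple web pages. Could you write me some code showing how to parse this "var json = {..." with Simple HTML DOM or a RegEx? – user1125230 2012-01-01 17:15:58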