2016-05-14 70 views

How to re-index a scraped array

I am scraping data from an HTML page with the following code:

<?php 
$url = 'http://www.atletiek.co.za/atletiek.co.za/uitslae/2016ASASASeniors/160415F012.htm'; 
$handle = curl_init($url); 
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true); 
$html = curl_exec($handle); 
libxml_use_internal_errors(true); // Prevent HTML errors from displaying 
$doc = new DOMDocument(); 
$doc->loadHTML($html); // get the DOM 

$xpath = new DOMXPath($doc); // start a new xPath on our DOM Object 
$preBlock = $xpath->query('//pre'); // find all pre (we only got one here) 

// get the first of all the pre objects 
// get the 'inner value' 
// split them by newlines 
$preBlockString = explode("\n",$preBlock->item(0)->nodeValue); 
$startResultBlock = false; 
$i = 0; 


// traverse all rows 
foreach ($preBlockString as $line){ 
    // once the 'Finals' line has been seen on an earlier row, start collecting the results 
    if($startResultBlock){ 
     $result = explode(' ', $line); 
     // kill all empty entries (originating from all the space characters) 
     foreach ($result as $key => $value) if (empty($value)) unset($result[$key]); 
     $results[] = $result; 
     // my first idea to use list does not work because of all the space characters 
     // list($results[$i]['number'], $results[$i]['name'], $results[$i]['age'], $results[$i]['team'], $results[$i]['finals'], $results[$i]['wind'], $results[$i]['points']) = explode(" ", $line); 
     $i++; 
    } 

    // if the trimmed line is exactly 'Finals' we set a marker for the upcoming rows 
    if(trim($line) == 'Finals'){ 
     $startResultBlock = true; 

    } 

} 

var_dump($results); 
?> 

The output looks like this:

array(43) { [0]=> array(7) { [2]=> string(1) "1" [3]=> string(7) "Stephen" [4]=> string(6) "Mokoka" [16]=> string(2) "31" [17]=> string(3) "Agn" [36]=> string(8) "13:40.81" [40]=> string(1) "8" } [1]=> array(7) { [2]=> string(1) "2" [3]=> string(5) "Elroy" [4]=> string(6) "Gelant" [18]=> string(2) "30" [19]=> string(4) "Acnw" [37]=> string(8) "13:43.43" [41]=> string(1) "7" } [2]=> array(7) { [2]=> string(1) "3" [3]=> string(8) "Sibusiso" [4]=> string(5) "Nzima" [16]=> string(2) "30" [17]=> string(3) "Cga" [36]=> string(8) "13:46.73" [40]=> string(1) "6" }

I would like to renumber all of the keys so that the output looks like this:

array(43) { [0]=> array(7) { [2]=> string(1) "1" [3]=> string(7) "Stephen" [4]=> string(6) "Mokoka" [5]=> string(2) "31" [6]=> string(3) "Agn" [7]=> string(8) "13:40.81" [8]=> string(1) "8" } [1]=> array(7) { [2]=> string(1) "2" [3]=> string(5) "Elroy" [4]=> string(6) "Gelant" [5]=> string(2) "30" [6]=> string(4) "Acnw" [7]=> string(8) "13:43.43" [8]=> string(1) "7" } [2]=> array(7) { [2]=> string(1) "3" [3]=> string(8) "Sibusiso" [4]=> string(5) "Nzima" [5]=> string(2) "30" [6]=> string(3) "Cga" [7]=> string(8) "13:46.73" [8]=> string(1) "6" }

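I have tried various things, but the data keeps coming out like the first example. Does anyone have an idea how I can renumber the keys? Alternatively, it would be fine if the keys started from 0 or 1 and were assigned sequentially.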
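
For context, the gaps in the keys appear to come from the way each line is split: explode(' ', ...) returns one empty string for every extra space, and unset() removes those entries without renumbering the ones that remain. A minimal sketch of that effect, using a hand-written sample line (an assumption, not the real page content):

```php
<?php
// Hypothetical sample row; the real rows come from the scraped <pre> block.
$line = '1 Stephen Mokoka   31 Agn';

// A run of three spaces produces two empty strings between the tokens.
$parts = explode(' ', $line);

// Removing the empty strings leaves the surviving tokens on their old keys.
foreach ($parts as $key => $value) {
    if ($value === '') {
        unset($parts[$key]);
    }
}

// The keys now skip the positions of the removed entries.
print_r(array_keys($parts));
```

The wider the padding between columns, the larger the jumps in the keys, which would explain why rows with different name lengths end up with different key numbers.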

Answer

<?php 
    $url = 'http://www.atletiek.co.za/atletiek.co.za/uitslae/2016ASASASeniors/160415F012.htm'; 
    $handle = curl_init($url); 
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true); 
    $html = curl_exec($handle); 
    libxml_use_internal_errors(true); // Prevent HTML errors from displaying 
    $doc = new DOMDocument(); 
    $doc->loadHTML($html); // get the DOM 

    $xpath = new DOMXPath($doc); // start a new xPath on our DOM Object 
    $preBlock = $xpath->query('//pre'); // find all pre (we only got one here) 

// get the first of all the pre objects 
// get the 'inner value' 
// split them by newlines 
    $preBlockString = explode("\n",$preBlock->item(0)->nodeValue); 
    $startResultBlock = false; 
    $i = 0; 


    $results = []; // collected result rows

    // traverse all rows
    foreach ($preBlockString as $line) {
        // once the 'Finals' line has been seen on an earlier row, start collecting the results
        if ($startResultBlock) {
            $result = explode(' ', $line);
            // drop the empty entries produced by runs of spaces; unset() keeps the
            // original keys, which is why they end up with gaps
            // (note that empty() also drops a literal "0")
            foreach ($result as $key => $value) if (empty($value)) unset($result[$key]);
            $results[] = $result;
            // my first idea to use list does not work because of all the space characters
            // list($results[$i]['number'], $results[$i]['name'], $results[$i]['age'], $results[$i]['team'], $results[$i]['finals'], $results[$i]['wind'], $results[$i]['points']) = explode(" ", $line);
            $i++;
        }

        // if the trimmed line is exactly 'Finals' we set a marker for the upcoming rows
        if (trim($line) == 'Finals') {
            $startResultBlock = true;
        }
    }

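    /* Re-index each row so that its keys run 1, 2, 3, ... without gaps */ 
    $newResult = []; 
    foreach ($results as $result) 
    { 
        $result = array_values($result); // renumber the keys from 0 
        array_unshift($result, '');      // prepend a placeholder so the values move to keys 1..n 
        unset($result[0]);               // remove the placeholder; the keys now start at 1 
        $newResult[] = $result; 
    } 
    $results = $newResult; 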
    var_dump($results); 
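
The re-indexing step on its own can be sketched as follows. This is only an illustration on a single hand-written row (values copied from the var_dump in the question), not the full scraper, and the names $row, $zeroBased and $oneBased are made up for the example. array_values() gives keys starting at 0; combining it with range() via array_combine() is one possible way to start at 1 instead.

```php
<?php
// One row as it comes out of the scraper, with gaps in the keys.
$row = [
    2  => '1',
    3  => 'Stephen',
    4  => 'Mokoka',
    16 => '31',
    17 => 'Agn',
    36 => '13:40.81',
    40 => '8',
];

// array_values() discards the old keys and renumbers from 0.
$zeroBased = array_values($row);

// To start from 1, pair the values with the keys 1..count.
$oneBased = array_combine(range(1, count($row)), $zeroBased);

print_r($zeroBased);
print_r($oneBased);
```

Either form keeps the values in their original order; only the keys change.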

Works like a charm, it renumbers perfectly. – Seef
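
As an alternative, the gaps could be avoided at the source by splitting each line on runs of whitespace instead of on single spaces. The sketch below assumes rows shaped like the ones in the question; the sample line is hand-written and splitRow is a hypothetical helper name, not part of the original code. preg_split() with PREG_SPLIT_NO_EMPTY returns sequential keys starting at 0, and unlike the empty() check it keeps a literal "0" value.

```php
<?php
// Hypothetical helper: split one result row into its whitespace-separated tokens.
function splitRow(string $line): array
{
    // -1 means "no limit"; PREG_SPLIT_NO_EMPTY drops empty pieces only.
    return preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
}

// Hand-written sample line, not the real page content.
$row = splitRow('  1 Stephen Mokoka            31 Agn    13:40.81    8');

print_r($row);
```

With this approach no separate re-indexing pass is needed, although the first and last name still arrive as two separate tokens.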