2016-04-26 135 views
-1

PHP web scraping from HTML table markup

I'm looking to scrape data from a table on a website and display it in a clean table using PHP.

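An example of the website is linked below; you'll notice the flight data table. How can I get PHP to iterate over the data and put it into a table?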

Data Example

Answers

0

Yes, I would suggest using XPath:

<h1>This is scraping flight radar:</h1>
    <?php
    $url = "https://www.flightradar24.com/data/flights/southwest-airlines-wn-swa";
    $html = file_get_contents($url);
    if ($html === false) {
        exit("Could not fetch $url");
    }

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new \DOMDocument();
    if ($doc->loadHTML($html)) {
        // Build the clean output table in a separate document.
        $result = new \DOMDocument();
        $result->formatOutput = true;
        $table = $result->appendChild($result->createElement("table"));
        $thead = $table->appendChild($result->createElement("thead"));
        $tbody = $table->appendChild($result->createElement("tbody"));

        $xpath = new \DOMXPath($doc);

        // Copy the header cells, skipping the first column.
        $newRow = $thead->appendChild($result->createElement("tr"));
        foreach ($xpath->query("//table[@id='tbl-datatable']/thead/tr/th[position()>1]") as $header) {
            // createTextNode() escapes "&"; the value argument of createElement() does not.
            $th = $newRow->appendChild($result->createElement("th"));
            $th->appendChild($result->createTextNode(trim($header->nodeValue)));
        }

        // Copy columns 2 to 6 of every body row.
        foreach ($xpath->query("//table[@id='tbl-datatable']/tbody/tr") as $row) {
            $newRow = $tbody->appendChild($result->createElement("tr"));
            foreach ($xpath->query("./td[position()>1 and position()<7]", $row) as $cell) {
                $td = $newRow->appendChild($result->createElement("td"));
                $td->appendChild($result->createTextNode(trim($cell->nodeValue)));
            }
        }

        echo $result->saveXML($result->documentElement);
    }
    ?>
+0

This works really well. How would I pull out the URLs behind the Flight No and "Live" links? – DARKOCEAN
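
For the link URLs asked about in the comment above, one possible approach is to query the anchors inside each row and read their `href` attribute. This is only a minimal sketch: it assumes the flight number and "Live" cells contain ordinary `<a href="...">` elements, and both the sample markup and the `extractRowLinks` helper are hypothetical rather than taken from the real page.

```php
<?php
// Hypothetical sample markup; the structure of the real page may differ.
$html = <<<HTML
<table id="tbl-datatable">
  <tbody>
    <tr>
      <td></td>
      <td><a href="/data/flights/wn1">WN1</a></td>
      <td>Dallas</td>
      <td><a href="/wn1-live">Live</a></td>
    </tr>
  </tbody>
</table>
HTML;

/**
 * Returns one array per body row, mapping each link's text to its href.
 * (Illustrative helper, not part of any library.)
 */
function extractRowLinks(string $html, string $tableId): array
{
    libxml_use_internal_errors(true);   // suppress warnings for imperfect HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query("//table[@id='$tableId']/tbody/tr") as $row) {
        $links = [];
        // ".//a[@href]" matches every anchor with an href anywhere inside this row.
        foreach ($xpath->query(".//a[@href]", $row) as $a) {
            $links[trim($a->nodeValue)] = $a->getAttribute("href");
        }
        $rows[] = $links;
    }
    return $rows;
}

print_r(extractRowLinks($html, "tbl-datatable"));
```

If the `href` values turn out to be relative, as in this sample, they would need to be prefixed with the site's origin before being used as links.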

0

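As with any scraping effort, keep in mind that you may be violating their terms of service, especially if you republish the content. That said, https://github.com/FriendsOfPHP/Goutte is well suited to a task like this.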

<?php 
require 'vendor/autoload.php'; 
use Goutte\Client; 

$data_url = 'https://www.flightradar24.com/data/flights/southwest-airlines-wn-swa'; 
$client = new Client(); 
$crawler = $client->request('GET', $data_url); 
$crawler->filter('#tbl-datatable')->each(function ($node) { 
    print $node->html()."\n"; 
}); 
+0

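The data is not being republished; it is for personal use. Also, we pay the site a substantial subscription fee that allows the data to be accessed by various methods. – DARKOCEAN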