2016-09-19 124 views
0

Trying to screen-scrape content in PHP and assign it to an array. I need to use the 'SimpleHTMLDom' library. I referenced the following: "Parse html table using file_get_contents to php array". Screen scraping PHP using SimpleHTMLDom.

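Desired results:

  • <text> and the (CSS) background color for each hospital name (if present!)
    • All five <text> cells are needed (NULL where there is no background color)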

Array:

Hospital 1  
--> NULL  
--> #ff0000  
--> 08:50  
--> NULL  
--> NULL 

Hospital 2  
--> #ffff00  
--> 08:50  
--> NULL  
--> NULL  
--> NULL 

PHP:

<?php 
require('simple_html_dom.php'); 
$table = array(); 


$html = file_get_html('https://www.miemssalert.com/chats/Default.aspx?hdRegion=3'); 
foreach($html->find('table#tblHospitals tr') as $row) { 
    $hospital = $row->find('td.Chats',0)->plaintext; 
    $color = $row->getAttribute('td.Chats style',2); 
    $time = $row->find('td.Chats',2)->plaintext; 
    //$text = $row->getAttribute('alt'); 

    $table[$hospital][$color][$time][$text] = true; 

} 

echo '<pre>'; 
print_r($table); 
echo '</pre>'; 
?> 

HTML of the DOM (this is a small sample of the web page):

<div id="Page1" style="display: none; width: 100%;"> 
           <div id="HospitalUpdatePanel"> 

             <table id="tblHospitals" cellspacing="0" cellpadding="1" align="Left" rules="all" border="1" style="border-color:Black;border-width:1px;border-style:Solid;width:100%;border-collapse:collapse;table-layout: fixed;"> 
     <tr> 
      <th title="Hospital" class="Chats" style="background-color:Silver;font-weight:bold;width:25%;">Hospital</th><th title="The emergency department temporarily requests that it receive absolutely no patients in need of urgent medical care. Yellow alert is initiated because the Emergency dept is experiencing a temporary overwhelming overload such that priority II and III patients may not be managed safely. Prior to diverting pediatric patients, medical consultation is advised for pediatric patient transports when emergency departments are on yellow alert." class="Chats" style="font-weight:bold;width:9%;background-color:#ffff00;color:#000000;">Yellow Alert</th><th title="The hospital has no ECG monitored beds available. These ECG monitored beds will include all in-patient critical care areas and telemetry beds." class="Chats" style="font-weight:bold;width:9%;background-color:#ff0000;color:#000000;">Red Alert</th><th title="The emergency department reports that their facility has, in effect, suspended operation and can receive absolutely no patients due to a situation such as a power-outage, fire, gas leak, bomb scare, etc." class="Chats" style="font-weight:bold;width:9%;background-color:#006600;color:#ffffff;">Mini Disaster</th><th title="An ALS/BLS unit is being held in the emergency department of a hospital due to lack of an available bed. (This does not replace Yellow Alert.)" class="Chats" style="font-weight:bold;width:9%;background-color:#ff6600;color:#000000;">ReRoute</th><th title="The hospital's ability to function as a trauma center has been exceeded. (This decision is at the discretion of the facility.)" class="Chats" style="font-weight:bold;width:9%;background-color:#9933cc;color:#ffffff;">Trauma ByPass</th><th title="The hospital's capacity has been exceeded." class="Chats" style="font-weight:bold;width:9%;background-color:#000000;color:#ffffff;">Capacity</th> 
     </tr><tr> 
      <td class="Chats">Anne Arundel Medical Center</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Baltimore Washington Medical Center</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Bon Secours Hospital</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Carroll Hospital Center</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Franklin Square (MedStar)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Good Samaritan Hospital (MedStar)</td><td class="Chats"></td><td class="Chats" style="background-color:#ff0000;color:#000000;">08:50</td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Greater Baltimore Medical Center</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Harbor Hospital (MedStar)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Harford Memorial Hospital (UMUCH)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Howard County General Hospital (JHM)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Johns Hopkins Bayview Medical Center</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Johns Hopkins Hospital</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Johns Hopkins Hospital (Pediatric ED)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Mercy Medical Center</td><td class="Chats" style="background-color:#ffff00;color:#000000;">08:50</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Midtown (UM)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Northwest Hospital</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">R Adams Cowley Shock Trauma Center</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td> 
     </tr><tr> 
      <td class="Chats">Sinai Hospital of Baltimore</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">St. Agnes Hospital</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">St. Joseph’s  (UM)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Union Memorial Hospital  (MedStar)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">University of Maryland Medical Center</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr><tr> 
      <td class="Chats">Upper Chesapeake Medical Center (UMUCH)</td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats"></td><td class="Chats-null"></td><td class="Chats-null"></td> 
     </tr> 
    </table> 
             <span id="lblHospitalsErrorMessage" style="color:Red;font-weight:bold;visibility: hidden;"></span> 

</div> 
          </div> 

REVISED PHP ABOVE: here is the output, which is still not the desired result?

[Good Samaritan Hospital (MedStar)] => Array 
    (
     [0] => Array 
      (
       [11:58] => Array 
        (
         [0] => 1 
        ) 

      ) 

    ) 

Answers

0

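There are several problems with the posted code:

  1. The find() method accepts a CSS selector, not HTML markup. If you want to find <table id="tblHospitals">, use table#tblHospitals, and so on.
  2. foreach($html->find('table#tblHospitals') as $row) iterates over the single table element, not over its rows. You probably want a selector that matches the actual row elements, such as: table#tblHospitals tr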
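The difference between the two selectors can be sketched as follows. This is a minimal example, assuming simple_html_dom.php sits next to the script; the inline $sample string is a trimmed stand-in for the live page, and the variable names are illustrative only.

```php
<?php
require 'simple_html_dom.php'; // assumption: the library file is in the same directory

// Trimmed stand-in for the real page, based on the markup in the question.
$sample = <<<HTML
<table id="tblHospitals">
 <tr><th class="Chats">Hospital</th><th class="Chats">Yellow Alert</th><th class="Chats">Red Alert</th></tr>
 <tr><td class="Chats">Good Samaritan Hospital (MedStar)</td><td class="Chats"></td><td class="Chats" style="background-color:#ff0000;color:#000000;">08:50</td></tr>
 <tr><td class="Chats">Mercy Medical Center</td><td class="Chats" style="background-color:#ffff00;color:#000000;">08:50</td><td class="Chats"></td></tr>
</table>
HTML;

$html = str_get_html($sample);

$tables = $html->find('table#tblHospitals');    // matches the single table element
$rows   = $html->find('table#tblHospitals tr'); // matches every row, header included

$names = array();
foreach ($rows as $row) {
    // The header row only contains <th>, so find('td', 0) returns null there.
    $first = $row->find('td', 0);
    if ($first === null) {
        continue;
    }
    $names[] = trim($first->plaintext);
}

print_r($names);
```

Skipping rows without a td is one way to avoid calling ->plaintext on null for the header row; filtering with a different selector would work as well.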

I'm just getting: Array ( [] => Array ( [] => Array – BarclayVision

Revised the PHP in the question: it still only returns the first hospital name, with blank elements listed below it? – BarclayVision

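Revised the PHP in the question: I now get all the hospital names and times, but I still need the color from the CSS rather than the plaintext? – BarclayVision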

0

It turned out to take only two lines of code:

<?php 
require('simple_html_dom.php'); 

$html = file_get_html('https://www.miemssalert.com/chats/Default.aspx?hdRegion=3'); 
foreach($html->find('table#tblHospitals tr td.Chats') as $e) 
    echo $e->plaintext . $e->getAttribute('style') . '<hr>'; 
?> 

The resulting array looks like:

array(37) { 
    ["Anne Arundel Medical Center"]=> 
    array(1) { 
    [0]=> 
    bool(true) 
    } 
    [""]=> 
    array(1) { 
    [0]=> 
    bool(true) 
    } 
    ["Baltimore Washington Medical Center"]=> 
    array(1) { 
    [0]=> 
    bool(true) 
    } 
    ["04:31"]=> 
    array(1) { 
    ["background-color:#ffff00;color:#000000;"]=> 
    bool(true) 
    } 
    ["Bon Secours Hospital"]=> 
    array(1) { 
    [0]=> 
    bool(true) 
    }
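The style strings in the dump above still carry the whole inline style, not just the color. One possible way to reduce them to the background color and group the cells per hospital is sketched below. Both backgroundColor() and summarizeRow() are hypothetical helpers, not part of SimpleHTMLDom, and the output shape (one entry per alert column, NULL for an empty cell) is an assumption about the desired result. Each cell is passed as a (plaintext, style) pair, with false standing in for a missing style attribute.

```php
<?php
/**
 * Pull the background-color value out of an inline style string.
 * Returns null for a non-string (e.g. false) or a style without background-color.
 */
function backgroundColor($style)
{
    if (!is_string($style) || $style === '') {
        return null;
    }
    // Anchor on start-of-string or ';' so only the background-color property matches.
    if (preg_match('/(?:^|;)\s*background-color\s*:\s*([^;]+)/i', $style, $m)) {
        return trim($m[1]);
    }
    return null;
}

/**
 * Turn one table row into array(hospital name => list of column values).
 * $cells is a list of array(plaintext, style); the first cell holds the name.
 */
function summarizeRow(array $cells)
{
    $name = trim($cells[0][0]);
    $columns = array();
    foreach (array_slice($cells, 1) as $cell) {
        list($text, $style) = $cell;
        $text = trim($text);
        // Empty cells become null; filled cells keep their time and color.
        $columns[] = ($text === '')
            ? null
            : array('time' => $text, 'color' => backgroundColor($style));
    }
    return array($name => $columns);
}

// Sample row modelled on the "Mercy Medical Center" markup in the question.
$row = array(
    array('Mercy Medical Center', false),
    array('08:50', 'background-color:#ffff00;color:#000000;'),
    array('', false),
    array('', false),
);
$result = summarizeRow($row);
print_r($result);
```

With SimpleHTMLDom, the pairs could be built per row from $cell->plaintext and $cell->getAttribute('style') before being passed to summarizeRow().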