2011-01-05 48 views
1

I'm trying to parse some HTML with phpQuery, but it isn't easy for me... How do I use foreach with phpQuery?

I just need to extract the URLs (the href attributes) into an array, but it isn't working.

See the code below, which is just an example:

$doc = phpQuery::newDocumentHTML('<div align = "left" style="background-color:#FFFFFF;border:1px solid #C3D9FF"> </p> 

     <table cellPadding="2" cellSpacing="0" width="100%" height="60" style="border-collapse: collapse; "> 

      <tr> 
      <td align="left" width="531" height="20"><small> 
      <strong> 

      <a href="/1153414/"> 

      <font style="FONT-SIZE: 13px; LINE-HEIGHT: 14px">Industrial</font><a/> </a></small></strong> 
      </td> 

      </tr> 
      <tr> 
      <td align="left" vAlign="top" width="100%" height="1"> 
      <table align="left" border="0" cellPadding="0" cellSpacing="0" width="736"> 
       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px"> 
       Data:</font></strong></td> 

       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;4-1-2011 </font></td> 
       <td align="left" vAlign="top" width="59"> 
       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Zona:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp; Castelo Branco</font></td> 

       </tr> 
       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Categoria:</font></strong></td> 
       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Indústria/Produção </font></td> 
       <td align="left" vAlign="top" width="59"> 

       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Empresa:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Isotransfo, Unipessoal LDA</font></td> 
       </tr> 
       </table> 
      </td> 

      </tr> 
     </table> 

</p> 

     <table cellPadding="2" cellSpacing="0" width="100%" height="60" style="border-collapse: collapse; "> 
      <tr> 
      <td align="left" width="531" height="20"><small> 
      <strong> 

      <a href="/1153399/"> 

      <font style="FONT-SIZE: 13px; LINE-HEIGHT: 14px">Admite-se<a/> </a> </font></small></strong> 
      </td> 
      </tr> 
      <tr> 
      <td align="left" vAlign="top" width="100%" height="1"> 
      <table align="left" border="0" cellPadding="0" cellSpacing="0" width="736"> 

       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px"> 
       Data:</font></strong></td> 
       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;4-1-2011 </font></td> 
       <td align="left" vAlign="top" width="59"> 

       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Zona:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp; Castelo Branco</font></td> 
       </tr> 
       <tr> 

       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Categoria:</font></strong></td> 
       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Indústria/Produção </font></td> 
       <td align="left" vAlign="top" width="59"> 
       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 

       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Empresa:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Isotransfo, Unipessoal LDA</font></td> 
       </tr> 
       </table> 
      </td> 
      </tr> 
     </table> 

</p> 

     <table cellPadding="2" cellSpacing="0" width="100%" height="60" style="border-collapse: collapse; "> 
      <tr> 
      <td align="left" width="531" height="20"><small><font face="Arial"> 
      <strong> 

      <a href="/1153280/"> 

      <font style="FONT-SIZE: 13px; LINE-HEIGHT: 14px">Precisa-se</font><a/> </a> </font></small></strong> 

      </td> 
      </tr> 
      <tr> 
      <td align="left" vAlign="top" width="100%" height="1"> 
      <table align="left" border="0" cellPadding="0" cellSpacing="0" width="736"> 
       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px"> 

       Data:</font></strong></td> 
       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;4-1-2011 </font></td> 
       <td align="left" vAlign="top" width="59"> 
       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Zona:</font></strong></td> 

       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp; (Todas as Zonas)</font></td> 
       </tr> 
       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Categoria:</font></strong></td> 

       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Saúde/Medicina/Enfermagem </font></td> 
       <td align="left" vAlign="top" width="59"> 
       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Empresa:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Emprego Radiologia</font></td> 

       </tr> 
       </table> 
      </td> 
      </tr> 
     </table> 

</p> 

     <table cellPadding="2" cellSpacing="0" width="100%" height="60" style="border-collapse: collapse; "> 
      <tr> 

      <td align="left" width="531" height="20"><small><font face="Arial"> 
      <strong> 

      <a href="/1152665/"> 

      <font style="FONT-SIZE: 13px; LINE-HEIGHT: 14px">Operadores</font><a/> </a> </font></small></strong> 
      </td> 
      </tr> 

      <tr> 
      <td align="left" vAlign="top" width="100%" height="1"> 
      <table align="left" border="0" cellPadding="0" cellSpacing="0" width="736"> 
       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px"> 
       Data:</font></strong></td> 

       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;4-1-2011 </font></td> 
       <td align="left" vAlign="top" width="59"> 
       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Zona:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp; Viseu</font></td> 

       </tr> 
       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Categoria:</font></strong></td> 
       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Lojas/Comércio/Balcão </font></td> 
       <td align="left" vAlign="top" width="59"> 

       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Empresa:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Dia Portugal Supermercados - Soc. Unip., Lda.</font></td> 
       </tr> 
       </table> 
      </td> 

      </tr> 
     </table> 

</p> 

     <table cellPadding="2" cellSpacing="0" width="100%" height="60" style="border-collapse: collapse; "> 
      <tr> 
      <td align="left" width="531" height="20"><small><font face="Arial"> 
      <strong> 

      <a href="/1153524/"> 

      <font style="FONT-SIZE: 13px; LINE-HEIGHT: 14px">Responsável</font><a/> </a> </font></small></strong> 
      </td> 
      </tr> 
      <tr> 
      <td align="left" vAlign="top" width="100%" height="1"> 
      <table align="left" border="0" cellPadding="0" cellSpacing="0" width="736"> 

       <tr> 
       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px"> 
       Data:</font></strong></td> 
       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;4-1-2011 </font></td> 
       <td align="left" vAlign="top" width="59"> 

       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Zona:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp; Santarem</font></td> 
       </tr> 
       <tr> 

       <td align="left" vAlign="top" width="67"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 
       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Categoria:</font></strong></td> 
       <td align="left" vAlign="top" width="150"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;Comercial/Vendas </font></td> 
       <td align="left" vAlign="top" width="59"> 
       <font color="#000000" face="Arial" size="2"> 
       <strong style="FONT-SIZE: 11px; LINE-HEIGHT: 14px; font-weight:400"> 

       <font face="Arial" color="#333333" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">Empresa:</font></strong></td> 
       <td align="left" vAlign="top" width="473"> 
       <font face="Arial" style="FONT-SIZE: 11px; LINE-HEIGHT: 14px">&nbsp;ALDI Supermercados Lda.</font></td> 
       </tr> 
       </table> 
      </td> 
      </tr> 
     </table> 
</div>'); 
//echo $doc['div table a']->attr('href'); 
foreach ($doc['div table a'] as $a) { 
    $hrefs[] .= pq($a)->attr('href'); 
} 
print_r ($hrefs); 

If I echo the line below, I get a single href URL, which is fine:

echo $doc['div table a']->attr('href'); 

If I run the foreach loop, I get an array that contains empty values:

foreach ($doc['div table a'] as $a) { 
    $hrefs[] .= pq($a)->attr('href'); 
} 
print_r ($hrefs); 

The array I get is:

Array ( 
    [0] => /1153414/ 
    [1] => 
    [2] => /1153399/ 
    [3] => 
    [4] => /1153280/ 
    [5] => 
    [6] => /1152665/ 
    [7] => 
    [8] => /1153524/ 
    [9] => 
    ) 

How can I get an array like this instead:

Array ( 
    [0] => /1153414/ 
    [1] => /1153399/ 
    [2] => /1153280/ 
    [3] => /1152665/ 
    [4] => /1153524/ 
    ) 

If you can give me some clues, I would appreciate it.

Sorry for my bad English.

Best regards,

Answers

3

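You have five instances of <a/> in your code. Each one creates an empty a element instead of closing the existing one. Remove them and your code should work fine.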
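If the markup cannot be edited, another option is to select only the anchors that actually carry an href attribute. The sketch below is not phpQuery: it swaps in PHP's built-in DOMDocument and DOMXPath so that it runs without extra libraries, and the $html string is a shortened, made-up stand-in for the markup in the question.

```php
<?php
// Shortened stand-in for the question's markup, including the stray <a/>.
$html = '<div><table><tr><td>'
      . '<a href="/1153414/"><font>Industrial</font><a/> </a>'
      . '<a href="/1153399/"><font>Admite-se</font><a/> </a>'
      . '</td></tr></table></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);  // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$hrefs = [];

// //a[@href] matches only a elements that have an href attribute,
// so the empty anchors created by <a/> are never visited.
foreach ($xpath->query('//a[@href]') as $a) {
    $hrefs[] = $a->getAttribute('href');
}

print_r($hrefs);
```

Filtering in the selector keeps the loop body trivial and leaves the array keys contiguous, so no clean-up step is needed afterwards.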


Edit: a very simple way to remove empty values from an array is to run array_filter without a second argument:

$hrefs = array_filter($hrefs); 
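One detail worth knowing: array_filter keeps the original keys, so the result is indexed 0, 2, 4, 6, 8 rather than 0 to 4. A minimal sketch, assuming the array shown in the question, that also renumbers the keys with array_values (the $filtered variable is only there for illustration):

```php
<?php
// The array from the question, with the empty entries produced by <a/>.
$hrefs = ['/1153414/', '', '/1153399/', '', '/1153280/', '',
          '/1152665/', '', '/1153524/', ''];

// Without a callback, array_filter() drops every falsy value
// ('' here, but also '0', 0, null and false) and preserves the keys.
$filtered = array_filter($hrefs);

// array_values() renumbers the remaining entries starting from 0.
$hrefs = array_values($filtered);

print_r($hrefs);
```

Because a bare array_filter also discards the string '0', pass a callback if such a value could be a legitimate entry.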

Wow! I get it. The HTML isn't mine; it comes from a website I need to scrape. Thanks for your help. – 2011-01-05 10:19:07

1
// Read the attribute once and skip anchors that have no href.
$href = pq($a)->attr('href');
if ($href != '') {
    $hrefs[] = $href;
}

Thanks for the reply. It works! Do you know why this happens when I use "attr('href')" inside the foreach? – 2011-01-05 10:15:56

@What's the matter: your solution solves the problem. "array_filter($hrefs)" also does the job quietly. – 2011-01-05 10:55:04

@Andre, lonesomeday explained it faster :) – 2011-01-06 10:39:58