2017-03-05 253 views

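Cannot get child elements inside a div element with XPath

I have the code below, and I am trying to extract the 'Washington square - USA' part from it. It sits inside div/p/strong, but the div has a class, as you can see.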

Below is the relevant code, or you can see the entire code in pastebin.

<div class="content clearfix"> 
<p><strong>Washington square - USA<br> 
</strong></p> 
<p><strong>2 studios for rent – env. 54m2</strong></p> 
<p><strong>near public transport</strong></p> 
<p>Studios comprise</p> 
<ul> 
<li>A kitchen</li> 
<li>A bedroom</li> 
<li>Tolilet with bathtab</li> 
</ul> 
<p>Visitation date (not yet known)</p> 
<p>To rent from 1st april</p> 
<p>(Current owner : Ben)</p> 
<p><strong>For more details visit: http://example.com<br> 
</strong></p> 
<p><strong>&nbsp;</strong></p> 
    </div> 

So I tried the following XPath expressions to get the content:

//div[contains(@class, "content")]/p/strong 
//div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong 
//div[contains(@class, "content") and contains(@class, 'clearfix')]/p[1]/strong 
//string(div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong) 
//div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong/text() 

But none of them returns the output I want. I am parsing the page with this code:

$document = new \DOMDocument(); 
$document->loadHTMLFile($htmlUrl); 
$xpath = new \DOMXPath($document); 

foreach ($xpath->evaluate('//div[contains(@class, "content")]//p[1]') as $div) { 
    # Also tried with these 
    //div[contains(@class, "content")]/p/strong 
    //div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong 
    //div[contains(@class, "content") and contains(@class, 'clearfix')]/p[1]/strong 
    //string(div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong) 
    //div[contains(@class, "content") and contains(@class, 'clearfix')]/p/strong/text() 
    var_dump($div); 
} 

I don't see any PHP code, or any XPath, in your pastebin reference. – trincot

The XPath works. Please show where in your code it is applied, what the result is, and what you expected. – trincot

Use '$div->textContent' [as in this example](https://eval.in/748195). – trincot

Answers

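To select the element: //div[contains(@class, 'content')]/p[1]/strong

Then take its textContent.

Or, to select the text node: //div[contains(@class, 'content')]/p[1]/strong/text()

Also, your XML is not well-formed, because of the <br>.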
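
The two options in the answer above can be sketched as follows. This is a minimal example under stated assumptions, not the asker's actual script: the markup is a trimmed copy of the snippet from the question, it is loaded from a string with loadHTML() instead of loadHTMLFile(), and the variable names ($html, $fromElement, $fromString) are illustrative. Note that string() has to wrap the whole path, as in string(//div...), because the //string(div...) form tried in the question is not valid XPath 1.0.

```php
<?php
// Sample markup copied from the question, trimmed to two paragraphs.
$html = <<<'HTML'
<div class="content clearfix">
<p><strong>Washington square - USA<br>
</strong></p>
<p><strong>near public transport</strong></p>
</div>
HTML;

$document = new DOMDocument();
// loadHTML() uses the lenient HTML parser, so the unclosed <br> is accepted.
// Collect libxml warnings internally in case the real page has other markup problems.
libxml_use_internal_errors(true);
$document->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($document);

// Option 1: select the <strong> element, then read its textContent.
$nodes = $xpath->query("//div[contains(@class, 'content')]/p[1]/strong");
$fromElement = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;

// Option 2: have XPath return a string directly; evaluate() can return scalars.
$fromString = trim($xpath->evaluate("string(//div[contains(@class, 'content')]/p[1]/strong)"));

var_dump($fromElement, $fromString);
```

The trim() calls remove the newline that follows the <br> inside the <strong> element. If the same expressions return nothing on the real page, it may be worth checking whether the downloaded HTML actually contains the div, since the markup shown in the question matches both expressions.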

None of these work; they all return nothing. – user7342807