2017-09-01

Conditional XPath statement

This is a piece of HTML from which I want to extract information:

<li> 
    <p><strong class="more-details-section-header">Provenance</strong></p> 
    <p>Galerie Max Hetzler, Berlin<br>Acquired from the above by the present owner</p> 
    </li> 

I would like an XPath expression that extracts the content of the second <p> ... </p>, but only if it has a preceding sibling <p> ... Provenance ... </p>.

This is what I have so far:

if "Provenance" in response.xpath('//strong[@class="more-details-section-header"]/text()').extract(): 
      print("provenance = yes") 

But how do I get to Galerie Max Hetzler, Berlin<br>Acquired from the above by the present owner?

I tried:

if "Provenance" in response.xpath('//strong[@class="more-details-section-header"]/text()').extract(): 
      print("provenance = yes ", response.xpath('//strong[@class="more-details-section-header"]/following-sibling::p').extract()) 

but I get [].

Answers

You should use:

//p[preceding-sibling::p[1]/strong='Provenance']/text() 
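As a rough check of the expression above, here is a minimal sketch that runs it against the snippet from the question. It assumes lxml is installed and uses lxml.html instead of a Scrapy response. The surrounding <ul> wrapper and the variable names are illustrative additions, not part of the original page. It also shows why the attempt in the question returns []: the <strong> element has no <p> siblings, because the second <p> is a sibling of the <strong>'s parent <p>.

```python
from lxml import html

# The snippet from the question, wrapped in a <ul> (an assumption, added
# only so the fragment parses as a single element).
SNIPPET = """
<ul>
<li>
    <p><strong class="more-details-section-header">Provenance</strong></p>
    <p>Galerie Max Hetzler, Berlin<br>Acquired from the above by the present owner</p>
</li>
</ul>
"""

tree = html.fromstring(SNIPPET)

# The attempt from the question: following-sibling::p is evaluated relative
# to <strong>, which has no <p> siblings, so nothing is selected.
attempt = tree.xpath(
    '//strong[@class="more-details-section-header"]/following-sibling::p'
)

# The suggested expression: pick every <p> whose nearest preceding <p>
# sibling has a <strong> child with the string value 'Provenance'.
provenance = tree.xpath(
    "//p[preceding-sibling::p[1]/strong='Provenance']/text()"
)

print(attempt)
print(provenance)
```

The second query returns one string per text node, so the text before and after the <br> comes back as two separate list items. In a Scrapy callback the same expression can be passed to response.xpath(...).extract().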

Or, more precisely, "//p[preceding-sibling::p[1]='Provenance']/text()" – SIM