XPath can't select only one HTML tag

I want to scrape some data from a website, but the code below returns every matching element, and I only want the first match. I tried extract_first, but it returned nothing.

# -*- coding: utf-8 -*- 
import scrapy 
from gumtree.items import GumtreeItem 



class FlatSpider(scrapy.Spider): 
    name = "flat" 
    allowed_domains = ["gumtree.com"] 
    start_urls = (
        'https://www.gumtree.com/flats-for-sale',
    )

    def parse(self, response):
        item = GumtreeItem()
        item['title'] = response.xpath('//*[@class="listing-title"][1]/text()').extract()
        return item

How can I select only one element with an XPath selector?

2016-09-19 Mohib

This happens because the first matching text node is actually empty. Keep only the non-empty text nodes and use extract_first(); this worked for me:

$ scrapy shell "https://www.gumtree.com/flats-for-sale" -s USER_AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.113 Safari/537.36" 
In [1]: response.xpath('//*[@class="listing-title"][1]/text()[normalize-space(.)]').extract_first().strip() 
Out[1]: u'REDUCED to sell! Stunning Hove sea view flat.'

2016-09-19 13:24:01 alecxe

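Strictly speaking, it should be response.xpath('(//*[@class="listing-title"])[1]/text()'), but if what you want is to grab the title of every ad (for example, to create one item per ad), you should probably do this: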

for article in response.xpath('//article[@data-q]'): 
    item = GumtreeItem() 
    item['title'] = article.css('.listing-title::text').extract_first() 
    yield item

2016-09-22 18:36:35 Wilfredo
