2013-03-02 46 views

Python/XPath translate query

I am trying out Scrapy. I have the following:

hxs.select('//span[contains(@itemprop, "price")]').extract() 

Output:

[u'<span itemprop="price" class="offer_price">\n<span class="currency">\u20ac</span>\n16<span class="offer_price_fraction">,95</span>\n</span>'] 

How can I retrieve the following output instead:

16.95 

In other words, append the price-fraction span to the whole-number part of the price, and replace the "," with a ".".

Answers


Use this XPath expression:

translate(
      concat(//span[@itemprop = 'price']/text()[normalize-space()], 
        //span[@itemprop = 'price']/span[@class='offer_price_fraction'] 
        ), 
      ',', 
      '.' 
      ) 

XSLT-based verification:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
<xsl:output omit-xml-declaration="yes" indent="yes"/> 

<xsl:template match="/"> 
    <xsl:copy-of select= 
    "translate(
      concat(//span[@itemprop = 'price']/text()[normalize-space()], 
        //span[@itemprop = 'price']/span[@class='offer_price_fraction'] 
       ), 
      ',', 
      '.' 
      )"/> 
</xsl:template> 
</xsl:stylesheet> 

When this transformation is applied to the following XML document:

<span itemprop="price" class="offer_price"> 
    <span class="currency">\u20ac</span> 
16<span class="offer_price_fraction">,95</span> 
</span> 

the XPath expression is evaluated and the result of the evaluation is copied to the output:

16.95 
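
For readers who want to see the same concatenate-then-translate logic outside XPath, here is a minimal Python sketch using only the standard library. The names `SNIPPET` and `price_from_span` are hypothetical and not part of the original answer, the snippet is assumed to be well-formed XML, and the final `strip()` is an addition that removes the leading newline the XPath expression would otherwise keep.

```python
import xml.etree.ElementTree as ET

# The markup from the question, assumed here to be well-formed XML.
SNIPPET = (
    '<span itemprop="price" class="offer_price">\n'
    '<span class="currency">\u20ac</span>\n'
    '16<span class="offer_price_fraction">,95</span>\n'
    '</span>'
)


def price_from_span(markup):
    """Hypothetical helper mirroring translate(concat(...), ',', '.')."""
    root = ET.fromstring(markup)
    # Direct text nodes of the outer span: its own leading text plus the
    # tail text that follows each child element.
    text_nodes = [root.text] + [child.tail for child in root]
    # Like text()[normalize-space()]: the first text node that is not
    # whitespace-only (here "\n16").
    whole = next(t for t in text_nodes if t and t.strip())
    # The fraction part, e.g. ",95".
    fraction = root.find("span[@class='offer_price_fraction']").text
    # str.translate plays the role of XPath translate(); strip() is extra.
    return (whole + fraction).translate(str.maketrans(",", ".")).strip()


print(price_from_span(SNIPPET))
```

Inside a Scrapy spider the XPath expression itself is usually the more direct route; this sketch only illustrates what the expression computes.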

Wow, awesome. Thanks!! – 2013-03-02 21:10:14


@MauriceKroon, you're welcome. – 2013-03-02 21:11:25

1

Here is the XPath selector setup I have:

>>> hxs.extract() 
u'<html><body><span itemprop="price" class="offer_price">\n<span class="currency">\u20ac</span>\n16<span class="offer_price_fraction">,95</span>\n</span></body></html>' 

And here is how you can achieve the desired result:

>>> price = 'descendant::span[@itemprop="price"]' 
>>> whole = 'text()' 
>>> fract = 'descendant::span[@class="offer_price_fraction"]/text()' 
>>> s = hxs.select(price).select('%s | %s' % (whole, fract)).extract() 
>>> s 
[u'\n', u'\n16', u',95', u'\n'] 
>>> ''.join(s).strip().replace(',', '.') 
u'16.95' 
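
If the price is needed as a number rather than a string, the join/strip/replace step above could be wrapped in a small helper. This is only a sketch: `fragments_to_price` is a hypothetical name, and it assumes the fragments contain nothing but whitespace, digits, and a single decimal comma, as in the list shown above.

```python
from decimal import Decimal


def fragments_to_price(fragments):
    """Hypothetical helper: join extracted text fragments into a Decimal."""
    # Concatenate the fragments and drop the surrounding newlines.
    joined = "".join(fragments).strip()
    # Swap the decimal comma for a dot so Decimal can parse the value.
    return Decimal(joined.replace(",", "."))


# The list returned by the selector in the session above.
price = fragments_to_price(["\n", "\n16", ",95", "\n"])
print(price)
```

`Decimal` is used instead of `float` so that monetary values are not subject to binary floating-point rounding.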

Nice, thanks! I forgot about the descendant axis, silly me. Thanks anyway! – 2013-03-02 21:10:43