2013-03-28 205 views
0

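Selenium: how to extract the text between two div tags

I am new to using Selenium for web automation on a website, and I am having a hard time extracting the text between two div tags.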

Here is a snippet of the HTML I am trying to extract the text from.

... 
<tr> 
    <td width="150"> 
    <a href="http://rads.stackoverflow.com/amzn/click/B0099RGRT8"> 
    <img height="90" border="0" width="90" alt="iOttie Easy Flex2 Windshield Dashboard Car Mount H&hellip by iOttie" src="http://ecx.images-amazon.com/images/I/51mf6Ry9J2L._SL500_SS90_.jpg"> 
    </a> 
    <div class="xxsmall" style="margin-top: 5px"> 
     <a href="http://rads.stackoverflow.com/amzn/click/B0099RGRT8">iOttie Easy Flex2 Windshield Dashboard Car Mount Holder Desk Stand for iPhone 5 4S 4 3GS Samsung Gal&amp;hellip</a> 
     by iOttie 
    </div> 
    </td> 
    <td style="padding-left: 10px;"> 
     <div> 
      <div> 
       <span style="margin-left:-5px; vertical-align: -1"> 

       </span> 
       <b> 
       <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_title_1?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Bought for my wife, now I want one. Excellent Product.</a> 
       </b> 
       , 
       <span class="nowrap">November 30, 2012</span> 
      </div> 
      <div style="margin-top: 5px;"> 
       I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving. 
       <br> 
       <br> 
       So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones. 
       <br> 
       <br> 
       The phone is very easy to insert and remove , even while driving. 
       <br> 
       The mount is easy to position but not loose enough that it doesn't hold the position you want. 
       <br> 
       <br> 
       I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point… 
       <a href="http://www.amazon.com/gp/cdp/member-reviews/A2UQ07EFPSX78X/ref=cm_pdp_rev_more?ie=UTF8&sort_by=MostRecentReview#R12ATB4KTIWFV8">Read more</a> 
      </div> 
     </div> 
    </td> 
</tr> 
... 

The other div tags actually contain other text as well.

What I want to extract is:

  I bought this mount for my wife, the feedback from her was is that it was really nice and easy to use even while driving. 

      So I "borrowed" it for a couple days, and now I am going to get one for myself. I am using it with an iPhone, but it would work fine with phones of all sizes, which is nice. If my phone size ever changes the mount will accommodate different sizes phones. 

      The phone is very easy to insert and remove , even while driving. 

      The mount is easy to position but not loose enough that it doesn't hold the position you want. 

      I was very impressed with the windshield mount, it is not just a typical suction cup mount. (Which always at some point… 

Here is my code:

String review; 
try { 
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getText(); 
} catch (NoSuchElementException nsee) { 
    review = "NA"; 
} 

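This actually extracts the text of all the innermost div tags, which is not what I want. I can use ./td/div/div[3] to target a specific div tag, but I cannot get only the text that sits directly between the div tags.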

Any ideas?

Thanks

+0

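Do you have the right HTML snippet, and what exactly do you want to extract? For example, the snippet does not contain the word "absolutely". – Taylor 2013-03-29 16:50:38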

+0

Sorry, I don't know what I pasted... I have updated the question. – Kitizhi 2013-03-30 20:09:19

Answers

1

You can use regular expressions as a workaround:

String review; 
try { 
    review = WebElement.bucketElement.findElement(By.xpath("./td/div")).getText(); 
    // Strings are immutable: replaceAll returns a new string, so assign it back.
    review = review.replaceAll("(<.+>)", ""); 
} catch (NoSuchElementException nsee) { 
    review = "NA"; 
} 

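The regular expression removes all tags together with the text of the inner elements. Only the first-level text remains. That means, if you have:

some strange<div>other text</div> text

the resulting string will be: some strange text

If you need a more complex regular expression, here is a useful link to test it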
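The behaviour described above can be checked without a browser. The following is a minimal sketch using only the JDK; the class name TagStripper and the method stripTags are made up for illustration. Note that the pattern only has something to match when the string still contains markup. getText() returns the rendered text, so in a Selenium test the input would presumably have to come from something like getAttribute("innerHTML") instead, which is an assumption and not something the answer states.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Selenium: applies the answer's pattern to a markup string.
final class TagStripper {
    // Same pattern as in the answer. It is greedy, so on one line it spans from the
    // first '<' to the last '>' and removes the text of inner elements as well.
    // '.' does not match line breaks by default, so each line is handled separately.
    private static final Pattern TAGS = Pattern.compile("(<.+>)");

    static String stripTags(String markup) {
        // replaceAll returns a new string; the input itself is never modified.
        return TAGS.matcher(markup).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: some strange text
        System.out.println(stripTags("some strange<div>other text</div> text"));
    }
}
```

Because the match is greedy, any first-level text that sits between two tags on the same line is removed too, so the pattern may need adjusting for other markup.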

+0

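Thanks for the reply Zygimtantas, but it seems your solution does not work. It still grabs the inner text of the other div tags. Maybe I need to update the data set slightly so that the text in the other div tags is more obvious. – Kitizhi 2013-03-29 16:43:33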

+0

With some tweaks, I managed to get the desired result using the regular expression. Thanks! – Kitizhi 2013-04-02 05:15:12

+0

Care to mention which tweaks you made, @Kitizhi? – 2016-05-23 14:28:31

0

After locating the element with ./td/div/div[3], call getText() on that WebElement and it will return the text inside that div/element.
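getText() on a div also includes the text of its descendants, such as the "Read more" link in the question. One possible way to keep only the div's own text is to subtract the text of its child elements. The sketch below assumes the strings were already read from the page, for example the parent via getText() and each child via findElements(By.xpath("./*")) followed by getText(). The class OwnText and the method ownText are hypothetical names, not Selenium API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Selenium: removes child element text from a parent's text.
final class OwnText {
    static String ownText(String fullText, List<String> childTexts) {
        String result = fullText;
        for (String child : childTexts) {
            // Elements such as <br> have empty text and are skipped.
            if (!child.isEmpty()) {
                // Caveat: replace removes every occurrence, including one that
                // happens to appear in the parent's own text.
                result = result.replace(child, "");
            }
        }
        return result.trim();
    }

    public static void main(String[] args) {
        String full = "The mount is easy to position. (Which always at some point… Read more";
        List<String> children = Arrays.asList("", "Read more");
        // prints: The mount is easy to position. (Which always at some point…
        System.out.println(ownText(full, children));
    }
}
```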
