2012-08-02

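Skipping slow websites when looping through a set of URLs with watir-webdriver

I'm trying to loop through a set of websites in Chrome using watir-webdriver, but I keep running into errors on certain sites. Most recently the problem showed up with http://adage.com. The loop runs normally until it reaches http://adage.com, then it hangs until the following error is shown: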

/Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/protocol.rb:146:in `rescue in rbuf_fill': Timeout::Error (Timeout::Error) 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/protocol.rb:140:in `rbuf_fill' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/protocol.rb:122:in `readuntil' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/protocol.rb:132:in `readline' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:2562:in `read_status_line' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:2551:in `read_new' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:1319:in `block in transport_request' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:1316:in `catch' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:1316:in `transport_request' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:1293:in `request' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:1286:in `block in request' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:745:in `start' 
from /Users/default/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/net/http.rb:1284:in `request' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/selenium-webdriver-2.25.0/lib/selenium/webdriver/remote/http/default.rb:82:in `response_for' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/selenium-webdriver-2.25.0/lib/selenium/webdriver/remote/http/default.rb:38:in `request' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/selenium-webdriver-2.25.0/lib/selenium/webdriver/remote/http/common.rb:40:in `call' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/selenium-webdriver-2.25.0/lib/selenium/webdriver/remote/bridge.rb:598:in `raw_execute' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/selenium-webdriver-2.25.0/lib/selenium/webdriver/remote/bridge.rb:576:in `execute' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/selenium-webdriver-2.25.0/lib/selenium/webdriver/remote/bridge.rb:536:in `getActiveElement' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/selenium-webdriver-2.25.0/lib/selenium/webdriver/common/target_locator.rb:60:in `active_element' 
from /Users/default/.rvm/gems/ruby-1.9.3-p125/gems/watir-webdriver-0.6.1/lib/watir-webdriver/browser.rb:136:in `send_keys' 
from /Users/default/Dropbox/beta_scripts/loop_test.rb:16:in `rescue in <main>' 
from /Users/default/Dropbox/beta_scripts/loop_test.rb:11:in `<main>' 

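I don't know how to avoid this. I've tried setting a timeout, and even sending the ESC key during the rescue to stop Chrome from loading the page, but without any success. Ultimately I'd like to be able to reliably load an array of 500+ websites in a row, but that seems impossible as long as any one of them can hang. Is there any way to stop a slow-loading page and move on to the next element in the array?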

Here is a shortened version of my code that isolates the problem:

#!/usr/bin/env ruby 

require 'watir-webdriver' 

b = Watir::Browser.new :chrome 

sites = ["twitter.com", "cars.com", "autotrader.com", "rolex.com", "newyorker.com", "adage.com", "theatlantic.com", "pcmag.com"] 

sites.each do |uri|
  begin
    Timeout::timeout(10) do
      b.goto uri
    end
  rescue Timeout::Error => e_time
    sleep 5
    b.send_keys :escape
    p "#{uri} is taking forever to load (#{e_time})"
  rescue Exception => e_exception
    p e_exception
  end
end

b.close 

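I don't know Python, but estimating the load time of sites like this may not be a good idea. I say that because some pages fire complex Ajax requests on load that take time to complete even after the main page has finished loading. The features in an unstable build of Firefox might help you. – iMatoria 2012-08-02 09:08:51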

It isn't Python; it's Ruby. – 2012-08-02 09:31:53

Increase your timeout and try again. – Santoshsarma 2012-08-02 14:26:39

Answers

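Well, I can understand your frustration, mate, because I ran into the same thing when working with Selenium WebDriver. Here is what you need to do to make sure your script keeps running all the way to the end of the 500+ websites.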

sites.each do |uri|
  30.times do
    break if (b.goto(uri) rescue false)  # stop retrying once goto succeeds
    sleep 1                              # otherwise wait a second and retry
  end
end

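The code above retries each website up to 30 times, sleeping one second after each failed attempt, and then moves on to the next website.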
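To make the retry behaviour easier to reason about, the same idea can be written as a small helper. This is only a sketch: `goto_with_retries` is a hypothetical name, and `browser` is assumed to be any object that responds to `goto`, such as the `b` from the question. Note that it limits the number of attempts, not the elapsed time. A single `goto` can still block until the underlying HTTP read times out, as the trace in the question shows.

```ruby
# Hypothetical helper: calls browser.goto(uri) up to `attempts` times and
# sleeps `pause` seconds after each failed attempt.
# Returns true on the first success, false if every attempt raised.
def goto_with_retries(browser, uri, attempts = 30, pause = 1)
  attempts.times do
    begin
      browser.goto uri
      return true
    rescue StandardError
      # Timeout::Error is a StandardError, so timeouts are retried as well.
      sleep pause
    end
  end
  false
end
```

With that in place the loop becomes `sites.each { |uri| goto_with_retries(b, uri) }`, and the return value tells you which sites were given up on.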

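Your code seems to raise errors when executed. I cleaned it up (made 'Rescue' lowercase and moved the final 'end' statement before 'rescue'), but it still doesn't seem to work. Although I didn't get the error above when visiting adage.com, Chrome still hangs indefinitely and watir-webdriver doesn't move on to the next item in the array. Any ideas? – ARP 2012-08-02 15:52:39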

I take that back, I'm still getting the error above. It just took much, much longer. – ARP 2012-08-02 16:02:21

Well, this code worked for Selenium WebDriver. I tried to convert it to the Watir equivalent, and it should be something very close. I'll try again. – 2012-08-03 09:20:06
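One detail in the question's stack trace may be worth acting on: the final Timeout::Error is raised from `send_keys` inside the `rescue` (loop_test.rb:16), so it is the follow-up call to the hung browser that ends the script, not the `goto` alone. A minimal sketch of an alternative, assuming it is acceptable to throw away a hung browser and start a fresh one: `visit_all` and `make_browser` are hypothetical names, and whether closing a hung Chrome returns promptly has not been verified here, which is why the close is wrapped in its own timeout.

```ruby
require 'timeout'

# Hypothetical helper: visits each URI with a per-site time limit and returns
# the URIs that were skipped. `make_browser` is any callable that returns an
# object responding to #goto and #close (an assumption, not a Watir API).
def visit_all(sites, limit, make_browser)
  skipped = []
  browser = make_browser.call
  sites.each do |uri|
    begin
      Timeout.timeout(limit) { browser.goto uri }
    rescue Timeout::Error
      skipped << uri
      # Do not send keys to the hung browser; the trace shows that call timing
      # out as well. Discard it and start a fresh one for the next site.
      begin
        Timeout.timeout(limit) { browser.close }
      rescue StandardError
        # Closing a hung browser can fail or time out too; ignore and move on.
      end
      browser = make_browser.call
    rescue StandardError
      # Any other failure (bad URL, driver error): record it and keep going.
      skipped << uri
    end
  end
  browser.close
  skipped
end
```

With the setup from the question this would be called as `visit_all(sites, 10, lambda { Watir::Browser.new :chrome })`. Restarting the browser costs a few seconds per slow site, but it keeps one hung page from stalling the rest of the array.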