2012-04-05 108 views

Mechanize: how to get the current URL

I have this code:

require 'mechanize'

@agent = Mechanize.new
page = @agent.get('http://something.com/?page=1')
# Escape the "?" so the regex matches the literal query string
next_page = page.link_with(:href => /\?page=2/).click

As you can see, this code should follow the link to the next page.

next_page should then have the URL http://something.com/?page=2

How do I get the current URL of next_page?

Answers

next_page.uri.to_s 

http://www.rubydoc.info/gems/mechanize/Mechanize/Page/Link#uri-instance_method
http://ruby-doc.org/stdlib-2.4.1/libdoc/uri/rdoc/URI.html
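
The reason for the to_s call is that uri gives back a URI object rather than a String. As a minimal sketch of what that object exposes, here is a standard-library-only example that does not need Mechanize or a network connection. The helper name url_parts is hypothetical and only for illustration; the URL is the one from the question.

```ruby
require 'uri'

# Hypothetical helper (not part of Mechanize): collect the string form and a
# few components of a URL from a URI object.
def url_parts(url)
  uri = URI.parse(url) # for an http URL this is a URI::HTTP object
  {
    string: uri.to_s,  # the full URL as a String
    host:   uri.host,
    path:   uri.path,
    query:  uri.query
  }
end

parts = url_parts('http://something.com/?page=2')
puts parts[:string]
puts parts[:query]
```

If only the page number is needed, reading the query component may be more convenient than parsing the full string.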

For testing purposes, I ran the following in IRB:

require 'mechanize' 
@agent = Mechanize.new 

page = @agent.get('http://news.ycombinator.com/news') 
=> #<Mechanize::Page 
{url #<URI::HTTP:0x00000001ad3198 URL:http://news.ycombinator.com/news>} 
{meta_refresh} 
{title "Hacker News"} 
{iframes} 
{frames} 
{links 
    #<Mechanize::Page::Link "" "http://ycombinator.com"> 
    #<Mechanize::Page::Link "Hacker News" "news"> 
    #<Mechanize::Page::Link "new" "newest"> 
    #<Mechanize::Page::Link "comments" "newcomments"> 
    #<Mechanize::Page::Link "ask" "ask"> 
    #<Mechanize::Page::Link "jobs" "jobs"> 
    #<Mechanize::Page::Link "submit" "submit"> 
    #<Mechanize::Page::Link "login" "newslogin?whence=%6e%65%77%73"> 
    #<Mechanize::Page::Link "" "vote?for=3803568&dir=up&whence=%6e%65%77%73"> 
    #<Mechanize::Page::Link 
    "Don’t Be Evil: How Google Screwed a Startup" 
    "http://blog.hatchlings.com/post/20171171127/dont-be-evil-how-google-screwed-a-startup"> 
    #<Mechanize::Page::Link "mikeknoop" "user?id=mikeknoop"> 
    #<Mechanize::Page::Link "64 comments" "item?id=3803568"> 
    #<Mechanize::Page::Link "" "vote?for=3802515&dir=up&whence=%6e%65%77%73"> 
    # Omitted for brevity... 

next_page.uri 
=> #<URI::HTTP:0x00000001fa7818 URL:http://news.ycombinator.com/news2> 

next_page.uri.to_s 
=> "http://news.ycombinator.com/news2" 
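
In the listing above, most links carry relative hrefs such as "newest", while uri returns an absolute URL. As a rough sketch of how a relative href resolves against the URL of the page it sits on, the following uses only the standard library's URI.join. The helper resolve_href is a hypothetical name, and Mechanize's own resolution logic may handle more cases than this.

```ruby
require 'uri'

# Hypothetical helper: resolve a link's href against the URL of the page
# that contains it. Absolute hrefs are returned unchanged.
def resolve_href(page_url, href)
  URI.join(page_url, href).to_s
end

puts resolve_href('http://news.ycombinator.com/news', 'newest')
puts resolve_href('http://news.ycombinator.com/news', 'http://ycombinator.com')
```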

That is the URL of the link, but after following the link (and after any redirects happen) the current URL will be: @agent.page.uri.to_s – pguardiario 2012-04-06 03:20:10