2012-04-23 56 views
2

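How do I convert a Mechanize::File object into a Mechanize::Page object?

I have a page that I log in to through a form. After logging in there are several redirects. The first one looks like this: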

#<Mechanize::File:0x1f4ff23 @filename="MYL.html", @code="200", @response={"cache-control"=>"no-cache=\"set-cookie\"", "content-length"=>"114", "set-cookie"=>"JSESSIONID=GdJnPVnhtN91KZfQPc3QzM1NLCyWDsnyvpGg8LL0Knnz3RgqxLFs!1803804592!-2134626567; path=/; secure, COOKIE_TEST=Aslyn; secure", "x-powered-by"=>"Servlet/2.4 JSP/2.0"}, @body="\r\n<html>\r\n <head>\r\n <meta http-equiv=\"refresh\" content=\"0;URL=MYL?Select=OK&StateName=38\">\r\n </head>\r\n</html>", @uri=#<URI::HTTPS:0x16e1eff URL:https://www.manageyourloans.com/MYL?StateName=global_CALMLandingPage&GUID=D1704621-1994-E076-460A-10B2B682B960>> 

So when I call page.class here I get

Mechanize::File 

How do I convert it into a Mechanize::Page?


@pguardiario

To explain better: the object shown in my original message is stored in page.

When I call page.class I get Mechanize::File.

So I executed the code you suggested:

agent = Mechanize.new 
agent.post_connect_hooks << lambda {|http| http[:response].content_type = 'text/html'} 

Then I call agent.get(page.uri.to_s), or even try any URL, such as agent.get("https://www.manageyourloans.com/MYL"), and I get an error: ArgumentError: wrong number of arguments (4 for 1)

I even tried this:

agent = Mechanize.new { |a|
  a.post_connect_hooks << lambda { |_, _, response, _|
    if response.content_type.nil? || response.content_type.empty?
      response.content_type = 'text/html'
    end
  }
}

My question is: once I have done this, how do I convert the previous page into a Mechanize::Page?

Answers

3

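You can convert a Mechanize::File into a Mechanize::Page by taking the body contained in the File object and passing it in as the body of a new Page: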

irb(main):001:0> require 'mechanize' 
true 
irb(main):002:0> file = Mechanize::File.new(URI.parse('http://foo.com'),nil,File.read('foo.html')) 
#<Mechanize::File:0x100ef0190 
    @full_path = false, 
    attr_accessor :body = "<html><body>foo</body></html>\n", 
    attr_accessor :code = nil, 
    attr_accessor :filename = "index.html", 
    attr_accessor :response = {}, 
    attr_accessor :uri = #<URI::HTTP:0x100ef02d0 
     attr_accessor :fragment = nil, 
     attr_accessor :host = "foo.com", 
     attr_accessor :opaque = nil, 
     attr_accessor :password = nil, 
     attr_accessor :path = "", 
     attr_accessor :port = 80, 
     attr_accessor :query = nil, 
     attr_accessor :registry = nil, 
     attr_accessor :scheme = "http", 
     attr_accessor :user = nil, 
     attr_reader :parser = nil 
    > 
> 

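First, I create a fake Mechanize::File object just so the example code has one to work with. You can see the contents of the file it read in :body.

Mechanize creates a Mechanize::File object when it can't determine what the real content type is.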

irb(main):003:0> page = Mechanize::Page.new(URI.parse('http://foo.com'),nil,file.body) 
#<Mechanize::Page:0x100ed5e30 
    @full_path = false, 
    @meta_content_type = nil, 
    attr_accessor :body = "<html><body>foo</body></html>\n", 
    attr_accessor :code = nil, 
    attr_accessor :encoding = nil, 
    attr_accessor :filename = "index.html", 
    attr_accessor :mech = nil, 
    attr_accessor :response = { 
     "content-type" => "text/html" 
    }, 
    attr_accessor :uri = #<URI::HTTP:0x100ed5ed0 
     attr_accessor :fragment = nil, 
     attr_accessor :host = "foo.com", 
     attr_accessor :opaque = nil, 
     attr_accessor :password = nil, 
     attr_accessor :path = "", 
     attr_accessor :port = 80, 
     attr_accessor :query = nil, 
     attr_accessor :registry = nil, 
     attr_accessor :scheme = "http", 
     attr_accessor :user = nil, 
     attr_reader :parser = nil 
    >, 
    attr_reader :bases = nil, 
    attr_reader :encodings = [ 
     [0] nil, 
     [1] "US-ASCII" 
    ], 
    attr_reader :forms = nil, 
    attr_reader :frames = nil, 
    attr_reader :iframes = nil, 
    attr_reader :labels = nil, 
    attr_reader :labels_hash = nil, 
    attr_reader :links = nil, 
    attr_reader :meta_refresh = nil, 
    attr_reader :parser = nil, 
    attr_reader :title = nil 
> 
irb(main):004:0> page.class 
Mechanize::Page < Mechanize::File 

Just pass in the File object's body and let Mechanize convert it into what you know it should be.

+0

I'm working through this answer and I'm using this: 'page = Mechanize::Page.new(URI.parse(page.uri.to_s), nil, page.body)'. I get an error: undefined method `[]' for nil:NilClass – user1198316 2012-04-24 12:11:17

+0

Great answer, works for me! – 2012-09-11 14:59:14

0

I like @the Tin Man's answer, but it might be simpler to force the content type of the response:

agent.post_connect_hooks << lambda {|http| http[:response].content_type = 'text/html'} 
+0

When I do this in irb, I get: undefined method 'post_connect_hooks' for # – user1198316 2012-04-24 11:57:40

+0

In my answer, agent refers to a Mechanize object, which you can instantiate with 'Mechanize.new' – pguardiario 2012-04-24 12:01:19

+0

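agent = Mechanize.new; agent.post_connect_hooks << lambda {|http| http[:response].content_type = 'text/html'}. Reading the documentation, it says this is a list of hooks that are called after a response is retrieved. The agent calls the hooks and returns the response. So I would do this after I have my Mechanize::File, right? Then, if I call agent.get(urlofpagehere), it should return a Mechanize::Page? – user1198316 2012-04-24 13:10:14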