How do I get mechanize to not fail with the forms on this page?

import mechanize 

url = 'http://steamcommunity.com' 

br=mechanize.Browser(factory=mechanize.RobustFactory()) 

br.open(url) 
print br.request 
print br.form 
for each in br.forms(): 
    print each 
    print

The code above results in:

Traceback (most recent call last): 
    File "./mech_test.py", line 12, in <module> 
    for each in br.forms(): 
    File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 426, in forms 
    File "build/bdist.linux-i686/egg/mechanize/_html.py", line 559, in forms 
    File "build/bdist.linux-i686/egg/mechanize/_html.py", line 228, in forms 
mechanize._html.ParseError

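My specific goal is to use the login form, but I can't even get mechanize to recognize that the page has any forms. Even what I think is the most basic way of selecting any form, br.select_form(nr=0), produces the same traceback. In case it makes a difference, the form's enctype is multipart/form-data.

I guess it all boils down to a two-part question: how can I get mechanize to work with this page, or, if that is not possible, what is another way to do this while maintaining cookies?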
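For the second half of that question, one possible route is the standard library's cookie handling instead of mechanize. The following is only a minimal sketch under stated assumptions: it targets Python 3 (the code in this thread is Python 2), the helper names `build_cookie_opener`, `encode_form` and `login` are made up for illustration, and the action URL and field names would have to be read from the real page. It also sends a URL-encoded body, which a form declared as multipart/form-data may or may not accept.

```python
# Python 3 sketch using only the standard library (not mechanize).
import http.cookiejar
import urllib.parse
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_form(fields):
    """URL-encode a dict of form fields into a POST body (bytes)."""
    return urllib.parse.urlencode(fields).encode("utf-8")


def login(opener, action_url, fields):
    """POST the fields (hypothetical names) to action_url.

    Cookies set by the server end up in the jar attached to the opener,
    so later opener.open() calls send them back automatically.
    """
    with opener.open(action_url, data=encode_form(fields)) as response:
        return response.read()
```

Every request made through the same opener shares one cookie jar, which is the part of mechanize's behaviour the question asks to keep.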

Edit: As mentioned below, this redirects to 'https://steamcommunity.com'.

Mechanize can successfully retrieve the HTML, as can be seen with the following code:

url = 'https://steamcommunity.com' 

hh = mechanize.HTTPSHandler() # you might want HTTPSHandler, too 
hh.set_http_debuglevel(1) 
opener = mechanize.build_opener(hh) 
response = opener.open(url) 
contents = response.readlines() 

print contents

2009-05-28 Dustin Wyatt

Is the site you mentioned redirecting to an HTTPS (SSL) server?

If so, try building a new HTTPS handler like this:

mechanize.HTTPSHandler()

2009-05-28 17:38:11

I forgot to add that bit of information, thanks. Unfortunately, adding the line you mentioned did not change anything. – 2009-05-29 16:03:57

Use this secret, I'm sure it will work for you ;)

br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True))

2011-05-07 12:11:30

Doesn't work for me, I still get `mechanize._form.ParseError: nested FORMs`. I tried with `mechanize 0.2.5` as well as the stock Debian `python-mechanize 0.1.11` – koniu 2011-05-07 23:07:16
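If the parser keeps rejecting the page because of nested forms, one fallback is to list the forms with a more tolerant parser and build the request by hand. This is a rough sketch, not mechanize's own approach: it uses Python 3's `html.parser`, the `FormLister` class and `list_forms` function are hypothetical names, and it treats a nested `<form>` as a separate form in its own right, which is a simplifying assumption (browsers handle nested forms differently).

```python
from html.parser import HTMLParser


class FormLister(HTMLParser):
    """Collect forms and their fields without rejecting nested <form> tags."""

    def __init__(self):
        super().__init__()
        self.forms = []   # one dict per <form> start tag, in document order
        self._open = []   # stack of forms that are currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            form = {
                "action": attrs.get("action"),
                "method": (attrs.get("method") or "get").lower(),
                "enctype": attrs.get("enctype"),
                "fields": {},
            }
            self.forms.append(form)
            self._open.append(form)
        elif (tag in ("input", "select", "textarea")
              and self._open and attrs.get("name")):
            # Attribute the field to the innermost open form.
            self._open[-1]["fields"][attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._open:
            self._open.pop()


def list_forms(html):
    """Return a list of form descriptions found in an HTML string."""
    parser = FormLister()
    parser.feed(html)
    parser.close()
    return parser.forms
```

The resulting dictionaries give the action URL, enctype and default field values of each form, which is enough to locate the login form and submit it through whatever cookie-aware client is in use.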
