I am using this code to scan multiple PDF files contained in a folder with the online scanner "https://wepawet.iseclab.org/". This is the Python script that uses the online scanner:
import mechanize
import re
import os

def upload_file(uploaded_file):
    url = "https://wepawet.iseclab.org/"
    br = mechanize.Browser()
    br.set_handle_robots(False) # ignore robots
    br.open(url)
    br.select_form(nr=0)
    f = os.path.join("200",uploaded_file)
    br.form.add_file(open(f) ,'text/plain', f)
    br.form.set_all_readonly(False)
    res = br.submit()
    content = res.read()
    with open("200_clean.html", "a") as f:
        f.write(content)

def main():
    for file in os.listdir("200"):
        upload_file(file)

if __name__ == '__main__':
    main()
But I get the following error after running the code:
Traceback (most recent call last):
  File "test.py", line 56, in <module>
    main()
  File "test.py", line 50, in main
    upload_file(file)
  File "test.py", line 40, in upload_file
    res = br.submit()
  File "/home/suleiman/Desktop/mechanize/_mechanize.py", line 541, in submit
    return self.open(self.click(*args, **kwds))
  File "/home/suleiman/Desktop/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/home/suleiman/Desktop/mechanize/_mechanize.py", line 255, in _mech_open
    raise response
mechanize._response.httperror_seek_wrapper: HTTP Error refresh: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
OK
Can anyone help me solve this problem?
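It looks to me as if the design of the website is what causes it. – HarryCBurn 2014-12-01 22:36:10
How do you think I can solve it? – 2014-12-01 22:38:29
I'm not sure, sorry. I'm assuming it has nothing to do with your code, but I could be wrong. – HarryCBurn 2014-12-01 22:43:24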
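
One thing that may be worth trying: the message "HTTP Error refresh" suggests that mechanize followed an HTTP Refresh header on the response until its redirect-loop detection gave up. The sketch below is not a confirmed fix for this site. It assumes that turning off refresh handling with set_handle_refresh(False) lets submit() return the page as served. The helper names configure_browser, pdf_paths, make_browser and the "application/pdf" content type are my own choices for illustration, and opening each file in binary mode is a change from the script in the question.

```python
import os


def configure_browser(br):
    """Apply the settings to a mechanize.Browser-like object and return it."""
    br.set_handle_robots(False)   # same as the script in the question
    br.set_handle_refresh(False)  # assumption: do not follow Refresh headers
    return br


def make_browser():
    """Create a configured mechanize.Browser (hypothetical helper)."""
    import mechanize  # third-party; imported here so the other helpers work without it
    return configure_browser(mechanize.Browser())


def pdf_paths(folder):
    """Return the sorted paths of the .pdf files found directly in folder."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".pdf")
    )


def upload_file(br, url, path):
    """Submit one file through the first form on the page and return the body."""
    br.open(url)
    br.select_form(nr=0)
    # Keep the file open until submit() has read it; PDFs are binary data.
    with open(path, "rb") as fh:
        br.form.add_file(fh, "application/pdf", os.path.basename(path))
        br.form.set_all_readonly(False)
        return br.submit().read()
```

To use it, create the browser once with br = make_browser() and call upload_file(br, "https://wepawet.iseclab.org/", path) for each path in pdf_paths("200"). If the site only shows the report after the refresh, the returned HTML may still be an intermediate page, so this may need to be combined with requesting the results again later.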