2013-04-10 153 views

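Python SSL web scraping

I am trying to scrape a web page using Python 2.7 and BeautifulSoup, but I cannot get past a protocol error that does not make much sense to me. It only happens with the one site I need to do this for: https://edd.telstra.com/telstra

Here is the code; it is only a basic test: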

#! /usr/bin/python 

from urllib import urlopen 
from BeautifulSoup import BeautifulSoup 
import re 

# Copy all of the content from the provided web page 
webpage = urlopen("https://edd.telstra.com/telstra/").read() 

And I get the following error (running on Ubuntu 12.10):

Traceback (most recent call last):
  File "e.py", line 8, in <module>
    webpage = urlopen("https://edd.telstra.com/telstra/").read()
  File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 436, in open_https
    h.endheaders(data)
  File "/usr/lib/python2.7/httplib.py", line 958, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 818, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 780, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 1165, in connect
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
  File "/usr/lib/python2.7/ssl.py", line 381, in wrap_socket
    ciphers=ciphers)
  File "/usr/lib/python2.7/ssl.py", line 143, in __init__
    self.do_handshake()
  File "/usr/lib/python2.7/ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
IOError: [Errno socket error] [Errno 1] _ssl.c:504: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

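Can anyone tell me whether there is some parameter I need to specify to get this page to download in Python? It appears to be a problem with this particular page, because the code above (and many other variations I have tried) works fine on the other HTTPS/SSL pages I have tried.

Thanks for your help!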

I had a similar problem and came across http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=678353, which seems to indicate an OpenSSL version issue that will be resolved in 1.0.1e-2. – 2014-01-15 22:54:04
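
If the OpenSSL version is the suspect, as the comment above suggests, one quick check is to ask Python which OpenSSL build it is linked against. The following is only a minimal sketch: the helper name `openssl_build_info` is made up for illustration, and the `ssl` module reports the upstream version only, so a distribution package revision such as the `-2` suffix will not appear in its output and would have to be looked up through the system's package manager.

```python
import ssl


def openssl_build_info():
    """Return (version string, version tuple) of the OpenSSL library
    that this Python interpreter's ssl module is linked against.

    Hypothetical helper for illustration; it only wraps two constants
    that the standard ssl module exposes.
    """
    return ssl.OPENSSL_VERSION, ssl.OPENSSL_VERSION_INFO


version_text, version_tuple = openssl_build_info()

# Single-argument print calls behave the same on Python 2.7 and Python 3.
print("Linked OpenSSL: %s" % version_text)
print("Version tuple:  %s" % (version_tuple,))
```

Comparing the reported version with the one named in the bug report can help decide whether upgrading the system's OpenSSL package is worth trying before changing the scraping code itself.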

Answers