Python: get only the headers using urllib2

I have to implement a function that gets only the headers (without doing a GET or POST) using urllib2. Here is my function:

    import urllib2

    def getheadersonly(url, redirections = True):
        if not redirections:
            class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
                def http_error_302(self, req, fp, code, msg, headers):
                    return urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)
                http_error_301 = http_error_303 = http_error_307 = http_error_302
            cookieprocessor = urllib2.HTTPCookieProcessor()
            opener = urllib2.build_opener(MyHTTPRedirectHandler, cookieprocessor)
            urllib2.install_opener(opener)

        # Request subclass that issues HEAD instead of GET
        class HeadRequest(urllib2.Request):
            def get_method(self):
                return "HEAD"

        info = {}
        info['headers'] = dict(urllib2.urlopen(HeadRequest(url)).info())
        info['finalurl'] = urllib2.urlopen(HeadRequest(url)).geturl()
        return info

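I used code from this answer and this answer. However, it still follows redirects even when the flag is False. I tried the code with:

    print getheadersonly("http://ms.com", redirections = False)['finalurl']
    print getheadersonly("http://ms.com")['finalurl']

It gives morganstanley.com in both cases. What is wrong here?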
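
One likely cause, judging only from the code shown: the overridden `http_error_302` simply delegates to `urllib2.HTTPRedirectHandler.http_error_302`, which is the method that follows the redirect, so the custom handler behaves like the default one. A minimal sketch of one way to decline redirects follows. The names `NoRedirectHandler` and `get_headers_only` are hypothetical, the `try`/`except` import is an assumption added so the sketch also runs on Python 3 (where `urllib2` became `urllib.request`), and it builds a local opener instead of calling `install_opener` so the setting does not leak into later calls.

```python
try:
    import urllib2                      # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2    # Python 3 equivalent (assumption for portability)


class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical handler that declines to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means no follow-up request is built; the 3xx
        # response then falls through to the default error handler,
        # which raises HTTPError.
        return None


class HeadRequest(urllib2.Request):
    """Request subclass that issues HEAD instead of GET."""

    def get_method(self):
        return "HEAD"


def get_headers_only(url, redirections=True):
    handlers = [urllib2.HTTPCookieProcessor()]
    if not redirections:
        handlers.append(NoRedirectHandler())
    # Local opener: nothing is installed globally.
    opener = urllib2.build_opener(*handlers)
    try:
        # A single request supplies both the headers and the final URL.
        response = opener.open(HeadRequest(url))
    except urllib2.HTTPError as err:
        # With redirects disabled, a 3xx status arrives as an HTTPError,
        # which still exposes info() and geturl().
        if redirections or not 300 <= err.code < 400:
            raise
        response = err
    return {"headers": dict(response.info()), "finalurl": response.geturl()}
```

With `redirections=False`, `get_headers_only("http://ms.com")['finalurl']` should then report the URL that was requested rather than the redirect target, assuming the server answers the HEAD request with a 3xx status.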
Possible duplicate of [How do I prevent Python's urllib(2) from following a redirect](http://stackoverflow.com/questions/554446/how-do-i-prevent-pythons-urllib2-from-following-a-redirect) – bernie 2012-03-27 16:59:43