Python HTTP HEAD - dealing with redirects properly?

I can use urllib2 to make HEAD requests like so:

import urllib2 
request = urllib2.Request('http://example.com') 
request.get_method = lambda: 'HEAD' 
urllib2.urlopen(request)

The problem is that when this follows a redirect, it appears to use GET instead of HEAD.

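The purpose of this HEAD request is to check the size and content type of the URL I am about to download, so that I can make sure I don't download some huge document. (The URL is given by a random internet user through IRC.)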

How can I make it use HEAD requests when following redirects?
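The size and content-type check described above could be sketched roughly as follows. This is only an illustration: the helper name `is_safe_to_download`, the 5 MiB limit, and the list of allowed types are assumptions, not part of the original question.

```python
def is_safe_to_download(headers, max_bytes=5 * 1024 * 1024,
                        allowed_types=("text/html", "text/plain")):
    """Decide from HEAD response headers whether a download looks acceptable.

    `headers` is any mapping of header names to string values.
    The limit and the allowed types are illustrative defaults.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {k.lower(): v for k, v in headers.items()}

    # Content-Type may carry parameters, e.g. "text/html; charset=utf-8".
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    if content_type not in allowed_types:
        return False

    # Content-Length is optional; treat a missing or malformed value as unsafe.
    try:
        length = int(lowered["content-length"])
    except (KeyError, ValueError):
        return False
    return 0 <= length <= max_bytes
```

A server can omit or misreport these headers, so it is still worth capping the number of bytes read during the actual download.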

Source

2012-04-01 Krenair

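[Requests](http://docs.python-requests.org/en/latest/index.html) at least claims to do this the right way (its documentation describes redirect behavior for idempotent methods and specifically calls out HEAD). – 2012-04-01 19:41:25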

A similar solution: http://stackoverflow.com/questions/9890815/python-get-headers-only-using-urllib2/9892207#9892207 – newtover 2012-04-01 21:00:21

Good question! If you're set on using urllib2, you'll want to look at this answer on building your own redirect handler.

In short (read: blatantly stolen from the previous answer):

import urllib2

#redirect_handler = urllib2.HTTPRedirectHandler()

class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    # Called for each 302 response; the parent implementation builds and
    # opens the follow-up request.
    def http_error_302(self, req, fp, code, msg, headers):
        print "Cookie Manip Right Here"
        return urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)

    # Handle the other redirect status codes the same way.
    http_error_301 = http_error_303 = http_error_307 = http_error_302

cookieprocessor = urllib2.HTTPCookieProcessor()

# Install an opener that uses the custom redirect handler.
opener = urllib2.build_opener(MyHTTPRedirectHandler, cookieprocessor)
urllib2.install_opener(opener)

response = urllib2.urlopen("WHEREEVER")
print response.read()

print cookieprocessor.cookiejar
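The handler above only hooks into redirects; it does not by itself keep the method as HEAD. A minimal sketch of a handler that does, written against Python 3's `urllib.request` (the successor to urllib2) rather than urllib2 itself, might look like this. The names `HeadRedirectHandler` and `head` are hypothetical, and this is not the handler from the linked answer.

```python
import urllib.request


class HeadRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that keeps HEAD as the method on the follow-up request."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler validate the redirect and build the new request.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        # The stock follow-up request may default to GET; force HEAD back.
        if new_req is not None and req.get_method() == "HEAD":
            new_req.method = "HEAD"
        return new_req


def head(url, timeout=10):
    """Send a HEAD request and follow redirects with HEAD (hypothetical helper)."""
    # Passing the subclass replaces the default HTTPRedirectHandler.
    opener = urllib.request.build_opener(HeadRedirectHandler)
    request = urllib.request.Request(url, method="HEAD")
    return opener.open(request, timeout=timeout)
```

Calling `head('http://example.com')` would then return a response whose `headers` can be inspected without the body ever being downloaded.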

Also, as mentioned in the comments, you can use Python Requests.

Source

2012-04-01 19:43:07 MrGomez

I ended up using this redirect handler, based on what you found: http://pastebin.com/m7aN21A7 Thanks! – Krenair 2012-04-01 20:59:27

@Krenair Glad to help! – MrGomez 2012-04-01 21:02:57

You can do this with the requests library:

>>> import requests 
>>> r = requests.head('http://github.com', allow_redirects=True) 
>>> r 
<Response [200]> 
>>> r.history 
[<Response [301]>] 
>>> r.url 
u'https://github.com/'
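To see every hop of a redirected HEAD request at once, the `history`, `status_code` and `url` attributes shown above can be combined. This is a small sketch; the helper name `redirect_chain` is an assumption, and it accepts any object exposing those three attributes, such as a requests response.

```python
def redirect_chain(response):
    """Return (status_code, url) for each redirect hop, ending with the final response."""
    # response.history holds the intermediate redirect responses, oldest first.
    hops = [(r.status_code, r.url) for r in response.history]
    # The response itself is the last hop in the chain.
    hops.append((response.status_code, response.url))
    return hops

# Usage, assuming requests is installed:
#   import requests
#   redirect_chain(requests.head('http://github.com', allow_redirects=True))
```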

Source

2012-04-01 19:43:35 jterrace
