Trying to get the HTTP status code. Could someone try this code in their Python interpreter and see why it doesn't work?

import httplib 
def httpCode(theurl): 
    if theurl.startswith("http://"): theurl = theurl[7:] 
    head = theurl[:theurl.find('/')] 
    tail = theurl[theurl.find('/'):] 
    response_code = 0 
    conn = httplib.HTTPConnection(head) 
    conn.request("HEAD",tail) 
    res = conn.getresponse() 
    response_code = int(res.status) 
    return response_code

Basically, this function takes a URL and returns its HTTP status code (200, 404, etc.). The error I get is:

Exception Value: (-2, 'Name or service not known')

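I have to do it this way. That is, I am usually passing in large video files. I need to fetch only the headers and get the HTTP code. I cannot download the file and then get the HTTP code, because that would take too long.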

Python 2.6.2 (release26-maint, Apr 19 2009, 01:58:18) 
[GCC 4.3.3] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import httplib 
>>> def httpCode(theurl): 
...  if theurl.startswith("http://"): theurl = theurl[7:] 
...  head = theurl[:theurl.find('/')] 
...  tail = theurl[theurl.find('/'):] 
...  response_code = 0 
...  conn = httplib.HTTPConnection(head) 
...  conn.request("HEAD",tail) 
...  res = conn.getresponse() 
...  response_code = int(res.status) 
...  print response_code 
... 
>>> httpCode('http://youtube.com') 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "<stdin>", line 7, in httpCode 
    File "/usr/lib/python2.6/httplib.py", line 874, in request 
    self._send_request(method, url, body, headers) 
    File "/usr/lib/python2.6/httplib.py", line 911, in _send_request 
    self.endheaders() 
    File "/usr/lib/python2.6/httplib.py", line 868, in endheaders 
    self._send_output() 
    File "/usr/lib/python2.6/httplib.py", line 740, in _send_output 
    self.send(msg) 
    File "/usr/lib/python2.6/httplib.py", line 699, in send 
    self.connect() 
    File "/usr/lib/python2.6/httplib.py", line 683, in connect 
    self.timeout) 
    File "/usr/lib/python2.6/socket.py", line 498, in create_connection 
    for res in getaddrinfo(host, port, 0, SOCK_STREAM): 
socket.gaierror: [Errno -2] Name or service not known 
>>>


2010-01-07 TIMEX

You should post the traceback, not just the error message. – 2010-01-07 20:11:51

It works for 'http://youtube.com/' (I get a 301). – 2010-01-07 20:17:56

Your Python code works as expected for me. Maybe your name server is failing? Try checking your /etc/hosts file. – sberry 2010-01-07 20:18:33

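Your code works for me and for one other person who commented. That implies the URL you are using is somehow causing a problem with the parsing. Both head and tail should be examined to see what the code thinks the host is. For example: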

head = theurl[:theurl.find('/')] 
print head 
tail = theurl[theurl.find('/'):] 
print tail

Once you can see what head and tail are, you can determine whether head is something that should really be resolvable. For example, if the URL is:

http://myhost.com:8080/blah/blah

it would fail because of the port number.
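To apply that check to a few URLs at once, here is a small sketch that wraps the slicing from the question in a helper and prints what it produces. The name naive_split is made up for illustration and is not part of the original code; nothing here touches the network.

```python
def naive_split(theurl):
    # Same slicing as in the question, returned instead of used directly.
    if theurl.startswith("http://"):
        theurl = theurl[7:]
    head = theurl[:theurl.find('/')]
    tail = theurl[theurl.find('/'):]
    return head, tail

# Print head and tail so you can see what would be handed to HTTPConnection.
for url in ("http://myhost.com:8080/blah/blah", "http://youtube.com/"):
    head, tail = naive_split(url)
    print("%s -> head=%r tail=%r" % (url, head, tail))
```

If head is not a name you would expect DNS to resolve, the problem is in the splitting rather than in httplib.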


2010-01-07 20:24:05

Can you check my newly edited post? I am passing exactly 'http://youtube.com' – TIMEX 2010-01-07 20:25:49

Or if you leave off the trailing slash, as in http://google.com, because find will return -1. head will be google.co and tail will be m. – sberry 2010-01-07 20:27:15

That is your problem: you need a trailing slash. – sberry 2010-01-07 20:28:07

As Adam Crossland's comment suggests, you should check the values of head and tail. In your case, without the trailing slash, you end up with

head = "youtube.co" 
tail = "m"

string.find returns -1 if the substring is not found, so head gets everything except the last character and tail gets only the last character.
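The -1 case can be reproduced without any network access, and guarding against it is a small change. This is only a sketch: split_host_path is a hypothetical helper name, and defaulting the path to "/" when no slash is present is my assumption about what the function should do.

```python
url = "youtube.com"        # "http://" already stripped, no trailing slash
idx = url.find("/")        # no "/" in the string, so find() gives -1
broken_head = url[:idx]    # url[:-1] drops the last character
broken_tail = url[idx:]    # url[-1:] keeps only the last character

def split_host_path(theurl):
    """Split a URL into (host, path), requesting "/" when no path is given."""
    if theurl.startswith("http://"):
        theurl = theurl[7:]
    idx = theurl.find("/")
    if idx == -1:
        # No slash at all: the whole string is the host.
        return theurl, "/"
    return theurl[:idx], theurl[idx:]
```

With this guard, 'http://youtube.com' and 'http://youtube.com/' both give the same host and path.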


2010-01-07 20:33:04 sberry

Alex, you should strongly consider using urlparse. It will handle URLs much better: http://docs.python.org/library/urlparse.html – 2010-01-07 20:40:20
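For what it's worth, here is a rough sketch of the urlparse approach. The function names split_url and http_code are mine, not from the question. It assumes the URL always includes the "http://" scheme and only handles plain HTTP. The try/except import is there because in Python 3 the modules are urllib.parse and http.client instead of urlparse and httplib.

```python
try:
    from urllib.parse import urlsplit        # Python 3
    from http.client import HTTPConnection
except ImportError:
    from urlparse import urlsplit            # Python 2, as in the question
    from httplib import HTTPConnection

def split_url(theurl):
    """Return (host, port, path) for a URL that includes its scheme."""
    parts = urlsplit(theurl)
    path = parts.path or "/"                 # empty path means the site root
    if parts.query:
        path += "?" + parts.query
    # parts.port is None when the URL does not name a port.
    return parts.hostname, parts.port, path

def http_code(theurl):
    """Send a HEAD request and return the status code without fetching the body."""
    host, port, path = split_url(theurl)
    conn = HTTPConnection(host, port)        # port=None falls back to 80
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling http_code('http://youtube.com') would then request "/" from youtube.com whether or not the trailing slash is present. Because it uses HEAD, a large video file is never downloaded.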
