Trying to scrape an image from an image URL (using Python urllib) but getting HTML instead

I am trying to fetch the image at the following URL:

http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg

I can right-click it and choose "Save As", but when I try to use urlretrieve like this:

import urllib 
img_url = 'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg' 
urllib.urlretrieve(img_url, 'cover.jpg')

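the file I get is HTML rather than a .jpg image, and I don't know why. Can you tell me why my approach doesn't work? Is there any option that mimics the right-click "Save As" method?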

Source

2015-04-03 Winnie

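You can use Requests; if you don't have it installed, run pip install requests.

This happens because the server redirects img_url to another HTML page when you don't supply a Referer header (that HTML page is what you just downloaded).

So the code below first finds the redirect URL and then adds it to the request as the HTTP Referer header.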

import requests 
img_url = 'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg' 

r = requests.get(img_url, allow_redirects=False) # stop redirect 302 , capture redirects url 

headers = {} 
headers['Referer'] = r.headers['location'] # add this url to referer 'http://upic.me/show/55132055' 

r = requests.get(img_url, headers=headers) 
filename = img_url.split('/')[-1]    # find the file name in `img_url` 
with open(filename, 'wb') as fh:    # use 'wb' to write in binary mode 
    fh.write(r.content)
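
Since the question asks about urllib specifically, the same Referer idea can be sketched with the standard library alone. This is a minimal sketch, assuming Python 3 (`urllib.request`) rather than the Python 2 `urllib` used in the question; the helper names `build_image_request` and `download_image` are hypothetical, and the Referer value is passed in by the caller (for example the redirect URL found above) instead of being discovered automatically.

```python
import urllib.request


def build_image_request(img_url, referer):
    """Return a Request for img_url that carries a Referer header."""
    return urllib.request.Request(img_url, headers={'Referer': referer})


def download_image(img_url, referer, filename):
    """Fetch img_url with the Referer header set and save the raw bytes."""
    req = build_image_request(img_url, referer)
    # 'wb' keeps the image bytes unchanged when writing to disk
    with urllib.request.urlopen(req) as resp, open(filename, 'wb') as fh:
        fh.write(resp.read())


# Example call (performs a network request, so it is left commented out):
# download_image(
#     'http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg',
#     'http://upic.me/show/55132055',
#     'cover.jpg',
# )
```

Whether the server still behaves this way is not guaranteed; the sketch only shows how a Referer header is attached with urllib.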

Source

2015-04-03 15:19:49 Aaron

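Thanks, it really works! Thank you very much for the clear explanation. A senior colleague told me that changing 'http' to 'https' also works. Unfortunately, I didn't get a chance to ask him for the details. – Winnie 2015-04-03 15:46:59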

Try something like this:

import urllib2

image = urllib2.urlopen('http://upic.me/i/fj/the_wonderful_mist_once_again_01.jpg').read()
# open in binary mode ('wb') so the downloaded bytes are written unchanged
with open('some_name.jpg', 'wb') as f:
    f.write(image)

Source

2015-04-03 14:14:04 Hackaholic

That doesn't work either. The image is corrupted, but if I save it as .html it is readable, so it really is HTML. – Winnie 2015-04-03 14:29:00
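
A quick way to confirm what was actually downloaded is to look at the first bytes of the payload instead of trusting the file extension. The following is a rough sketch, assuming Python 3; `sniff_payload` is a hypothetical helper, and it only recognizes the JPEG and PNG signatures plus an HTML-looking start, so anything else is reported as unknown.

```python
def sniff_payload(data):
    """Guess whether downloaded bytes are a JPEG, a PNG, or an HTML page."""
    if data.startswith(b'\xff\xd8\xff'):        # JPEG signature
        return 'jpeg'
    if data.startswith(b'\x89PNG\r\n\x1a\n'):   # PNG signature
        return 'png'
    # HTML pages usually start with a doctype or <html>, possibly after whitespace
    head = data[:512].lstrip().lower()
    if head.startswith(b'<!doctype html') or head.startswith(b'<html'):
        return 'html'
    return 'unknown'


# Example: check the bytes before writing them to cover.jpg
# if sniff_payload(image) == 'html':
#     print('The server returned an HTML page, not the image')
```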
