Python ValueError: unknown url type: space

I am using Spyder 3.0 to bulk-download text files with the urllib2 module in Python 2.7, reading the URLs from a text file that contains the list:

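    import sys
    import urllib2

    reload(sys)
    sys.setdefaultencoding('utf-8')
    with open('ocean_not_templated_url.txt', 'r') as text:
        lines = text.readlines()
        for line in lines:
            # Strip the UTF-8 BOM bytes, non-breaking spaces and whitespace, then open the URL
            url = urllib2.urlopen(line.strip('\xef\xbb\xbf \xa0\t\n\r\v'))
            # Name the local file after the URL, replacing characters not allowed in file names
            with open(line.strip('\n\r\t ').replace('/', '!').replace(':', '~'), 'wb') as out:
                for d in url:
                    out.write(d)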

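I found a string of odd characters at the start of the file, which I now strip. However, the script fails when it is almost 90% done and gives the following error for the failing URL:

I thought it was a non-breaking space (written as \xa0 in the code), but it still fails after stripping that too. Any ideas?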

Source

2017-02-17 snl330

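That is an odd URL!

Specify the protocol used to communicate over the network. If the file lives on the WWW, try prefixing the URL with http:// and the domain name.

A file always resides somewhere: in a directory on some server, or on the local system. So there must be a network path to it, for example: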

http://127.0.0.1/folder1/samuel/file1.txt

The same example, with localhost as the (usual) alias for 127.0.0.1:

http://localhost/folder1/samuel/file1.txt

This may solve the problem. Just think about where your file lives and how it should be addressed.
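As a rough sketch of that check, the snippet below tests whether a string names a protocol at all. The helper name has_scheme is hypothetical and not part of the original post, and the import fallback is only there so the same code runs on Python 2.7 and Python 3.

```python
try:
    from urlparse import urlparse  # Python 2.7
except ImportError:
    from urllib.parse import urlparse  # Python 3


def has_scheme(url):
    """Return True if the URL names a protocol such as http or ftp."""
    return bool(urlparse(url).scheme)


# A full URL carries a scheme; a bare path or an empty string does not.
print(has_scheme('http://localhost/folder1/samuel/file1.txt'))
print(has_scheme('folder1/samuel/file1.txt'))
print(has_scheme(''))
```

Running this over every line of the URL list would flag any entry that urllib2 cannot map to a protocol.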

Update:

I experimented with this quite a bit. I think I know why this error occurs! :D

I suspect that the file storing the URLs has a sneaky empty line near the end. I say near the end because you mentioned that the script gets through about 90% of the list before it fails. The urllib2 method get_type cannot process that empty URL, so it raises unknown url type:
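A small way to see this for yourself, assuming nothing beyond the standard library: pass an empty string to urlopen and look at the exception. The helper open_error is a hypothetical name used only for this illustration, and the import fallback lets the snippet run on Python 3 as well as 2.7. The exact wording of the message differs slightly between versions.

```python
try:
    import urllib2  # Python 2.7
except ImportError:
    import urllib.request as urllib2  # Python 3


def open_error(url):
    """Return the ValueError message raised for url, or None if none was raised."""
    try:
        urllib2.urlopen(url).close()
    except ValueError as exc:
        return str(exc)
    return None


# An empty line from the URL file ends up here as an empty string.
print(open_error(''))
```

The error is raised before any network access, because no protocol can be read from an empty string.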

I think that is the problem! Remove the empty lines from the file ocean_not_templated_url.txt and try again!
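If editing the file by hand is not convenient, the empty lines could also be skipped in code. This is only a sketch: clean_urls is a hypothetical helper, the example.com addresses are placeholders, and it assumes the list is UTF-8 encoded.

```python
def clean_urls(lines):
    """Strip stray characters from each line and drop lines that end up empty."""
    # Characters to strip: BOM, non-breaking space and ordinary whitespace.
    junk = u'\ufeff\u00a0 \t\n\r\v'
    cleaned = []
    for line in lines:
        # Lines read in Python 2 (or in binary mode) are bytes; decode them first.
        if isinstance(line, bytes):
            line = line.decode('utf-8')
        url = line.strip(junk)
        if url:  # skip lines that held nothing but junk characters
            cleaned.append(url)
    return cleaned


sample = [u'\ufeffhttp://example.com/a.txt\n', u'\n', u'\u00a0\n',
          u'http://example.com/b.txt\r\n']
print(clean_urls(sample))
```

In the script from the question, the loop could then iterate over clean_urls(text.readlines()) instead of the raw lines.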

Just check and let me know! :P

Source

2017-02-17 19:46:52 varun

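Hmm, where should I specify the protocol? Thanks for the suggestion, by the way. – snl330

@Samuel I have updated the answer. Check it out! – varun

I see. Thanks for the detailed answer! The prefix "http://" is already present in the URLs listed in the text file, for example: http://www1.ncdc.noaa.gov/pub/data/paleo/paleocean/sediment_files/complete/e49-23 -tab.txt. (They are on an ftp server.) Is it possible that some of these URLs are no longer in use, and that this is causing the problem? I know our data managers have moved them around, and some are quite old. Thanks again. – snl330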
