Python: read a file from a web site URL

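My script so far is:

webFile = urllib.urlopen(currURL)

This way I can work with the file. However, when I try to store the file (in webFile), I only get a link to the socket. Another solution I tried was to use read():

webFile = urllib.urlopen(currURL).read()

However, this seems to remove the formatting (\n, \t, etc.).

If I open the file like this:

webFile = urllib.urlopen(currURL)

I can read it line by line:

for line in webFile:
    print line

This should result in:

"this"
"is"
"a"
"textfile"

But I get:

't'
'h'
'i'
...

I want to get the file onto my computer while keeping its formatting.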

Source

2015-10-06 mat

http://stackoverflow.com/questions/22676/how-do-i-download-a-file-over-http-using-python. Just take webFile and write it to a file. – postelrich

Is there a way to do this without writing it to a local file first? – mat
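One way to work with the lines without a local copy might be to iterate over the response object directly. The sketch below uses the Python 3 name urllib.request.urlopen. The helper name iter_lines, the UTF-8 default, and the temporary file:// sample standing in for the remote file are assumptions for illustration only.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen  # Python 3 counterpart of urllib.urlopen


def iter_lines(url, encoding="utf-8"):
    """Yield decoded lines from `url` one at a time (hypothetical helper)."""
    with urlopen(url) as response:
        for raw in response:  # the response object iterates line by line
            # Assumption: the remote text is UTF-8; strip the line ending
            yield raw.decode(encoding).rstrip("\r\n")


# Offline demonstration: a temporary file reached through a file:// URL
# stands in for the remote text file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"this\nis\na\ntextfile\n")
    words = list(iter_lines(sample.as_uri()))

print(words)
```

With a real address, the call would be iter_lines(currURL), where currURL is the variable from the question.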

You should use the readlines() method to read whole lines:

response = urllib.urlopen(currURL) 
lines = response.readlines() 
for line in lines: 
    . 
    .

However, I strongly recommend that you use the requests library.
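For readers on Python 3, where urllib.urlopen has moved to urllib.request.urlopen, the readlines() approach might look roughly like the sketch below. The helper name read_lines, the UTF-8 default, and the temporary file:// sample standing in for the remote file are assumptions for illustration, not part of the answer above.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen  # Python 3 counterpart of urllib.urlopen


def read_lines(url, encoding="utf-8"):
    """Return the lines at `url` as text, newlines included (hypothetical helper)."""
    with urlopen(url) as response:
        # readlines() returns a list of bytes objects, one per line;
        # assumption: the content is UTF-8 encoded text
        return [raw.decode(encoding) for raw in response.readlines()]


# Offline demonstration: a temporary file reached through a file:// URL
# stands in for the remote text file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"this\nis\na\ntextfile\n")
    lines = read_lines(sample.as_uri())

for line in lines:
    print(line, end="")  # each line already ends with "\n"
```

Each element keeps its trailing newline, so the original line structure survives the round trip.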

Source

2015-10-06 14:02:37

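readline did the trick for me, ty – mat

This is because you iterated over a string, which yields one character at a time, so the output is printed character by character.

Why not save the whole file at once?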
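A small illustration of the difference, using a hard-coded string as a stand-in for what read() returns (the sample text is taken from the question, the variable names are made up here):

```python
# Stand-in for the result of read(): one string holding the whole file.
text = "this\nis\na\ntextfile\n"

# Iterating over a str yields single characters, not lines.
chars = list(text)[:3]

# splitlines() breaks the same string into lines; the newline characters
# were in the string all along, so no formatting was lost by read().
lines = text.splitlines()

print(chars)
print(lines)
```

So read() does not strip \n or \t; the loop simply walks over the string one character at a time.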

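import urllib

# Fetch the page and read the whole body into one string (Python 2 API)
webf = urllib.urlopen('http://stackoverflow.com/questions/32971752/python-read-file-from-web-site-url')
txt = webf.read()

# Write the contents to a local file; the with block closes it afterwards
with open('destination.txt', 'w') as f:
    f.write(txt)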

If you really want to loop over the file line by line, use txt = webf.readlines() and iterate over that.
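On Python 3, saving the whole file at once could be sketched as below. This variant streams the response to disk in binary mode, so tabs and newlines are written exactly as received. The helper name download and the temporary file:// sample standing in for the remote file are assumptions for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from urllib.request import urlopen  # Python 3 counterpart of urllib.urlopen


def download(url, destination):
    """Copy the bytes at `url` to `destination` unchanged (hypothetical helper)."""
    # Binary mode ("wb") avoids any decoding or newline translation
    with urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)


# Offline demonstration: a temporary file reached through a file:// URL
# stands in for the remote text file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.txt"
    target = Path(tmp) / "destination.txt"
    source.write_bytes(b"this\tis\na\ttextfile\n")
    download(source.as_uri(), target)
    copied = target.read_bytes()

print(copied)
```

With a real address, the call would be download(currURL, 'destination.txt').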

Source

2015-10-06 14:00:42 Noxeus

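If you are simply trying to save a remote file to your local server as part of a Python script, you can use the PycURL library to download and save it without parsing it. More information here - http://pycurl.sourceforge.net

Alternatively, if you want to read and then write the output, I think you just have the methods out of sequence. Try the following: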

import urllib

# Assign the open file to a variable
webFile = urllib.urlopen(currURL)

# Read the file contents to a variable
file_contents = webFile.read()
print(file_contents)
# > This will be the file contents

# Then write to a new local file
f = open('local file.txt', 'w')
f.write(file_contents)
f.close()

If neither of these applies, please update the question to clarify.

Source

2015-10-06 14:02:15
