2014-12-07 128 views

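Amazon S3: boto download failures

After moving from Ubuntu 12.04 to 14.04, I started running into a lot of problems downloading files from S3. In roughly 1 out of 20 cases, boto fails to download the file and blocks for 1-2 minutes before raising an exception.

This does not happen with very small files, only with medium and large ones.

I wrote a simple Python script to test this: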

# Python 2 / boto 2.x. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY,
# bucket_name and path must be defined before running.
import datetime
from boto.s3.connection import S3Connection

success = 0
for i in xrange(1000000):
    try:
        start = datetime.datetime.now()
        # Open a fresh connection and fetch the whole object on every attempt.
        s3conn = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
        bucket = s3conn.get_bucket(bucket_name)
        key = bucket.get_key(path)
        content = key.get_contents_as_string()
        delta = datetime.datetime.now() - start
        print 'Downloading completed in', delta.total_seconds(), 's, file size is', len(content), 'bytes'
        success += 1
        # i + 1 is the number of attempts so far, including failed ones.
        print 'Downloaded', i + 1, 'files, success rate: ', float(success)/(i + 1)
    except Exception as exc:
        print 'Error occurred:', exc

Here is some output from this script on my Ubuntu 14.04 machine:

Downloading completed in 1.76665 s, file size is 996320 bytes 
Downloaded 1 files, success rate: 1.0 
Downloading completed in 7.709181 s, file size is 996320 bytes 
Downloaded 2 files, success rate: 1.0 
Downloading completed in 1.762192 s, file size is 996320 bytes 
Downloaded 3 files, success rate: 1.0 
Downloading completed in 7.670499 s, file size is 996320 bytes 
Downloaded 4 files, success rate: 1.0 
Downloading completed in 1.806259 s, file size is 996320 bytes 
Downloaded 5 files, success rate: 1.0 
Downloading completed in 1.992967 s, file size is 996320 bytes 
Downloaded 6 files, success rate: 1.0 
... 
... 
... 
Downloading completed in 6.496797 s, file size is 996320 bytes 
Downloaded 21 files, success rate: 1.0 
Error occurred: [Errno 104] Connection reset by peer 
Downloading completed in 2.31506 s, file size is 996320 bytes 
Downloaded 23 files, success rate: 0.95652173913 
Error occurred: The read operation timed out 
Error occurred: The read operation timed out 
Downloading completed in 1.963559 s, file size is 996320 bytes 
Downloaded 26 files, success rate: 0.884615384615 
Downloading completed in 1.395313 s, file size is 996320 bytes 
Downloaded 27 files, success rate: 0.888888888889 
Downloading completed in 1.416122 s, file size is 996320 bytes 
Downloaded 28 files, success rate: 0.892857142857 
Downloading completed in 1.168238 s, file size is 996320 bytes 
Downloaded 29 files, success rate: 0.896551724138 
Downloading completed in 1.30582 s, file size is 996320 bytes 
Downloaded 30 files, success rate: 0.9 

I ran this script on Windows and Mac machines on the same local network, and the results were 100% fine! I also have no problems on my Ubuntu 12.04 Amazon EC2 instance:

... 
Downloading completed in 2.015681 s, file size is 996320 bytes 
Downloaded 100 files, success rate: 1.0 

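Has anyone run into a similar problem? Where should I look? I tried debugging the boto library, without success. Importantly, when I use other methods of downloading files on this machine I have no download problems; only boto fails. I tried different boto versions: 2.15.0 and 2.34.0.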

Answers

0

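It turned out this has nothing to do with boto, since I was later able to reproduce it with curl.

I fixed the problem for myself by moving the data from the European S3 region to the "US Standard" region, but I am still curious how it can behave this way. All files download fine on one machine in the local network, while another machine on the same network sees 10-20% failures.

I will raise this with Amazon if it bothers me any further.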
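
One way to check whether failures like these depend on boto at all is to run the same success-rate loop over a plain HTTP download. The following is only a sketch in Python 3 using the standard library, not what was actually run here; `fetch_url` and `measure` are hypothetical helper names, and the 30-second timeout is an arbitrary assumption.

```python
import http.client
import time
import urllib.request


def fetch_url(url, timeout=30):
    """Download url with a socket timeout and return the body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def measure(download, attempts):
    """Call download() `attempts` times.

    Returns (successes, failures, durations): the number of successful
    calls, a list of repr() strings for the exceptions raised, and the
    elapsed seconds of each successful call.
    """
    successes = 0
    failures = []
    durations = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            download()
        except (OSError, http.client.HTTPException) as exc:
            # Connection resets, read timeouts and URLError are all OSError
            # subclasses; truncated responses raise HTTPException.
            failures.append(repr(exc))
        else:
            successes += 1
            durations.append(time.monotonic() - start)
    return successes, failures, durations
```

Calling `measure(lambda: fetch_url(url), 100)` with a public or pre-signed URL for the same object should give a failure rate comparable to the boto loop if the problem sits below the library, for example in the network path to one region.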

0

When creating the connection you should specify the region; otherwise it may time out, because it may try other regions.

import boto.s3
conn = boto.s3.connect_to_region(aws_region, **creds)

where aws_region is a string and creds is a dict of your credentials.
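
A slightly fuller sketch of the same call, for boto 2.x. The helper names `build_creds` and `connect` are hypothetical, and the `None` check assumes that `connect_to_region` returns `None` for a region name it does not recognise, which is worth verifying against your boto version.

```python
def build_creds(access_key_id, secret_access_key):
    """Return the credential keyword arguments accepted by boto 2.x."""
    return {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }


def connect(aws_region, creds):
    """Open an S3 connection bound to the endpoint of one region."""
    import boto.s3  # imported here so build_creds works without boto installed

    conn = boto.s3.connect_to_region(aws_region, **creds)
    if conn is None:
        # Assumption: boto 2.x returns None for an unknown region name.
        raise ValueError("unknown S3 region: %r" % (aws_region,))
    return conn
```

For example, `connect("eu-west-1", build_creds(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY))` would target that region directly; "eu-west-1" is only an illustrative region name, so use whichever region actually holds the bucket.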