2014-12-04 86 views
8

Python boto: list the contents of a specific directory in a bucket

I only have access to a specific directory in an S3 bucket.

For example, with the s3cmd command, if I try to list the whole bucket:

$ s3cmd ls s3://my-bucket-url 

I get an error: Access to bucket 'my-bucket-url' was denied

But if I try to access a specific directory in the bucket, I can see its contents:

$ s3cmd ls s3://my-bucket-url/dir-in-bucket 

Now I want to connect to the S3 bucket with Python boto. Similarly, with:

bucket = conn.get_bucket('my-bucket-url') 

I get an error: boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden

But if I try:

bucket = conn.get_bucket('my-bucket-url/dir-in-bucket') 

the script stalls for about 10 seconds and then prints an error. The full traceback is below. Any ideas on how to proceed?

Traceback (most recent call last): 
    File "test_s3.py", line 7, in <module> 
    bucket = conn.get_bucket('my-bucket-url/dir-name') 
    File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 471, in get_bucket 
    return self.head_bucket(bucket_name, headers=headers) 
    File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 490, in head_bucket 
    response = self.make_request('HEAD', bucket_name, headers=headers) 
    File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 633, in make_request 
    retry_handler=retry_handler 
    File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 1046, in make_request 
    retry_handler=retry_handler) 
    File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 922, in _mexe 
    request.body, request.headers) 
    File "/usr/lib/python2.7/httplib.py", line 958, in request 
    self._send_request(method, url, body, headers) 
    File "/usr/lib/python2.7/httplib.py", line 992, in _send_request 
    self.endheaders(body) 
    File "/usr/lib/python2.7/httplib.py", line 954, in endheaders 
    self._send_output(message_body) 
    File "/usr/lib/python2.7/httplib.py", line 814, in _send_output 
    self.send(msg) 
    File "/usr/lib/python2.7/httplib.py", line 776, in send 
    self.connect() 
    File "/usr/lib/python2.7/httplib.py", line 1157, in connect 
    self.timeout, self.source_address) 
    File "/usr/lib/python2.7/socket.py", line 553, in create_connection 
    for res in getaddrinfo(host, port, 0, SOCK_STREAM): 
socket.gaierror: [Errno -2] Name or service not known 
+0

Maybe you should use my-bucket-url/dir-in-bucket in your script instead of 'my-bucket-url/my-bucket-url'? – 2014-12-04 10:57:38

+0

Sorry, I made a mistake while removing the actual bucket and directory names. – 2014-12-04 12:22:00

Answers

16

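By default, when you make a get_bucket call in boto, it tries to validate that you actually have access to that bucket by performing a HEAD request on the bucket URL. In this case you don't want boto to do that, because you don't have access to the bucket itself. So, do this: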

bucket = conn.get_bucket('my-bucket-url', validate=False) 

Then you should be able to do something like this to list the objects:

for key in bucket.list(prefix='dir-in-bucket'): 
    print(key.name)  # replace with whatever you need to do with each key

If you still get a 403 error, try adding a trailing slash to the prefix.

for key in bucket.list(prefix='dir-in-bucket/'): 
    print(key.name)  # replace with whatever you need to do with each key
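
The two steps above can be folded into one small helper. This is only a sketch: the function name list_keys_under is hypothetical, conn is assumed to be an already-open boto S3 connection, and the prefix is assumed to be non-empty. It uses only get_bucket(..., validate=False) and bucket.list(prefix=...) as shown in this answer.

```python
def list_keys_under(conn, bucket_name, prefix):
    """Return the names of keys under `prefix` without validating the bucket.

    `conn` is assumed to be a boto S3 connection; this helper name is
    hypothetical and not part of boto.
    """
    # validate=False skips the request against the bucket root, which
    # fails with 403 when the credentials only cover one directory.
    bucket = conn.get_bucket(bucket_name, validate=False)
    # Normalize the (non-empty) prefix so it ends with exactly one slash.
    prefix = prefix.rstrip('/') + '/'
    return [key.name for key in bucket.list(prefix=prefix)]

# Usage (needs boto and real credentials, so it is left commented out):
#   import boto
#   conn = boto.connect_s3()
#   for name in list_keys_under(conn, 'my-bucket-url', 'dir-in-bucket'):
#       print(name)
```

Passing the connection in keeps the helper easy to try against a stub object before pointing it at a real bucket.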
+0

Thanks, this worked for me. I only had to add a slash ('/') at the end of the bucket name, otherwise I still got the 403 error. – 2014-12-04 13:04:45

+0

Yes, that makes sense. I approved your edit to my example. Glad it worked for you. – garnaat 2014-12-04 13:18:59

+0

Why is the trailing "/" required? I can confirm that it is required in my case, but I can't find any documentation for it. – dbn 2016-12-13 00:34:53
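
One possible explanation, which nothing in this thread confirms: the access policy may allow listing only when the requested prefix matches a pattern such as 'dir-in-bucket/*'. Under that assumption, a prefix without the trailing slash does not match and the request is denied. The sketch below imitates that matching with Python's fnmatch, which only approximates policy wildcard matching; the pattern list is hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical policy condition: listing is allowed only when the
# requested prefix matches one of these patterns (an assumption, not
# something taken from the thread).
ALLOWED_PREFIX_PATTERNS = ['dir-in-bucket/*']

def prefix_allowed(prefix, patterns=ALLOWED_PREFIX_PATTERNS):
    """Return True if the requested prefix matches any allowed pattern."""
    return any(fnmatchcase(prefix, pattern) for pattern in patterns)

print(prefix_allowed('dir-in-bucket'))   # False: no trailing slash, no match
print(prefix_allowed('dir-in-bucket/'))  # True: '*' may match the empty string
```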

0

If you want to list all the objects in a folder of a bucket, you can specify the folder in the list call.

import boto 
conn = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) 
bucket = conn.get_bucket(AWS_BUCKET_NAME) 
for key in bucket.list(prefix="FOLDER_NAME/", delimiter="/"): 
    print(key.name)  # replace with whatever you need to do with each entry
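
Note that passing "/" as the delimiter changes what comes back: keys in deeper subfolders are rolled up into a single entry per subfolder instead of being listed individually. The sketch below imitates that grouping over a plain list of key names, so it runs without S3 access. The function name list_level and the sample keys are made up for illustration, and it is a simplification of what the service does.

```python
def list_level(keys, prefix='', delimiter=''):
    """Mimic prefix/delimiter grouping over a plain list of key names."""
    results = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Roll everything below the next delimiter into one entry.
            rolled = prefix + rest.split(delimiter, 1)[0] + delimiter
            if rolled not in results:
                results.append(rolled)
        else:
            results.append(key)
    return results

# Made-up key names for illustration only.
keys = [
    'FOLDER_NAME/a.txt',
    'FOLDER_NAME/sub/b.txt',
    'FOLDER_NAME/sub/c.txt',
    'other/d.txt',
]
print(list_level(keys, 'FOLDER_NAME/', '/'))
# ['FOLDER_NAME/a.txt', 'FOLDER_NAME/sub/']
print(list_level(keys, 'FOLDER_NAME/'))
# ['FOLDER_NAME/a.txt', 'FOLDER_NAME/sub/b.txt', 'FOLDER_NAME/sub/c.txt']
```

If you want every object under the folder, including those in subfolders, leave the delimiter out.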
+0

The OP mentioned that 'get_bucket' gives him a 403 – ChrisWue 2017-03-28 01:06:13