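Problem accessing a specific column of a CSV file read as an S3 object with boto3

I am reading a csv file from S3 using boto3 and want to access specific columns of that csv. I have the code below, which reads the csv file as an S3 object using boto3, but I am having trouble accessing a specific column of it: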

import boto3 

s3 = boto3.resource('s3',aws_access_key_id = keyId, aws_secret_access_key = sKeyId) 

obj = s3.Object(bucketName, srcFileName) 

filedata = obj.get()["Body"].read() 
print(filedata.decode('utf8')) 

for row in filedata.decode('utf8'): 
    print(row[1]) # Get the column at index 1

When I execute this, print(filedata.decode('utf8')) prints the following on my output console:

51350612,Gary Scott 
10100063,Justin Smith 
10100162,Annie Smith 
10100175,Lisa Shaw 
10100461,Ricardo Taylor 
10100874,Ricky Boyd 
10103593,Hyman Cordero

However, the print(row[1]) line inside the for loop throws IndexError: string index out of range.

How can I get rid of this error and access a specific column of a csv file from S3 using boto3?

Source

2017-06-02 user2966197

To read the CSV properly, import Python's csv module and use one of its readers.

Documentation: https://docs.python.org/2/library/csv.html
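As a minimal sketch of that suggestion, the decoded text can be wrapped in io.StringIO and handed to csv.reader. The filedata bytes below are a stand-in for the result of obj.get()["Body"].read() so that the snippet runs without AWS access; the sample rows are taken from the question.

```python
import csv
import io

# Stand-in for obj.get()["Body"].read(); real code would fetch these bytes from S3.
filedata = b"51350612,Gary Scott\n10100063,Justin Smith\n10100162,Annie Smith\n"

# Decode once, then let csv.reader split each line into a list of fields.
reader = csv.reader(io.StringIO(filedata.decode("utf8")))

# Take the column at index 1 and skip any blank lines.
names = [row[1] for row in reader if row]

print(names)
```

Each row yielded by csv.reader is a list of strings, so row[1] refers to the second column rather than the second character.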

Source

2017-06-02 01:51:54

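I know how to read a csv file. The problem is reading a csv file from Amazon S3 using the 'boto3' package and then accessing a column, which produces the error. – user2966197

obj.get()["Body"].read() retrieves the whole file as a single bytes object. Your code filedata.decode('utf8') only converts that bytes object into a string; no parsing happens here. Iterating over a string yields one character at a time, so row is a one-character string and row[1] raises the IndexError. The following is a shameless copy from another answer.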

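import csv
# ...... code snipped .... insert your boto3 code here
# `response` is the result of obj.get() from the question's code

# Decode the body and split it into lines (in Python 3 the csv module needs str, not bytes)
lines = response['Body'].read().decode('utf-8').splitlines()
# The sample file has no header row, so name the columns explicitly
# (the field names 'id' and 'name' are illustrative)
for row in csv.DictReader(lines, fieldnames=['id', 'name']):
    # each row is a dict keyed by column name
    print(row['name'])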
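To see why the loop in the question fails, the following small sketch reproduces the behaviour without S3. The text variable is a stand-in for filedata.decode('utf8'), using two rows from the question.

```python
text = "51350612,Gary Scott\n10100063,Justin Smith\n"

# Iterating over a str yields one-character strings, not lines.
chars = list(text)[:3]
print(chars)

# Indexing position 1 of a one-character string is what raises the error.
try:
    chars[0][1]
    error = None
except IndexError as exc:
    error = exc
print(repr(error))

# splitlines() gives whole lines, which can then be parsed into columns.
lines = text.splitlines()
print(lines)
```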

If you just have a simple CSV file, a quick and dirty fix will do:

for row in filedata.decode('utf8').splitlines(): 
    items = row.split(',') 
    print(items[0], items[1])
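The split(',') shortcut only holds while no field contains a comma. A short sketch of the difference, using a hypothetical row (not from the question's file) whose name field is quoted:

```python
import csv

# Hypothetical row whose second field contains a comma inside quotes.
line = '10100063,"Smith, Justin"'

naive = line.split(",")           # splits on every comma, including the quoted one
parsed = next(csv.reader([line]))  # honours the quoting rules

print(naive)   # three pieces; the quoted name is cut in two
print(parsed)  # two fields; the quoted name stays intact
```

If the data may ever contain quoted fields, the csv module is likely the safer choice.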

How do I read a csv stored in S3 with csv.DictReader?

Source

2017-06-02 08:27:11 mootmoot
