Finding the average by key from a CSV in Python

I have a simple two-column CSV and need to find the average of the values for each key. Input CSV:

A,2 
B,3 
A,1 
C,2 
B,2 
D,4 
C,2

Desired output:

{'A': 1.5, 'B': 2.5, 'C': 2, 'D': 4}

Code so far:

import csv

csvfile = open("data.csv")
csv_reader = csv.reader(csvfile, delimiter=',')
for row in csv_reader:
    print(row[0], row[1])

Source

2017-08-05 user8420144

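This is a good, clear problem statement. Now try writing some code to implement it. If you get stuck, tell us where you are stuck and why.

Do you have 'pandas'?

Have you considered a suitable container datatype? – wwii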

Option A

Using csv

import csv 
import collections 

# Collect every value seen for each key
out = collections.defaultdict(list)
with open('file.csv') as f:
    for line in csv.reader(f):
        out[line[0]].append(int(line[1]))

# Replace each list of values with its average
for k in out:
    out[k] = sum(out[k])/len(out[k])

print(dict(out)) 

{'A': 1.5, 'B': 2.5, 'C': 2.0, 'D': 4.0}
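A minimal variation on Option A, offered as a sketch rather than a definitive version: it builds a separate result dict with `statistics.mean` instead of overwriting the lists, and skips blank lines. The in-memory `sample` and the names `groups` and `averages` are assumptions made for illustration; in practice you would pass an open file to `csv.reader` instead.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# In-memory stand-in for 'file.csv' (same rows as the question), so the sketch runs as-is.
sample = io.StringIO("A,2\nB,3\nA,1\nC,2\nB,2\nD,4\nC,2\n")

groups = defaultdict(list)
for row in csv.reader(sample):
    if not row:
        # csv.reader yields an empty list for a blank line; skip it
        continue
    key, value = row
    groups[key].append(float(value))

# Build a new dict of averages, leaving the grouped values untouched
averages = {key: mean(values) for key, values in groups.items()}
print(averages)
```

Keeping `groups` intact means the raw values are still available afterwards, for example to compute a count or a median per key.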

Option B

Using pandas

import pandas as pd 

df = pd.read_csv('file.csv', header=None, names=['Key', 'Value']) 
out = df.groupby('Key').mean() 

print(out.Value.to_dict()) 

{'A': 1.5, 'B': 2.5, 'C': 2.0, 'D': 4.0}

Source

2017-08-05 04:53:55

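Nice solution, but I wonder why you set 'as_index=False'. If you don't, you can say 'out.Value.to_dict()' to get the format the OP requested.

@JohnZwinck The 'Key' column becomes the index. I don't like how that looks. Really, that's all. :p

@JohnZwinck But that's a nice idea :) Thanks.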
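To make the 'as_index' point in these comments concrete, here is a small sketch comparing the two forms. It assumes pandas is installed; the in-memory `sample` and the variable names are made up for illustration.

```python
import io
import pandas as pd

# In-memory stand-in for 'file.csv' (same rows as the question)
sample = io.StringIO("A,2\nB,3\nA,1\nC,2\nB,2\nD,4\nC,2\n")
df = pd.read_csv(sample, header=None, names=['Key', 'Value'])

# Default (as_index=True): 'Key' becomes the index of the result,
# so the 'Value' column converts straight to the requested dict.
indexed = df.groupby('Key').mean()
from_index = indexed['Value'].to_dict()

# as_index=False: 'Key' stays an ordinary column, so it has to be
# set as the index before converting to a dict.
flat = df.groupby('Key', as_index=False).mean()
from_column = flat.set_index('Key')['Value'].to_dict()

print(from_index)
print(from_column)
```

Both routes should give the same mapping; the only difference is whether 'Key' ends up as the index or as a regular column of the grouped frame.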

I think you can use the following code:

import csv
from collections import OrderedDict

data = OrderedDict()

# newline='' is what the csv module recommends when opening files in Python 3
with open('data.csv', newline='') as csvfile:
    content = csv.reader(csvfile, delimiter=',')
    for index, value in content:
        if index not in data:
            # first time this key is seen: initialize count and total
            data[index] = {'times': 1, 'total': float(value)}
        else:
            # key already present: update the running count and total
            data[index]['times'] += 1
            data[index]['total'] += float(value)

def average(data):
    results = OrderedDict()

    for index, values in data.items():
        results[index] = values['total'] / values['times']

    return results

print(average(data))

Result with the example data:

OrderedDict([('A', 1.5), ('B', 2.5), ('C', 2.0), ('D', 4.0)])
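As a possible simplification of the same running-total idea: since Python 3.7 a plain dict preserves insertion order, so the keys come out in first-seen order without OrderedDict. The helper name `average_by_key` and the in-memory `sample` below are hypothetical, used only to keep the sketch self-contained.

```python
import csv
import io

def average_by_key(rows):
    """Return {key: average} from an iterable of (key, value) rows."""
    totals = {}  # key -> [count, running total]
    for key, value in rows:
        entry = totals.setdefault(key, [0, 0.0])
        entry[0] += 1
        entry[1] += float(value)
    return {key: total / count for key, (count, total) in totals.items()}

# In-memory stand-in for 'data.csv' (same rows as the question)
sample = io.StringIO("A,2\nB,3\nA,1\nC,2\nB,2\nD,4\nC,2\n")
result = average_by_key(csv.reader(sample))
print(result)
```

Only a count and a total are stored per key, so memory use does not grow with the number of rows.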

HTH

Source

2017-08-05 05:07:59 Alberto
