2012-03-17 85 views
2

How do I create a dictionary/list of queries with scores, then randomly pick some and add up their scores?

Suppose I have some .csv data like this:

query, score1, score2, score3 

kobe bryant,0,3,1, 
ccny,1,1,2, 
lego,3,1,0, 
disney,4,0,0, 
power rangers,2,0,2, 
britney spears,2,0,2, 
backstreet boys,2,1,1, 
soccer,3,0,1, 
justin beaver,2,0,2, 
new york knicks,2,1,1 

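After adding them up, I'd like to get scores similar to this:

score1 = 10; score2 = 4; score3 = 18; 

How do I go about splitting this up and adding the values?

This is what I have so far: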

import random 

def getScores(): 
    # open files to read 
    web = open("page.txt", "r") 
    img = open("image.txt", "r") 

    # scores for each search engine results 
    gScore = 0 
    bScore = 0 
    yScore = 0 

    webDict = [] 
    imgDict = [] 

    # split by ',' 
    tmp = img.read().split(",") 
    for i in range(0, len(tmp)-4, 4): 
        gScore = gScore + int(tmp[i+1]) 
        bScore = bScore + int(tmp[i+2]) 
        yScore = yScore + int(tmp[i+3]) 

    print("gScore is: ", gScore, "\n") 
    print("bScore is: ", bScore, "\n") 
    print("yScore is: ", yScore, "\n") 

    tmp = web.read().split(",") 
    for i in range(0, len(tmp)-4, 4): 
        gScore = gScore + int(tmp[i+1]) 
        bScore = bScore + int(tmp[i+2]) 
        yScore = yScore + int(tmp[i+3]) 

    print("gScore is: ", gScore, "\n") 
    print("bScore is: ", bScore, "\n") 
    print("yScore is: ", yScore, "\n") 

if __name__ == "__main__": 
    getScores() 

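This adds up all the scores, but I can't work out how to build a dictionary from the data.

What I mean is something like this:

bigList = {'query': {'score1': int, 'score2': int, 'score3': int}, 'query2': {'score1': int, 'score2': int, 'score3': int}, ...}  # and so on 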
+0

@Marcin Edited with code and more details. – iCodeLikeImDrunk 2012-03-17 22:29:54

+2

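Good, now can you narrow it down to the task you don't know how to accomplish (do what with a dictionary?) and leave out the rest? Nobody wants to read your homework, so asking a better question is a good way to get a useful answer. – alexis 2012-03-17 22:42:23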

Answers

3

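Once you've split it on commas, it can easily be processed in a single statement:

gScore, bScore, yScore = [ 
    sum(map(int, scores)) for scores in (data[n::4] for n in range(1, 4)) 
] 

The data[n::4] part takes every 4th item from data, starting at the appropriate offset n for each type of score. Each group is then converted to integers and summed.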
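
As a minimal sketch of how that slicing behaves, the snippet below assumes `data` is the flat list produced by splitting the question's sample rows on commas. The answer does not show how `data` is built, so the `text` string and the `queries` variable are assumptions added for illustration.

```python
# Sample rows from the question, joined into one comma-separated string
# (assumed layout: query, score1, score2, score3, repeated).
text = (
    "kobe bryant,0,3,1,ccny,1,1,2,lego,3,1,0,disney,4,0,0,"
    "power rangers,2,0,2,britney spears,2,0,2,backstreet boys,2,1,1,"
    "soccer,3,0,1,justin beaver,2,0,2,new york knicks,2,1,1"
)
data = text.split(",")

# data[n::4] starts at index n and then takes every 4th item,
# so n = 0 selects the queries and n = 1, 2, 3 the three score columns.
queries = data[0::4]
gScore, bScore, yScore = [
    sum(map(int, data[n::4])) for n in range(1, 4)
]

print(queries[:3])             # ['kobe bryant', 'ccny', 'lego']
print(gScore, bScore, yScore)  # 21 7 12
```

This only lines up when every field is separated by exactly one comma. A file read in one go, with newlines and trailing commas as in the question, would likely need cleaning or row-by-row parsing first.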

1

I would first split the string on commas:

stuff = 'kobe bryant,0,3,1,ccny,1,1,2,lego,3,1,0,disney,4,0,0,power rangers,2,0,2,britney spears,2,0,2,backstreet boys,2,1,1,soccer,3,0,1,justin beaver,2,0,2,new york knicks,2,1,1' 
parts = stuff.split(',') 

len(parts) should be a multiple of 4; if it isn't, you can raise an exception:

if len(parts)%4: 
    raise ValueError('bad csv') 

Then do something like:

d = {'score1': 0, 'score2': 0, 'score3': 0} 
# integer division so the loop bound stays an int on Python 3 as well 
for i in range(len(parts)//4): 
    d['score1'] += int(parts[4*i+1]) 
    d['score2'] += int(parts[4*i+2]) 
    d['score3'] += int(parts[4*i+3]) 

print(d) 

I get

{'score1': 21, 'score2': 7, 'score3': 12} 
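
The question also asks for a per-query dictionary and for adding up the scores of a random selection. One possible sketch follows, assuming the same flat `parts` layout used in this answer. The names `scores_by_query` and `sum_random_queries` and the sample size of 3 are illustrative and do not come from the original post.

```python
import random

stuff = ('kobe bryant,0,3,1,ccny,1,1,2,lego,3,1,0,disney,4,0,0,'
         'power rangers,2,0,2,britney spears,2,0,2,backstreet boys,2,1,1,'
         'soccer,3,0,1,justin beaver,2,0,2,new york knicks,2,1,1')
parts = stuff.split(',')

# Map each query to its own dictionary of scores.
scores_by_query = {}
for i in range(0, len(parts), 4):
    query = parts[i].strip()
    scores_by_query[query] = {
        'score1': int(parts[i + 1]),
        'score2': int(parts[i + 2]),
        'score3': int(parts[i + 3]),
    }


def sum_random_queries(table, k, rng=random):
    """Pick k distinct queries at random and total each score column."""
    chosen = rng.sample(sorted(table), k)
    totals = {'score1': 0, 'score2': 0, 'score3': 0}
    for query in chosen:
        for name, value in table[query].items():
            totals[name] += value
    return chosen, totals


chosen, totals = sum_random_queries(scores_by_query, 3)
print(chosen)   # three random queries; differs between runs
print(totals)   # their summed scores
```

Keying the dictionary by query assumes every query string is unique; if the files can contain duplicates, a list of (query, scores) pairs would be the safer structure.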
+2

You don't need the loop: 'sum(parts[1::4])' and so on, or maybe even 'dict(("score%d" % n, sum(parts[i+1::4])) for i in (1, 2, 3))' (untested). – WolframH 2012-03-17 22:43:46

+1

Thanks for the reminder about that fancy extended slice notation. It comes in very handy sometimes. – 2012-03-17 22:51:58

+1

@WolframH My answer shows how to do it with only implicit loops :) – agf 2012-03-17 22:52:20
