2014-11-21 149 views

Here is my code. It creates a class; rit_object is a private class that type-checks its parameters:

class YearCount(rit_object): 
    __slots__ = ('year', 'count') 
    _types = (int, int) 

Returns a YearCount object:

def createYearCount(year, count): 
    return YearCount(year, count) 

Read through the file. The output should look something like this:

import wordData
words = wordData.readWordFile('very_short.csv')
print(words)
{'airport': [YearCount(year=2007, count=175702), YearCount(year=2008,
count=173294)], 'wandered': [YearCount(year=2005, count=83769),
YearCount(year=2006, count=87688), YearCount(year=2007, count=108634),
YearCount(year=2008, count=171015)], 'request': [YearCount(year=2005,
count=646179), YearCount(year=2006, count=677820), YearCount(year=2007,
count=697645), YearCount(year=2008, count=795265)]}

readWordFile(fileName):

def readWordFile(fileName):
    #read in the entire unigram dataset

    words = {}
    for line in fileName:
        new = line.split(', ')
        print(new)
        id = new[0]
        print(id)
        yc = createYearCount(int(new[1]), int(new[2]))
        # add to list or create a new list
        if not id in words:
            words[id] = [yc]
        else:
            words[id].append(yc)
    print(words)
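
For comparison, here is a minimal sketch of a version that opens the file and returns the dictionary, which is what the `words = wordData.readWordFile(...)` call in the expected output relies on. As posted, `for line in fileName` only reads lines if `fileName` is already an open file object, and the function prints the dictionary instead of returning it. Since rit_object is private, the `YearCount` dataclass below is a stand-in and an assumption, not the original class.

```python
from dataclasses import dataclass


@dataclass
class YearCount:
    # Stand-in for the rit_object-based class (assumption: same two int fields).
    year: int
    count: int


def readWordFile(fileName):
    """Map each word to a list of YearCount built from 'word, year, count' lines."""
    words = {}
    with open(fileName) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            word, year, count = line.split(', ')
            # Append to the word's list, creating the list on first sight.
            words.setdefault(word, []).append(YearCount(int(year), int(count)))
    return words
```

Returning the dictionary instead of printing it lets the caller pass it on to other functions such as totalOccurences.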

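Given the dictionary produced by my readWordFile, is my totalOccurences function working correctly to produce the total number of occurrences of a word over all the years?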

def totalOccurences(word, words):
    count = 0
    if words[id] == word:
        count += YearCount.count
    return count

Text file:

airport, 2007, 175702 
airport, 2008, 173294 
request, 2005, 646179 
request, 2006, 677820 
request, 2007, 697645 
request, 2008, 795265 
wandered, 2005, 83769 
wandered, 2006, 87688 
wandered, 2007, 108634 
wandered, 2008, 171015 

Answer


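In totalOccurences you use the variable id, but it is not defined anywhere in the function: if words[id] == word. I think what you are trying to do is sum all the counts in words[word]. The function would then become: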

def totalOccurences(word, words):
    if word not in words:
        return 0
    count = 0
    for item in words[word]:
        count += item.count
    return count

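If the word is not in words, the function returns 0. Otherwise it iterates over the elements of words[word] (which is a list) and adds up all of their .count values. The result is the total number of occurrences of word across all the entries in words[word].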
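
To check the summing logic against the sample data from the question, here is a small self-contained sketch. It uses an equivalent compact form with `sum` and `dict.get` rather than the explicit loop. The `YearCount` dataclass is a stand-in for the private rit_object-based class (an assumption), and the `words` dictionary is typed in by hand from the text file instead of being read from disk.

```python
from dataclasses import dataclass


@dataclass
class YearCount:
    # Stand-in for the rit_object-based class (assumption: same two int fields).
    year: int
    count: int


def totalOccurences(word, words):
    """Sum the .count of every YearCount stored for word; 0 if the word is absent."""
    return sum(item.count for item in words.get(word, []))


# Sample data copied from the text file in the question.
words = {
    'airport': [YearCount(2007, 175702), YearCount(2008, 173294)],
    'request': [YearCount(2005, 646179), YearCount(2006, 677820),
                YearCount(2007, 697645), YearCount(2008, 795265)],
    'wandered': [YearCount(2005, 83769), YearCount(2006, 87688),
                 YearCount(2007, 108634), YearCount(2008, 171015)],
}

print(totalOccurences('airport', words))   # 348996
print(totalOccurences('request', words))   # 2816909
print(totalOccurences('missing', words))   # 0
```

The key is looked up by the word argument itself. In the question's version, `words[id]` would use Python's built-in `id` function as the key, which is not in the dictionary.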