
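I am new to Python (Python 2.7) and I am trying to run a program that computes the Pearson correlation. The code is from "Collective Intelligence". When I type in the function and run it, the Pearson correlation score fails with an error.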

This is the error I get:

>>> sim_pearson(critics, 
... 'Lisa Rose','Gene Seymour') 
Traceback (most recent call last): 
    File "<stdin>", line 2, in <module> 
    File "recommendations.py", line 49, in sim_pearson 
    sum1=sum([prefs[p1][it]] for it in si) 
TypeError: unsupported operand type(s) for +: 'int' and 'list' 
>>> 

The code is here:

#a dictionary of movie critics and their ratings of a small set of movies 
critics={'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on the Plane':3.5, 'Just My Luck': 1.5, 
         'superman returns': 5.0, 'You, Me and Dupree': 3.5}, 'Gene Seymour':{ 
         'Lady in the water':1.0,'Snakes on the Plane':3.5, 
         'superman returns':5.0, 'You, Me and Dupree':3.5}, 'Michale Philllips':{ 
         'Lady in the Water': 2.5, 'Snakes on the Plane':3.0, 'superman returns': 3.5, 
         'The Night Listenr': 4.0}, 'Cludia Puig':{'Snakes on the Plane':3.5, 'Just My Luck': 3.0, 
         'The Night Listenr': 4.5, 'superman returns': 4.0, 'You, Me and Dupree': 2.5}, 
         'Mick LaSalle':{'Lady in the Water': 3.0, 'Snakes on the Plane':4.0, 'Just My Luck': 2.0, 
         'The Night Listenr': 3.0, 'superman returns': 3.0, 'You, Me and Dupree': 2.0}, 
         'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on the Plane':4.0, 
         'The Night Listenr': 3.0, 'superman returns': 5.0, 'You, Me and Dupree': 3.5}, 
         'Toby':{'Snakes on a Plane': 4.5, 'You, Me Dupree':1.0,'superman returns':4.0}} 
#Returns a distance-based similarity score for person1 and p 
def sim_distance(prefs,person1,person2): 
#get the list of shared_items 
    si={} 
    for item in prefs[person1]: 
     if item in prefs[person2]: 
      si[item]=1 

    #if they have no rating in common returns zero 
    if len(si)==0: 
     return 0 
    #Add up the squares of all the differences 

    sum_of_squares=sum(pow(prefs[person1][item]-prefs[person2][item],2) 
    for item in prefs[person1] if item in prefs[person2]) 
    return 1/(1+sum_of_squares) 

#returns the pearson correlation coefficient for p1 and p2 
def sim_pearson(prefs,p1,p2): 
    #get list of mutually rated items 
    si={} 
    for item in prefs[p1]: 
     if item in prefs[p2]: si[item]=1 
     #find the number of elements 
    n=len(si) 
#if they are no ratings in common, return 0 
    if n==0: return 0 
#add up all the preferences 
    sum1=sum([prefs[p1][it]] for it in si)#reported line 49 
      #^ 
    sum2=sum(prefs[p2][it] for it in si) 
    #sum up the squares 
    sum1sq=sum([pow(prefs[p1][it] for it in si)]) 
    sum2sq=sum([pow(prefs[p2][it] for it in si)]) 
    #sum up the products 
    pSum=sum([prefs[p1][it]*prefs[p2][it] for it in si]) 
    #calculate peason score 
    num=pSum-(sum1*sum2/n) 
    den=sqrt((sum1Sq-pow(sum1,2)/n)*(sum2Sq-pow(sum2,2)/n)) 
    r=num/den 
    return r 

Answer


I think you meant to write

sum1=sum([prefs[p1][it] for it in si]) 

instead of

sum1=sum([prefs[p1][it]] for it in si) 

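(note where the brackets sit). The error means you are trying to add an integer and a list: sum starts from 0, and in your version every element it receives is a one-element list rather than a number.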
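To make the difference concrete, here is a minimal sketch that reproduces the error outside the recommendation code. The names ratings and shared are made-up stand-ins for prefs[p1] and si, not part of the original program.

```python
# Hypothetical stand-ins for prefs[p1] and si from the question.
ratings = {'Snakes on the Plane': 3.5, 'superman returns': 5.0}
shared = ['Snakes on the Plane', 'superman returns']

# Misplaced bracket: every item handed to sum() is a one-element list,
# so sum() attempts 0 + [3.5] and raises TypeError.
try:
    sum([ratings[it]] for it in shared)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Bracket around the whole comprehension: sum() receives plain numbers.
total = sum([ratings[it] for it in shared])

print(error_message)
print(total)
```

The first call fails for the same reason as line 49 of recommendations.py; the second adds the two ratings as intended.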


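Thanks a lot @JulienD for the correction, but I still get this error 'Traceback (most recent call last): File "<stdin>", line 2, in <module> File "recommendations.py", line 49, in sim_pearson ... TypeError: unsupported operand type(s) for +: 'int' and 'list'' >>> –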


'sum()' can consume an iterator directly, without first turning it into a list, so 'sum1 = sum(prefs[p1][it] for it in si)' is cleaner and slightly more efficient. – Neapolitan
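As a sketch of what that suggestion could look like across the whole function, the version below uses generator expressions for every sum. It is not a confirmed fix from this thread: it additionally assumes that pow was meant to receive 2 as its exponent, that sum1sq/sum1Sq were meant to be one name, that sqrt comes from the math module, and that returning 0 for a zero denominator is acceptable. The toy dictionary is hypothetical test data.

```python
from math import sqrt


def sim_pearson(prefs, p1, p2):
    """Pearson correlation of the ratings p1 and p2 have in common."""
    # Items rated by both people.
    si = [item for item in prefs[p1] if item in prefs[p2]]
    if len(si) == 0:
        return 0
    # float() keeps the divisions below exact under Python 2
    # even if the ratings happen to be integers.
    n = float(len(si))

    # Generator expressions feed sum() directly, no intermediate lists.
    sum1 = sum(prefs[p1][it] for it in si)
    sum2 = sum(prefs[p2][it] for it in si)
    # Assumption: the squares were intended, so pow() gets exponent 2.
    sum1_sq = sum(pow(prefs[p1][it], 2) for it in si)
    sum2_sq = sum(pow(prefs[p2][it], 2) for it in si)
    p_sum = sum(prefs[p1][it] * prefs[p2][it] for it in si)

    num = p_sum - (sum1 * sum2 / n)
    den = sqrt((sum1_sq - pow(sum1, 2) / n) * (sum2_sq - pow(sum2, 2) / n))
    # Assumption: treat a zero denominator as "no correlation".
    if den == 0:
        return 0
    return num / den


# Hypothetical toy data: b's ratings are exactly twice a's.
toy = {'a': {'x': 1.0, 'y': 2.0, 'z': 3.0},
       'b': {'x': 2.0, 'y': 4.0, 'z': 6.0}}
score = sim_pearson(toy, 'a', 'b')
print(score)
```

Because the two toy raters are perfectly linearly related, the score should come out as 1 up to floating-point rounding.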


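Thanks @Neapolitan, I made the change, but unfortunately it did not solve the problem 'File "<stdin>", line 2, in <module> File "recommendations.py", line 49, in sim_pearson sum1=sum(prefs TypeError: unsupported operand type(s) for +: 'int' and 'list'' –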