2014-12-05 131 views
1

I used TF-IDF to query Google+ data and saved the data as a JSON file. When I process this file, I get an error: TypeError: string indices must be integers. How can I fix it?

Code

import json 
import nltk 

DATA = 'C:/Users/Dung Ring/Desktop/kpdl/107033731246200681024.json' 
data = json.loads(open(DATA).read()) 

QUERY_TERMS = ['SOPA'] 

activities = [activity['object']['content'].lower().split()
              for activity in data
              if activity['object']['content'] != ""]

# TextCollection provides tf, idf, and tf_idf abstractions so 
# that we don't have to maintain/compute them ourselves 

tc = nltk.TextCollection(activities) 

relevant_activities = [] 

for idx in range(len(activities)):
    score = 0
    for term in [t.lower() for t in QUERY_TERMS]:
        score += tc.tf_idf(term, activities[idx])
    if score > 0:
        relevant_activities.append({'score': score, 'title': data[idx]['title'],
                                    'url': data[idx]['url']})

# Sort by score and display results 

relevant_activities = sorted(relevant_activities, key=lambda p: p['score'], reverse=True) 
for activity in relevant_activities:
    print activity['title']
    print '\tLink: %s' % (activity['url'],)
    print '\tScore: %s' % (activity['score'],)
    print

Error message

Traceback (most recent call last):
  File "ex9.py", line 11, in <module>
    if activity['object']['content']!= ""]
TypeError: string indices must be integers

I am using Python 2.7.

+1

If you want something more specific than my answer, you will have to give us an example of your JSON file. – LeartS 2014-12-05 09:35:34

+0

Are we supposed to guess what your data looks like? The only thing that is obvious is that either 'activity' or 'activity['object']' is a string. – 2014-12-05 09:57:51

+0

Here is my JSON file. Link: https://www.dropbox.com/s/rl72mwmi4rifqjq/107033731246200681024.json?dl=0 – 2014-12-05 10:04:11

Answer

0

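Either activity or activity['object'] is a string, not the dictionary you expect. Indexing a string with a string key such as ['object'] is what raises this TypeError. Print data and check its structure.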
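
One way to do that check is sketched below. The actual file was not posted inline, so SAMPLE is a made-up stand-in, and the "items" wrapper key is only an assumption about how an exported Google+ activity feed might be laid out. The helpers describe and extract_activities are hypothetical names used for illustration. If the top-level JSON value is a dict, iterating over it yields its keys (strings), which would explain the error in the question.

```python
import json

# Hypothetical sample standing in for the real file; its actual layout is unknown.
SAMPLE = json.loads('''
{
  "kind": "plus#activityFeed",
  "items": [
    {"title": "SOPA post", "url": "http://example.com/1",
     "object": {"content": "Stop SOPA now"}},
    {"title": "Empty post", "url": "http://example.com/2",
     "object": {"content": ""}}
  ]
}
''')


def describe(data):
    """Return a short summary of the top-level JSON structure."""
    if isinstance(data, dict):
        return 'dict with keys: %s' % sorted(data.keys())
    if isinstance(data, list):
        first = type(data[0]).__name__ if data else 'nothing'
        return 'list of %d elements, first is %s' % (len(data), first)
    return type(data).__name__


def extract_activities(data):
    """Return the activity dicts, unwrapping a top-level dict if present."""
    if isinstance(data, dict):
        # Assumption: the activities are stored under an "items" key.
        data = data.get('items', [])
    # Skip anything that is not a dict so that ['object'] lookups are safe.
    return [a for a in data if isinstance(a, dict)]


print(describe(SAMPLE))

items = extract_activities(SAMPLE)
tokens = [a['object']['content'].lower().split()
          for a in items
          if a['object']['content'] != ""]
print(tokens)
```

If describe reports a dict for the real file, the list comprehension in the question should loop over the list stored inside that dict rather than over data itself. The title and url lookups later in the script would then need to index that same list.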
