2017-05-25 135 views
-2
import json 
import csv 
from watson_developer_cloud import NaturalLanguageUnderstandingV1 
import watson_developer_cloud.natural_language_understanding.features.v1 as \ 
    features 


natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2017-02-27', 
    username='b6dd1781-02e4-4dca-a706-05597d574221', 
    password='c3ked6Ttmmc1') 

response = natural_language_understanding.analyze(
    text='Bruce Banner is the Hulk and Bruce Wayne is BATMAN! ' 
     'Superman fears not Banner, but Wayne.', 
    features=[features.Entities()]) 

response1 = natural_language_understanding.analyze(
    text='Bruce Banner is the Hulk and Bruce Wayne is BATMAN! ' 
     'Superman fears not Banner, but Wayne.', 
    features=[features.Keywords()]) 

#print response.items()[0][1][1] 
make= json.dumps(response, indent=2) 
make1= json.dumps(response1, indent=2) 
print make 
print make1 

x = json.loads(make) 

f = csv.writer(open("Entities.csv", "wb+")) 


f.writerow(["relevance", "text", "type", "count"]) 

for x1 in x: 
    f.writerow([x1['relevance'], 
       x1['text'], 
       x1['type'], 
       x1['count']]) 

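The make variable above holds JSON that has to be converted to CSV, and when I try to do that I get a TypeError: string indices must be integers. The actual problem is that I cannot iterate over the entities and get at the key-value pairs. Can someone tell me what I can do here?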

{ 
    "entities": [ 
    { 
     "relevance": 0.931351, 
     "text": "Bruce Banner", 
     "type": "Person", 
     "count": 3 
    }, 
    { 
     "relevance": 0.288696, 
     "text": "Wayne", 
     "type": "Person", 
     "count": 1 
    } 
    ], 
    "language": "en" 
} 
+1

Please include a short program that produces the error you describe. Please include your actual and expected program output. –

+0

You could put the data in Excel, record a macro that parses it into a .csv, and then convert that script to Python, and so on... – DeerSpotter

Answers

0

If you dump the JSON structure and data to a file, you can use this script to turn the key: value pairs into a CSV file:

# -*- coding: utf-8 -*-
"""
Created on Fri May 26 01:24:44 2017

@author: ITZIK CHAIMOV
"""
import csv


labels = []  # keys found in the file, in order of appearance
values = []  # the value that follows each key

# Assumes the data was dumped to a JSON file laid out exactly as in the question.
fin = open('dataFile.json', 'r')
buffer = fin.readline()  # skip the opening '{'
buffer = fin.readline()  # skip the '"entities": [' line
# Stop at the "language": "en" line, or at end of file.
while buffer != '' and '"en"' not in buffer:
    if '{' in buffer:
        buffer = fin.readline()
        # Collect "key": value lines until the entity's closing brace.
        while buffer != '' and '}' not in buffer:
            labels.append(buffer.split(':')[0].strip())
            values.append(buffer.split(':')[1].strip())
            buffer = fin.readline()
    buffer = fin.readline()
fin.close()

n = len(labels)  # was size(labels), which is not defined
firstLabel = labels[0]
# The column count is the distance to the first repeat of the first label.
i = 0
for lbl in labels:
    if firstLabel == lbl and i != 0:
        break
    i += 1

tbl = []
tbl.append(labels[0:i])  # header row
for j in range(n // i):
    tbl.append(values[j*i:(j+1)*i])  # one row per entity


fout = open('testfile.csv', 'w')
csv_write = csv.writer(fout)
csv_write.writerows(tbl)  # was 'tabl', a typo for 'tbl'
fout.close()

The CSV file as shown in Excel. The quotation marks (' and ") can be removed.
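One way to remove those quotation marks is to clean each cell before it is written. This is only a sketch: clean_cell is a hypothetical helper that is not part of the script above, and the sample lists are what that script would collect for the first entity in the question, assuming the file is laid out exactly as shown there.

```python
def clean_cell(cell):
    """Drop a trailing comma and surrounding double quotes from a raw cell."""
    return cell.strip().rstrip(',').strip().strip('"')


# Raw cells as the line-based parser would collect them (assumed layout).
raw_header = ['"relevance"', '"text"', '"type"', '"count"']
raw_row = ['0.931351,', '"Bruce Banner",', '"Person",', '3']

header = [clean_cell(c) for c in raw_header]
row = [clean_cell(c) for c in raw_row]
```

Applying clean_cell to labels and values before building tbl would give a CSV without the stray quotes and commas.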

-1

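In your loop, x1 takes on the keys of the structure x. To access the value associated with each key you need x[x1]; otherwise you are looking up an index named 'relevance' in x1, which is itself a key of type string.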
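A small sketch of why the error appears, using a dictionary shaped like the JSON in the question. The variable names keys and error_raised are only for illustration.

```python
# The parsed response from the question, reduced to what matters here.
x = {
    "entities": [
        {"relevance": 0.931351, "text": "Bruce Banner", "type": "Person", "count": 3},
        {"relevance": 0.288696, "text": "Wayne", "type": "Person", "count": 1},
    ],
    "language": "en",
}

# Iterating over a dict yields its keys, which are plain strings.
keys = sorted(x)

# Indexing one of those strings with another string raises the TypeError.
key = keys[0]
try:
    key["relevance"]
    error_raised = False
except TypeError:
    error_raised = True
```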

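x holds the entire JSON structure. You are only interested in the list stored under the 'entities' key, which is made up of individual dictionaries. So access that list first, and then go through each key-value pair.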

x1 = x['entities'][0] 
f.writerow([x1['relevance'], 
       x1['text'], 
       x1['type'], 
       x1['count']]) 

The second key is 'language', which maps to the string 'en' rather than to a dictionary.
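Putting this together, a minimal sketch that writes one row for every entity rather than only the first. It is written for Python 3, while the question's code is Python 2. entities_to_csv is a hypothetical helper name, and an in-memory io.StringIO stands in for Entities.csv so the example is self-contained. The JSON text is the sample from the question.

```python
import csv
import io
import json

raw = """
{
  "entities": [
    {"relevance": 0.931351, "text": "Bruce Banner", "type": "Person", "count": 3},
    {"relevance": 0.288696, "text": "Wayne", "type": "Person", "count": 1}
  ],
  "language": "en"
}
"""


def entities_to_csv(json_text, out):
    """Write a header plus one CSV row per item in the "entities" list."""
    data = json.loads(json_text)
    fields = ["relevance", "text", "type", "count"]
    writer = csv.writer(out)
    writer.writerow(fields)
    # Loop over the list under "entities", not over the top-level keys.
    for entity in data["entities"]:
        writer.writerow([entity[name] for name in fields])


out = io.StringIO()  # stand-in for open("Entities.csv", "w", newline="")
entities_to_csv(raw, out)
csv_text = out.getvalue()
```

In the question's script, response already seems to be a dictionary (json.dumps accepts it), so the dumps/loads round trip is probably unnecessary and response['entities'] could be looped over directly.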

+0

Can you show me how to do it by editing the code? I understand your logic, but I can't work out how to code it. –

+0

Take a look at my edited answer. – Antimony