Importing dictionary-like data into pandas

I have a data file written in a dictionary-like format:

{"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"} 
{"score": [0.9997120499610901, 0.00028794570243917406], "key": "AmG8StB8hM2k", "label": "0"} 
{"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"} 
{"score": [0.9999467134475708, 5.334055458661169e-05], "key": "AmGdF7cY4X22", "label": "0"}

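What I want to do is import them into pandas with the columns 'key', 'label' and 'score', with the two numeric values placed in separate columns. I have tried importing the file as a dictionary, but I get: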

ValueError: too many values to unpack

Any suggestions on how to solve this?


2017-04-24 Brian O' Halloran

This error occurs because your file probably contains something that does not conform to dictionary format.

I think you need the parameter lines=True in read_json:

import pandas as pd

# each line of the file is a separate JSON object
df = pd.read_json('file.json', lines=True)
print(df)
            key  label                                         score
0  Am2mVTMbhd0y      0   [0.999580323696136, 0.00041968212462900004]
1  AmG8StB8hM2k      0  [0.9997120499610901, 0.00028794570243900004]
2  Alt137zv2nY6      0     [0.8841496109962461, 0.11585044860839801]
3  AmGdF7cY4X22      0    [0.99994671344757, 5.3340554586611695e-05]

print(type(df['score'].iat[0]))
<class 'list'>

To convert the lists to columns, use the DataFrame constructor with concat:

# drop the list column, then append one new column per list element
df = pd.concat([df.drop(columns='score'),
                pd.DataFrame(df['score'].values.tolist()).add_prefix('score')], axis=1)
print(df)
            key  label    score0    score1
0  Am2mVTMbhd0y      0  0.999580  0.000420
1  AmG8StB8hM2k      0  0.999712  0.000288
2  Alt137zv2nY6      0  0.884150  0.115850
3  AmGdF7cY4X22      0  0.999947  0.000053
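
If you want to try both steps without the file, here is a minimal sketch. It assumes the file holds exactly one JSON object per line, and it uses an in-memory io.StringIO buffer (my stand-in, not part of the question) in place of 'file.json'. The names raw, scores and result are only illustrative.

```python
import io

import pandas as pd

# Stand-in for the data file (assumption: one JSON object per line).
raw = io.StringIO(
    '{"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"}\n'
    '{"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"}\n'
)

# lines=True tells read_json to parse each line as its own record.
df = pd.read_json(raw, lines=True)

# Expand every two-element list into its own columns, keeping the row index aligned.
scores = pd.DataFrame(df['score'].tolist(), index=df.index).add_prefix('score')

# Replace the original list column with the expanded columns.
result = pd.concat([df.drop(columns='score'), scores], axis=1)
print(result.columns.tolist())
```

Passing index=df.index to the constructor is a small precaution: concat aligns on the index, so this keeps the rows matched even if df was filtered or re-indexed beforehand.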


2017-04-24 13:44:34 jezrael

Perfect! Thanks!

import pandas as pd

# add your data to a list
data = [{"score": [0.9995803236961365, 0.00041968212462961674], "key": "Am2mVTMbhd0y", "label": "0"},
        {"score": [0.9997120499610901, 0.00028794570243917406], "key": "AmG8StB8hM2k", "label": "0"},
        {"score": [0.8841496109962463, 0.11585044860839844], "key": "Alt137zv2nY6", "label": "0"},
        {"score": [0.9999467134475708, 5.334055458661169e-05], "key": "AmGdF7cY4X22", "label": "0"}]
# create the dataframe
df = pd.DataFrame(data)
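
Note that this leaves 'score' as a single column of lists. A possible way to build the list of records from the file's lines and split the two values at the same time is sketched below. It assumes one JSON object per line and exactly two values in every 'score' list. The lines list is a hypothetical stand-in for iterating over the open file, and records is just an illustrative name.

```python
import json

import pandas as pd

# Hypothetical stand-in for the lines of the data file
# (with a real file you would iterate over the open file object instead).
lines = [
    '{"score": [0.9997120499610901, 0.00028794570243917406], "key": "AmG8StB8hM2k", "label": "0"}',
    '',  # blank lines are skipped below
    '{"score": [0.9999467134475708, 5.334055458661169e-05], "key": "AmGdF7cY4X22", "label": "0"}',
]

records = []
for line in lines:
    line = line.strip()
    if not line:
        continue
    row = json.loads(line)
    # Assumes exactly two values per list; unpacking raises ValueError otherwise.
    row['score0'], row['score1'] = row.pop('score')
    records.append(row)

# Fix the column order explicitly.
df = pd.DataFrame(records, columns=['key', 'label', 'score0', 'score1'])
print(df)
```

With json.loads the 'label' values stay strings ("0"), because no type inference is applied to them.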


2017-04-24 13:42:21 Allen
