2017-07-07 80 views
2

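Importing JSON into pandas

I have to work with JSON coming from an API (for example, my_json). The array of entities is stored under a key called entities: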

{ 
    "action" : "get", 
    "application" : "4d97323f-ac0f-11e6-b1d4-0eec2415f3df", 
    "params" : { 
     "limit" : [ "2" ] 
    }, 
    "path" : "/businesses", 
    "entities" : [ 
     { 
      "uuid" : "508d56f1-636b-11e7-9928-122e0737977d", 
      "type" : "business", 
      "size" : 730 }, 
     { 
      "uuid" : "2f3bd4dc-636b-11e7-b937-0ad881f403bf", 
      "type" : "business", 
      "size" : 730 
     } ], 
    "timestamp" : 1499469891059, 
    "duration" : 244, 
    "count" : 2 
} 

I tried to load it into a DataFrame as follows:

import pandas as pd 

pd.read_json(my_json['entities'], orient='split') 

I get the following error:

ValueError: Invalid file path or buffer object type: <type 'list'> 

I also tried the records orient, but it still does not work.

+0

Could you please add the contents of 'my_json' to your question? – Infinity

Answers

0

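The way you are using my_json['entities'] makes it look like my_json is a Python dict.

According to the pandas documentation, read_json accepts "a valid JSON string or file-like". You may be able to convert the dict to a JSON string as follows: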

import json 
json_str = json.dumps(my_json["entities"]) 

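The data under the "entities" key, as you have described it, does not fit the format that orient="split" expects. It looks like you will need orient="records", the orient for a list of row objects. Reading the full object this way gives one column holding the raw dicts: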

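from io import StringIO

import pandas as pd

# The whole response object, with the array nested under "entities".
my_json = """{
    "entities": [
        {
            "type": "business",
            "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf",
            "size": 918
        },
        {
            "type": "business",
            "uuid": "054a7650-b36a-11e6-a734-122e0737977d",
            "size": 984
        }
    ]
}"""

# Recent pandas versions deprecate passing a raw JSON string, so wrap it in a file-like object.
print(pd.read_json(StringIO(my_json), orient='records'))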

The original run, under Python 2 (hence the u'' prefixes), produced:

                                            entities
0  {u'type': u'business', u'uuid': u'199bca3e-baf...
1  {u'type': u'business', u'uuid': u'054a7650-b36...

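If the JSON string contains only the array of entities, each key becomes a column:

from io import StringIO

import pandas as pd

# Only the array of entities, with no enclosing object.
my_json = """[
    {
        "type": "business",
        "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf",
        "size": 918
    },
    {
        "type": "business",
        "uuid": "054a7650-b36a-11e6-a734-122e0737977d",
        "size": 984
    }
]"""

# Recent pandas versions deprecate passing a raw JSON string, so wrap it in a file-like object.
print(pd.read_json(StringIO(my_json), orient='records'))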

This produces:

   size      type                                  uuid
0   918  business  199bca3e-baf6-11e6-861b-0ad881f403bf
1   984  business  054a7650-b36a-11e6-a734-122e0737977d
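
To make the difference between the two orients concrete, here is a minimal sketch that builds the same two rows in both layouts and reads each one back. The variable names (entities, records_json, split_json) are illustrative only and do not come from the question, and the "split" payload is hand-built here as an assumption about what that orient expects: an object with "columns", "index" and "data" keys.

```python
import json
from io import StringIO

import pandas as pd

# Sample rows copied from the example above.
entities = [
    {"type": "business", "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf", "size": 918},
    {"type": "business", "uuid": "054a7650-b36a-11e6-a734-122e0737977d", "size": 984},
]

# orient="records": a JSON array with one object per row.
records_json = json.dumps(entities)

# orient="split": a JSON object with "columns", "index" and "data" keys.
split_json = json.dumps({
    "columns": ["type", "uuid", "size"],
    "index": [0, 1],
    "data": [[e["type"], e["uuid"], e["size"]] for e in entities],
})

# StringIO wraps each JSON text in a file-like object, which read_json accepts.
df_records = pd.read_json(StringIO(records_json), orient="records")
df_split = pd.read_json(StringIO(split_json), orient="split")

print(df_records)
print(df_split)
```

The API response in the question is an array of objects, which is why it matches the "records" layout and not "split".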
0

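danielcorin pointed me in the right direction. I ended up having to do:

pd.read_json(json.dumps(b_j['entities']), orient='records')

read_json expects a string (or file-like object), so I dumped the entities collection to a JSON string and passed that in.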
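
For reference, the round trip described above can be sketched end to end as follows. The dict b_j here is a hypothetical stand-in for the already-parsed API response, trimmed to a few keys from the question, and the StringIO wrapper is an addition to suit recent pandas versions, which deprecate passing a raw JSON string.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the parsed API response (a Python dict).
b_j = {
    "path": "/businesses",
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
        {"uuid": "2f3bd4dc-636b-11e7-b937-0ad881f403bf", "type": "business", "size": 730},
    ],
    "count": 2,
}

# Serialise only the list of entities back to JSON text...
entities_json = json.dumps(b_j["entities"])

# ...and hand read_json a file-like object holding that text.
df = pd.read_json(StringIO(entities_json), orient="records")
print(df)
```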

2

If my_json is a dictionary, as I suspect, then you can skip pd.read_json and just do

pd.DataFrame(my_json['entities']) 

   size      type                                  uuid
0   730  business  508d56f1-636b-11e7-9928-122e0737977d
1   730  business  2f3bd4dc-636b-11e7-b937-0ad881f403bf
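
A self-contained sketch of this approach, assuming my_json is the response from the question already parsed into a dict. The columns argument is optional and is added here only to pin the column order, which otherwise depends on the pandas and Python versions in use.

```python
import pandas as pd

# The response from the question, assumed to be already parsed into a dict.
my_json = {
    "action": "get",
    "application": "4d97323f-ac0f-11e6-b1d4-0eec2415f3df",
    "params": {"limit": ["2"]},
    "path": "/businesses",
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
        {"uuid": "2f3bd4dc-636b-11e7-b937-0ad881f403bf", "type": "business", "size": 730},
    ],
    "timestamp": 1499469891059,
    "duration": 244,
    "count": 2,
}

# A list of dicts maps directly onto rows; no JSON round trip is needed.
df = pd.DataFrame(my_json["entities"], columns=["uuid", "type", "size"])
print(df)
```

This avoids serialising the data back to text only to parse it again, so it is likely the simpler option whenever the response is already a dict.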