2016-08-18

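Creating nested JSON from Pandas for an org chart

I am trying to create a nested JSON object from a hierarchical DataFrame (Python 3.5) to feed into JavaScript and render an org chart. I am essentially trying to build the structure found in the answer to this question: Organization chart - tree, online, dynamic, collapsible, pictures - in D3

An example DataFrame: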

import pandas as pd

df = pd.DataFrame({
    'Manager_Name': ['Mike', 'Jon', 'Susan', 'Susan', 'Joe'],
    'Manager_Title': ['Level1', 'Level2', 'Level3', 'Level3', 'Level4'],
    'Employee_Name': ['Jon', 'Susan', 'Josh', 'Joe', 'Jimmy'],
    'Employee_Title': ['Level2', 'Level3', 'Level4', 'Level4', 'Level5'],
})

The desired output would be:

"Name": "Mike" 
"Title": "Level1" 
"Employees": [{ 
     "Name": "Jon" 
     "Title": "Level2" 
     "Employees": [{ 
       "Name": "Susan" 
       "Title": "Level3" 
       "Employees": [{ 
       ... 
       ... 
       ... 
       }] 
     }] 
}] 

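I know this isn't a code-generation service, but I have tried other similar answers and can't seem to apply them here. I also haven't worked much with dictionaries (I'm more of an R person), so this question may be a bit naive. I have spent more time on this than I should have, but I'm sure someone here can do it in a few minutes.

Other questions:

Thanks in advance!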


What is the criterion for the root element? Is it Title == 'Level1'?
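
One way to pick the root without relying on the title text is to treat any manager who never appears in the employee column as a root. A minimal sketch, assuming the column names from the example frame; `is_root` and `roots` are illustrative names, not part of the question:

```python
import pandas as pd

df = pd.DataFrame({
    'Manager_Name': ['Mike', 'Jon', 'Susan', 'Susan', 'Joe'],
    'Manager_Title': ['Level1', 'Level2', 'Level3', 'Level3', 'Level4'],
    'Employee_Name': ['Jon', 'Susan', 'Josh', 'Joe', 'Jimmy'],
    'Employee_Title': ['Level2', 'Level3', 'Level4', 'Level4', 'Level5'],
})

# Assumption: a root is a manager who is nobody's employee
is_root = ~df['Manager_Name'].isin(df['Employee_Name'])
roots = df.loc[is_root, ['Manager_Name', 'Manager_Title']].drop_duplicates()
print(roots.to_dict(orient='records'))
```

On the example data this selects only Mike, which matches the Title == 'Level1' rule.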

Answer

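Consider filtering the DataFrame by level, then converting each filtered frame to dictionaries with pandas to_dict(), accumulating them into a list as you go. The function defined below builds up the per-level employee dictionaries step by step, from the last level to the first. But first you should concatenate the manager and employee name and title columns.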

import json
import pandas as pd

# Stack the manager and employee columns into one Name/Title table
cdf = pd.concat([
    df[['Manager_Name', 'Manager_Title']].rename(
        index=str, columns={'Manager_Name': 'Name', 'Manager_Title': 'Title'}),
    df[['Employee_Name', 'Employee_Title']].rename(
        index=str, columns={'Employee_Name': 'Name', 'Employee_Title': 'Title'}),
])

cdf = cdf.drop_duplicates().reset_index(drop=True)
print(cdf)
#     Name   Title
# 0   Mike  Level1
# 1    Jon  Level2
# 2  Susan  Level3
# 3    Joe  Level4
# 4   Josh  Level4
# 5  Jimmy  Level5

def jsondict():
    # inner[0] holds the records built so far for the levels below
    inner = ['']
    for i in ['Level5', 'Level4', 'Level3', 'Level2']:
        if i == 'Level5':
            inner[0] = cdf[cdf['Title'] == i].to_dict(orient='records')
        else:
            tmp = cdf[cdf['Title'] == i].copy().reset_index(drop=True)
            if len(tmp) == 1:
                tmp['Employees'] = [inner[0]]
            else:
                for d in range(0, len(tmp)):
                    # .ix was removed in pandas 1.0; this line needs an older pandas as written
                    tmp.ix[d, 'Employees'] = [inner[0]]
            lvltemp = tmp.to_dict(orient='records')
            inner[0] = lvltemp
    return inner

# Attach the nested records to the Level1 row and serialize
jsondf = cdf[cdf['Title'] == 'Level1'].copy()
jsondf['Employees'] = jsondict()
jsondata = jsondf.to_json(orient='records')

Output

[{"Name":"Mike","Title":"Level1","Employees": 
[{"Name":"Jon","Title":"Level2","Employees": 
[{"Name":"Susan","Title":"Level3","Employees": 
[{"Name":"Joe","Title":"Level4","Employees": 
[{"Name":"Jimmy","Title":"Level5"}]}, 
{"Name":"Josh","Title":"Level4","Employees": 
[[{"Name":"Jimmy","Title":"Level5"}]]}]}]}]}] 

Or pretty-printed

[
  {
    "Name": "Mike",
    "Title": "Level1",
    "Employees": [
      {
        "Name": "Jon",
        "Title": "Level2",
        "Employees": [
          {
            "Name": "Susan",
            "Title": "Level3",
            "Employees": [
              {
                "Name": "Joe",
                "Title": "Level4",
                "Employees": [
                  {
                    "Name": "Jimmy",
                    "Title": "Level5"
                  }
                ]
              },
              {
                "Name": "Josh",
                "Title": "Level4",
                "Employees": [
                  [
                    {
                      "Name": "Jimmy",
                      "Title": "Level5"
                    }
                  ]
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]
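
The pretty-printed form can be produced with the standard library by parsing the JSON string and dumping it again with an indent. A small sketch, using a short hand-written string as a stand-in for the jsondata value from the code above:

```python
import json

# Stand-in for the jsondata string; any valid JSON text works here
jsondata = '[{"Name":"Mike","Title":"Level1","Employees":[{"Name":"Jon","Title":"Level2"}]}]'

# Parse, then re-serialize with indentation for readability
pretty = json.dumps(json.loads(jsondata), indent=4)
print(pretty)
```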

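This function will only return one employee per manager. It is a good starting point, though. It needs some tweaking - I will give it a go later – mrp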


Did you run this on a larger dataset? – Parfait


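No, I ran this on the example dataset shown. The reason this happens is that you are only setting the first dictionary record: .to_dict(...)[0]. You can also see in the output that Josh (Level 4) is missing – mrp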
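
The limitation raised in these comments comes from grouping people by level rather than by who reports to whom. A minimal alternative sketch, not the answer's method: build a manager-to-employees lookup from the original df and recurse from the root. The names `titles`, `reports` and `build_node` are made up for illustration, and the sketch assumes that names are unique and that the hierarchy has no cycles.

```python
import json
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({
    'Manager_Name': ['Mike', 'Jon', 'Susan', 'Susan', 'Joe'],
    'Manager_Title': ['Level1', 'Level2', 'Level3', 'Level3', 'Level4'],
    'Employee_Name': ['Jon', 'Susan', 'Josh', 'Joe', 'Jimmy'],
    'Employee_Title': ['Level2', 'Level3', 'Level4', 'Level4', 'Level5'],
})

# Title lookup for every person, whether listed as manager or employee
titles = dict(zip(df['Manager_Name'], df['Manager_Title']))
titles.update(zip(df['Employee_Name'], df['Employee_Title']))

# Direct reports of each manager, in row order
reports = defaultdict(list)
for manager, employee in zip(df['Manager_Name'], df['Employee_Name']):
    reports[manager].append(employee)


def build_node(name):
    """Return the nested dict for one person and everyone below them."""
    node = {'Name': name, 'Title': titles[name]}
    children = reports.get(name, [])
    if children:
        node['Employees'] = [build_node(child) for child in children]
    return node


# Assumption: roots are managers who never appear as an employee
employee_names = set(df['Employee_Name'])
roots = [m for m in dict.fromkeys(df['Manager_Name']) if m not in employee_names]
tree = [build_node(root) for root in roots]
print(json.dumps(tree, indent=2))
```

With the example data this places both Josh and Joe under Susan, and Jimmy under Joe only. People with no reports get no "Employees" key at all.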