2017-10-05 81 views

Combine 2 pandas pivot tables

Pivot table A

                                      Balance  Deployed
Type      Environment OS     Model
SupplierA Network 1   Win 10 Model 1      1.0       4.0
                             Model 2      2.0       5.0
          Network 2   Win 10 Model 1      3.0       6.0
                      Win 7  Model 2      NaN       7.0

Pivot table B

                                      Balance  Deployed
Type      Environment OS     Model
SupplierA Network 3   Win 10 Model 1      NaN       8.0
                             Model 2      NaN       9.0
          Network 4   Win 10 Model 1      NaN      10.0
                      Win 7  Model 2      NaN      11.0
                      Win 7  Model 3      NaN      12.0

Desired result

                                      N3/4 Bal   Bal  N3/4 Deployed  Deployed
Type      Environment OS     Model
SupplierA Network 1   Win 10 Model 1       NaN   1.0            8.0       4.0
                             Model 2       NaN   2.0            9.0       5.0
          Network 2   Win 10 Model 1       NaN   3.0           10.0       6.0
                      Win 7  Model 2       NaN   NaN           11.0       7.0
                      Win 7  Model 3       NaN   NaN           12.0       7.0

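Network 3 and Network 4 are actually subsets of Network 1 and Network 2.

How can I merge the results of pivot table B into pivot table A to get the result above, using Python pandas?

Sample code: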

filter1 = df[df["Type"] != ""]
table1 = pd.pivot_table(
    filter1,
    index=["Type", "Env", "OperSys", "Model"],
    columns=["AssetLifecycleStatus"],
    values="Serial Number",
    aggfunc="count",
    margins=True,
    dropna=True,
)
# Keep only the Network 1 and Network 2 rows (index level 1)
table1 = table1.reindex(["Network 1", "Network 2"], level=1)
# reindex_axis was removed in pandas 1.0; reindex(columns=...) replaces it
table1 = table1.reindex(columns=["Balance", "Deployed"])
# Rename index levels: Env -> Environment, OperSys -> OS
table1.index = table1.index.set_names("Environment", level=1)
table1.index = table1.index.set_names("OS", level=2)

Answer

Since the pivot tables share the same MultiIndex, consider using .join:

mergedf = table1.join(table2, lsuffix='_', rsuffix='') 
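Note that the two tables in the question have the same index levels but different Environment labels (Network 3/4 versus Network 1/2), so a plain join would not line the rows up. The following is only a minimal sketch, assuming Network 3 should align with Network 1 and Network 4 with Network 2; that mapping is inferred from the desired result and is not stated by the asker. The names `table_a`, `table_b`, `parent`, and the `_N3/4` suffix are hypothetical, and the frames are rebuilt by hand from the tables shown in the question.

```python
import numpy as np
import pandas as pd

names = ["Type", "Environment", "OS", "Model"]

# Pivot table A, rebuilt by hand from the question
table_a = pd.DataFrame(
    {"Balance": [1.0, 2.0, 3.0, np.nan], "Deployed": [4.0, 5.0, 6.0, 7.0]},
    index=pd.MultiIndex.from_tuples(
        [
            ("SupplierA", "Network 1", "Win 10", "Model 1"),
            ("SupplierA", "Network 1", "Win 10", "Model 2"),
            ("SupplierA", "Network 2", "Win 10", "Model 1"),
            ("SupplierA", "Network 2", "Win 7", "Model 2"),
        ],
        names=names,
    ),
)

# Pivot table B, rebuilt by hand from the question
table_b = pd.DataFrame(
    {"Balance": [np.nan] * 5, "Deployed": [8.0, 9.0, 10.0, 11.0, 12.0]},
    index=pd.MultiIndex.from_tuples(
        [
            ("SupplierA", "Network 3", "Win 10", "Model 1"),
            ("SupplierA", "Network 3", "Win 10", "Model 2"),
            ("SupplierA", "Network 4", "Win 10", "Model 1"),
            ("SupplierA", "Network 4", "Win 7", "Model 2"),
            ("SupplierA", "Network 4", "Win 7", "Model 3"),
        ],
        names=names,
    ),
)

# ASSUMPTION: which sub-network belongs to which parent network
parent = {"Network 3": "Network 1", "Network 4": "Network 2"}

# Relabel only the Environment level so both tables use the same keys
table_b_aligned = table_b.rename(index=parent, level="Environment")

# Outer join keeps rows that exist in only one table (e.g. Win 7 / Model 3);
# rsuffix disambiguates the overlapping Balance/Deployed columns
merged = table_a.join(table_b_aligned, how="outer", rsuffix="_N3/4")
print(merged)
```

An outer join is used here because table B contains a row (Win 7, Model 3) that table A lacks; with the default left join that row would be dropped.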

To demonstrate with random data (seeded to reproduce the same values):

import numpy as np 
import pandas as pd 
import datetime as dt 
import time 

LETTERS = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ') 
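# Upper bound for the random dates; it moves with the clock, so DATE and YEAR can differ between runs even with the seed
epoch_time = int(time.time()) 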

dfs = [] 
for i in range(2): 
    np.random.seed(1000 + i) 
    tmp = pd.DataFrame({'ID': [np.random.randint(1,3) for _ in range(50)], 
         'GROUP': ["".join(np.random.choice(LETTERS[1:5],1)) for _ in range(50)],     
         'NUM': np.random.randn(50)*100, 
         'BOOL': [np.random.choice([True,False],1).item(0) for _ in range(50)], 
         'DATE': [dt.datetime.fromtimestamp(np.random.randint(1407165459,epoch_time)) for _ in range(50)]}) 

    tmp['YEAR'] = tmp['DATE'].dt.year 

    dfs.append(tmp.pivot_table(index=['GROUP', 'BOOL', 'ID'], 
       columns='YEAR', values='NUM', aggfunc='sum').fillna(0)) 

mdf = dfs[0].join(dfs[1], lsuffix='_', rsuffix='') 

Output

print(dfs[0])
YEAR                  2014        2015        2016        2017
GROUP BOOL  ID
B     False 1     0.000000 -109.692126  -87.286656    0.000000
            2     0.000000  -36.775578    0.000000    0.000000
      True  1    85.743275    0.000000   41.534745   80.202451
            2     0.000000    0.000000   65.034285    0.000000
C     False 1     0.000000  -93.696928  139.442275   24.999852
            2     0.000000   86.726082    0.000000    0.000000
      True  1    29.132019   64.424261    0.000000  108.224081
            2   145.431373 -111.116278    0.000000 -185.134785
D     False 1     0.000000  113.341723    0.000000    0.000000
            2     0.000000   98.740384  137.415560    0.000000
      True  1     0.000000  -63.477170  164.748952    0.000000
            2     0.000000 -457.161979    0.000000   -8.200619
E     False 1   146.204730  104.853072  196.485406 -143.713939
            2   269.964586    0.000000 -379.256574    0.000000
      True  1  -142.337059  -15.032559 -153.805456   17.793711
            2   -74.200575    0.000000    0.000000  -17.805445

print(dfs[1])
YEAR                  2014        2015        2016        2017
GROUP BOOL  ID
B     False 1     0.000000  141.036715    7.471521    0.000000
            2     0.000000  150.056514  -95.408040   50.899685
      True  1    68.030610   52.015261   31.100536 -228.762932
            2     0.000000   51.645003    0.000000    1.253947
C     False 1     8.943704    0.000000    0.000000   96.047025
            2     0.000000  207.756386    0.000000   94.866648
      True  1     0.000000   10.923869  -24.716529    0.000000
            2     0.000000    0.000000  -36.847404    9.879019
D     False 1     8.454257   91.386381  -89.693662   33.769267
            2     0.000000   72.420881  130.951512   85.189272
      True  1   116.647215  -73.226222  -65.496555    0.000000
            2     0.000000  155.958910   47.444020  -29.872307
E     False 1     0.000000    0.000000 -111.854429 -159.037171
            2     0.000000    0.000000    2.417443  -75.488531
      True  1   -86.983855    0.000000  102.603068  -51.821700
            2     0.000000  -65.017149    0.000000    7.690244

print(mdf)
YEAR                 2014_       2015_       2016_       2017_        2014        2015        2016        2017
GROUP BOOL  ID
B     False 1     0.000000 -109.692126  -87.286656    0.000000    0.000000  141.036715    7.471521    0.000000
            2     0.000000  -36.775578    0.000000    0.000000    0.000000  150.056514  -95.408040   50.899685
      True  1    85.743275    0.000000   41.534745   80.202451   68.030610   52.015261   31.100536 -228.762932
            2     0.000000    0.000000   65.034285    0.000000    0.000000   51.645003    0.000000    1.253947
C     False 1     0.000000  -93.696928  139.442275   24.999852    8.943704    0.000000    0.000000   96.047025
            2     0.000000   86.726082    0.000000    0.000000    0.000000  207.756386    0.000000   94.866648
      True  1    29.132019   64.424261    0.000000  108.224081    0.000000   10.923869  -24.716529    0.000000
            2   145.431373 -111.116278    0.000000 -185.134785    0.000000    0.000000  -36.847404    9.879019
D     False 1     0.000000  113.341723    0.000000    0.000000    8.454257   91.386381  -89.693662   33.769267
            2     0.000000   98.740384  137.415560    0.000000    0.000000   72.420881  130.951512   85.189272
      True  1     0.000000  -63.477170  164.748952    0.000000  116.647215  -73.226222  -65.496555    0.000000
            2     0.000000 -457.161979    0.000000   -8.200619    0.000000  155.958910   47.444020  -29.872307
E     False 1   146.204730  104.853072  196.485406 -143.713939    0.000000    0.000000 -111.854429 -159.037171
            2   269.964586    0.000000 -379.256574    0.000000    0.000000    0.000000    2.417443  -75.488531
      True  1  -142.337059  -15.032559 -153.805456   17.793711  -86.983855    0.000000  102.603068  -51.821700
            2   -74.200575    0.000000    0.000000  -17.805445    0.000000  -65.017149    0.000000    7.690244
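
If suffixed column names such as 2014_ are awkward to work with, one alternative to .join (a different technique, not the one used above) is pd.concat along the columns with keys=, which places each source table under its own top-level column label. This is only a sketch on small made-up data; the names `first`, `second`, and `combined` and all values are hypothetical.

```python
import pandas as pd

# Two small frames sharing the same MultiIndex (made-up values)
idx = pd.MultiIndex.from_product(
    [["B", "C"], [False, True]], names=["GROUP", "BOOL"]
)
first = pd.DataFrame(
    {2016: [1.0, 2.0, 3.0, 4.0], 2017: [5.0, 6.0, 7.0, 8.0]}, index=idx
)
second = pd.DataFrame(
    {2016: [10.0, 20.0, 30.0, 40.0], 2017: [50.0, 60.0, 70.0, 80.0]}, index=idx
)

# keys= adds an outer column level, so the year labels stay unchanged
combined = pd.concat([first, second], axis=1, keys=["first", "second"])
print(combined)

# Select one source table back out, or a single (source, year) column
print(combined["second"])
print(combined[("first", 2017)])
```

The trade-off is that the result has MultiIndex columns, which some downstream code (for example a flat CSV export) may need flattened again.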