2017-10-17 102 views
0

我产生了熊猫透视表看起来像这样一个数据透视表的列:熊猫合并

                Encounters 
                Code 132 133 145 
    Record Number Start_date End_date Service_Date 
    2322   1/1/2017 1/3/2017 1/1/2017    0  1 1 
             1/2/2017    1  0 0 
             1/3/2017    0  1 1 

我想合并,总结基于代码

期望中的一些数据透视表列输出:

               Encounters 
                Code 132 133-145 
    Record Number Start_date End_date Service_Date 
    2322   1/1/2017 1/3/2017 1/1/2017    0  2  
             1/2/2017    1  0  
             1/3/2017    0  2  
+0

你想如何处理非连续的相遇范围? – bear

回答

1

数据透视表创建分层列(即多个级别)。因此,考虑使用不同级别的元组分配分配一个新的,总和柱:

数据(接种有枢轴数据)

import numpy as np 
import pandas as pd 
import datetime as dt 
import time 

LETTERS = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')  
epoch_time = int(time.time()) 

np.random.seed(555) 
df = pd.DataFrame({'ID': [np.random.randint(15) for _ in range(50)], 
        'GROUP': ["".join(np.random.choice(LETTERS[0:3],1)) for _ in range(50)], 
        'NUM': np.random.uniform(50)/100, 
        'DATE': [dt.datetime.fromtimestamp(np.random.randint(low=1400270738, 
          high=epoch_time)) for _ in range(50)]}) 

df['YEAR'] = df['DATE'].dt.year 
pvtdf = df.pivot_table(index = ['ID'], columns = ['YEAR', 'GROUP'], values = ['NUM']).fillna(0) 

print(pvtdf) 
#    NUM                            
# YEAR  2014       2015       2016       2017      
# GROUP   A   B   C   A   B   C   A   B   C   A   B   C 
# ID                               
# 0  0.000000 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 0.411258 
# 1  0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 0.411258 0.411258 
# 3  0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 0.411258 0.411258 0.000000 0.411258 0.000000 
# 4  0.411258 0.411258 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 
# 5  0.411258 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 
# 6  0.000000 0.411258 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 
# 7  0.000000 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 
# 8  0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 
# 9  0.000000 0.000000 0.411258 0.411258 0.000000 0.411258 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 
# 10  0.000000 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 
# 11  0.000000 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 
# 12  0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 
# 13  0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.411258 0.000000 0.411258 0.000000 0.411258 0.000000 
# 14  0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 0.000000 

df[('Encounters', '133-145')] = df[('Encounters', '133')] + df[('Encounters', '145')] 

del df[('Encounters', '133')] 
del df[('Encounters', '145')] 

df.sortlevel(0, axis=1, inplace=True) 

为了用随机数据表明

工艺(全部2017年A,B,C列加入D然后取出)

pvtdf[('NUM', 2017, 'D')] = pvtdf[('NUM', 2017, 'A')] + pvtdf[('NUM', 2017, 'B')] + pvtdf[('NUM', 2017, 'C')] 

pvtdf = pvtdf.drop([('NUM', 2017, 'A'), ('NUM', 2017, 'B'), ('NUM', 2017, 'C')], axis=1)  
pvtdf.sortlevel(0, axis=1, inplace=True) 

print(pvtdf)  
#    NUM                       
# YEAR  2014       2015       2016       2017 
# GROUP   A   B   C   A   B   C   A   B   C   D 
# ID                          
# 0  0.000000 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.822515 
# 1  0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.411258 0.411258 0.000000 0.822515 
# 3  0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 0.411258 0.411258 0.411258 
# 4  0.411258 0.411258 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 0.411258 0.411258 
# 5  0.411258 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 
# 6  0.000000 0.411258 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 0.000000 0.000000 
# 7  0.000000 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 
# 8  0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258 
# 9  0.000000 0.000000 0.411258 0.411258 0.000000 0.411258 0.411258 0.000000 0.000000 0.000000 
# 10  0.000000 0.000000 0.000000 0.411258 0.411258 0.000000 0.000000 0.411258 0.000000 0.000000 
# 11  0.000000 0.000000 0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.411258 0.000000 
# 12  0.411258 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 
# 13  0.000000 0.411258 0.000000 0.000000 0.000000 0.000000 0.411258 0.000000 0.411258 0.411258 
# 14  0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.411258