2016-12-06 83 views

Create new DataFrames and rename them after running func(df)

Can I keep the names df10 and df20 the same and use them separately after running func(df), or even rename them?

import pandas as pd

df = pd.DataFrame({
    'A': ['d','d','d','d','d','d','g','g','g','g','g','g','k','k','k','k','k','k'],
    'B': [5,5,6,4,5,6,-6,7,7,6,-7,7,-8,7,-6,6,-7,50],
    'C': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2],
    'S': [2012,2013,2014,2015,2016,2012,2012,2014,2015,2016,2012,2013,2012,2013,2014,2015,2016,2014]
    })

df10 = (df.B + df.C).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0) 

df20 = (df['B'] - df['C']).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0) 

def func(df): 
    df1 = df.groupby(level=0, axis=1).sum() 
    new_cols= list(zip(df1.columns.get_level_values(0),['total'] * len(df.columns))) 
    df1.columns = pd.MultiIndex.from_tuples(new_cols) 
    df2 = pd.concat([df1,df], axis=1).sort_index(axis=1).sort_index(axis=1, level=1) 
    df2.columns = ['_'.join((col[0], str(col[1]))) for col in df2.columns] 
    df2.columns = df2.columns.str.replace('sum_','') 
    df2.columns = df2.columns.str.replace('size_','T') 
    return df2 

dfs = []
# use a separate loop variable so the original df is not overwritten
for frame in [df10, df20]:
    dfs.append(func(frame))

dfs 

Answer


You can store the results in a dict, using a list of new names as the keys for the DataFrames in dfs:

names = ['a','b'] 
dfs = {names[i]:func(df) for i,df in enumerate([df10, df20])} 
print (dfs) 
{'a':    T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A
d      2    13      1     6      1     7      1     5      1     6       6
g      2   -11      1     8      1     8      1     8      1     7       6
k      1    -6      1     9      2    48      1     8      1    -5       6

   total
A
d     37
g     20
k     54  , 'b':    T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A
d      2     9      1     4      1     5      1     3      1     4       6
g      2   -15      1     6      1     6      1     6      1     5       6
k      1   -10      1     5      2    40      1     4      1    -9       6

   total
A
d     25
g      8
k     30  }
print (dfs['b']) 
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A
d      2     9      1     4      1     5      1     3      1     4       6
g      2   -15      1     6      1     6      1     6      1     5       6
k      1   -10      1     5      2    40      1     4      1    -9       6

   total
A
d     25
g      8
k     30
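
The dict approach above can also be written with zip instead of enumerate. This is only a minimal sketch: add_total is a hypothetical stand-in for func, and the two small input frames are made-up sample data, not the df10 and df20 from the question.

```python
import pandas as pd

def add_total(frame):
    # Hypothetical stand-in for func: returns a new DataFrame
    # with an extra row-wise 'total' column.
    out = frame.copy()
    out['total'] = frame.sum(axis=1)
    return out

# Made-up sample inputs (assumption, not the question's data)
df10 = pd.DataFrame({2012: [13, -11], 2013: [6, 8]}, index=['d', 'g'])
df20 = pd.DataFrame({2012: [9, -15], 2013: [4, 6]}, index=['d', 'g'])

names = ['a', 'b']
# Pair each new name with the transformed DataFrame
dfs = {name: add_total(frame) for name, frame in zip(names, [df10, df20])}

# Each result is then available under its new name
for name, frame in dfs.items():
    print(name, frame.shape)
```

Keeping the results in a dict means the number of frames can change without adding new variables, and dfs['a'] or dfs['b'] picks out a single result.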

But if you need the DataFrames to keep the same names, you can assign the output of func back to the same variables:

df10 = func(df10) 
df20 = func(df20) 
print (df10) 
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A
d      2    13      1     6      1     7      1     5      1     6       6
g      2   -11      1     8      1     8      1     8      1     7       6
k      1    -6      1     9      2    48      1     8      1    -5       6

   total
A
d     37
g     20
k     54
print (df20) 
   T2012  2012  T2013  2013  T2014  2014  T2015  2015  T2016  2016  Ttotal  \
A
d      2     9      1     4      1     5      1     3      1     4       6
g      2   -15      1     6      1     6      1     6      1     5       6
k      1   -10      1     5      2    40      1     4      1    -9       6

   total
A
d     25
g      8
k     30

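You can also assign the results to new variables (apply func to the original df10 and df20, not to the already transformed ones):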

dfa = func(df10) 
dfb = func(df20) 
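
To illustrate what the assignment does, here is a small sketch under the same assumptions as before: add_total is a hypothetical stand-in for func and the input frame is made-up sample data. Assigning the result to a name only rebinds that name to the returned object, which is an ordinary DataFrame.

```python
import pandas as pd

def add_total(frame):
    # Hypothetical stand-in for func: builds and returns a new DataFrame
    out = frame.copy()
    out['total'] = frame.sum(axis=1)
    return out

# Made-up sample input (assumption, not the question's data)
df10 = pd.DataFrame({2012: [13, -11], 2013: [6, 8]}, index=['d', 'g'])
original = df10                 # second reference to the input object

df10 = add_total(df10)          # rebind the name df10 to the result

print(type(df10))               # the result is a regular pandas DataFrame
print(list(original.columns))   # the input object itself was not modified
print(list(df10.columns))       # the rebound name sees the new column
```

If the untransformed data is still needed later, keep a separate reference to it (as with original here) or assign the result to a new name instead.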

Nice, thanks. They are regular DataFrames after renaming, right? – Zanshin


Yes, exactly. Normal DataFrames. – jezrael