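I have a dataframe my_df, and I want to create a new dataframe new_df. Each column of new_df is created by grouping on my_id and then taking the max of the corresponding column in my_df.
Pandas: an elegant way to do groupby + aggregate on a multi-column dataframe?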
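Below is my code, which works fine. However, I wonder whether there is a better way, especially since in the future I will be dealing with hundreds of columns instead of just 6. Thanks a lot!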
tmp_df1 = my_df.groupby(['my_id'], as_index=False).col_A.agg({"max_A": "max"})
tmp_df2 = my_df.groupby(['my_id'], as_index=False).col_B.agg({"max_B": "max"})
tmp_df3 = my_df.groupby(['my_id'], as_index=False).col_C.agg({"max_C": "max"})
tmp_df4 = my_df.groupby(['my_id'], as_index=False).col_D.agg({"max_D": "max"})
tmp_df5 = my_df.groupby(['my_id'], as_index=False).col_E.agg({"max_E": "max"})
tmp_df6 = my_df.groupby(['my_id'], as_index=False).col_F.agg({"max_F": "max"})
combine_df1 = pd.merge(tmp_df1,tmp_df2,how="inner",on=['my_id'])
combine_df2 = pd.merge(combine_df1,tmp_df3,how="inner",on=['my_id'])
combine_df3 = pd.merge(combine_df2,tmp_df4,how="inner",on=['my_id'])
combine_df4 = pd.merge(combine_df3,tmp_df5,how="inner",on=['my_id'])
new_df = pd.merge(combine_df4,tmp_df6,how="inner",on=['my_id'])
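For comparison, the per-column groupby and merge chain above can usually be collapsed into one groupby that aggregates every non-key column at once. This is only a minimal sketch: the sample data is hypothetical (three value columns instead of six), and it assumes every column other than my_id should be reduced with max. Note that the result keeps the original column names (col_A, col_B, ...) rather than max_A, max_B, ....

```python
import pandas as pd

# Hypothetical sample data standing in for my_df (assumption: not from the original post).
my_df = pd.DataFrame({
    "my_id": [1, 1, 2, 2],
    "col_A": [1, 5, 3, 2],
    "col_B": [7, 4, 0, 9],
    "col_C": [2, 2, 8, 6],
})

# A single groupby takes the max of every non-key column per my_id,
# so no temporary frames or merges are needed.
new_df = my_df.groupby("my_id", as_index=False).max()
print(new_df)
```

Because the aggregation runs over all remaining columns, the same line should work unchanged whether there are 6 value columns or several hundred.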
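Is it possible to give these new_df columns new names in the process? That is, A_max instead of col_A, B_max instead of col_B, and so on? I am trying to avoid manually renaming each column again afterwards... Thanks! – Edamame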
@Edamame I have updated my post. – piRSquared
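The updated post referred to here is not reproduced in this thread, so the following is not that answer. It is one possible sketch of renaming during aggregation, assuming pandas 0.25 or later (for named aggregation) and assuming the value columns all follow the col_X pattern. The sample data and the helper names value_cols and named are hypothetical.

```python
import pandas as pd

# Hypothetical sample data standing in for my_df (assumption: not from the original post).
my_df = pd.DataFrame({
    "my_id": [1, 1, 2, 2],
    "col_A": [1, 5, 3, 2],
    "col_B": [7, 4, 0, 9],
    "col_C": [2, 2, 8, 6],
})

# Every column except the grouping key gets aggregated.
value_cols = [c for c in my_df.columns if c != "my_id"]

# Build the named-aggregation spec: output name -> (input column, aggregation).
# "col_A" becomes "A_max", "col_B" becomes "B_max", and so on.
named = {f"{c.split('_', 1)[1]}_max": (c, "max") for c in value_cols}

# Aggregate and rename in one step.
new_df = my_df.groupby("my_id", as_index=False).agg(**named)
print(new_df)
```

Since the spec is built from the column list, nothing has to be renamed by hand as the number of columns grows.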