2016-08-23 100 views

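pandas DataFrame pivot table and grouping

I have a DataFrame that I turned into a pivot table, and now I want to order the pivot table so that rows sharing a common value in a particular column line up with each other. For example, for the DataFrame below, all matching countries should end up on the same row: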

import pandas as pd

data = {'dt': ['2016-08-22', '2016-08-21', '2016-08-22', '2016-08-21', '2016-08-21'],
        'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
        'number': [10, 21, 20, 10, 12]
        }

df = pd.DataFrame(data)
print(df)

  country          dt  number
0      uk  2016-08-22      10
1     usa  2016-08-21      21
2      fr  2016-08-22      20
3      fr  2016-08-21      10
4      uk  2016-08-21      12


# pivot table by dt:

# number the rows within each date, then use that counter as the row index
df['idx'] = df.groupby('dt')['dt'].cumcount()
df_pivot = df.set_index(['idx', 'dt']).stack().unstack([1, 2])
print(df_pivot)
dt  2016-08-22        2016-08-21
       country number    country number
idx
0           uk     10        usa     21
1           fr     20         fr     10
2          NaN    NaN         uk     12


# what I really want:

dt  2016-08-22        2016-08-21
       country number    country number

0           uk     10         uk     12
1           fr     20         fr     10
2          NaN    NaN        usa     21

Or even better:

           2016-08-22  2016-08-21
  country      number      number

0      uk          10          12
1      fr          20          10
2     usa         NaN          21

so that the uk values for 2016-08-22 and 2016-08-21 line up on the same row.

Answers


You can use:

df_pivot = df.set_index(['dt','country']).stack().unstack([0,2]).reset_index() 
print (df_pivot) 
dt country 2016-08-22 2016-08-21
               number     number
0       fr       20.0       10.0
1       uk       10.0       12.0
2      usa        NaN       21.0

# move the 'country' label from the first level of the column MultiIndex to the second
cols = list(df_pivot.columns)
df_pivot.columns = pd.MultiIndex.from_tuples([('', 'country')] + cols[1:])
print(df_pivot)
          2016-08-22 2016-08-21
  country     number     number
0      fr       20.0       10.0
1      uk       10.0       12.0
2     usa        NaN       21.0
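The stack/unstack chain can be hard to follow, so here is a minimal sketch of what appears to be an equivalent reshaping using `unstack` and `swaplevel` instead. This is a swapped-in variant, not the code from the answer; the name `wide` is only illustrative, and it assumes each (country, dt) pair occurs at most once.

```python
import pandas as pd

data = {
    'dt': ['2016-08-22', '2016-08-21', '2016-08-22', '2016-08-21', '2016-08-21'],
    'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
    'number': [10, 21, 20, 10, 12],
}
df = pd.DataFrame(data)

# Index by (country, dt), then move 'dt' from the rows into the columns.
# The columns become a two-level MultiIndex: ('number', <date>).
wide = df.set_index(['country', 'dt']).unstack('dt')

# Swap the two column levels so the date sits on top: (<date>, 'number').
wide = wide.swaplevel(0, 1, axis=1)

print(wide)
```

Countries missing for a date (here `usa` on 2016-08-22) come out as NaN, which is also why the numbers are shown as floats.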

Another, simpler solution is to use pivot:

df_pivot = df.pivot(index='country', columns='dt', values='number') 
print (df_pivot) 
dt  2016-08-21 2016-08-22 
country       
fr    10.0  20.0 
uk    12.0  10.0 
usa   21.0   NaN 
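As a self-contained sketch of the `pivot` approach, the snippet below also reorders the date columns newest first and turns `country` back into an ordinary column, which comes close to the "even better" layout in the question. The sorting and `reset_index` steps are additions for illustration, and the name `wide` is hypothetical.

```python
import pandas as pd

data = {
    'dt': ['2016-08-22', '2016-08-21', '2016-08-22', '2016-08-21', '2016-08-21'],
    'country': ['uk', 'usa', 'fr', 'fr', 'uk'],
    'number': [10, 21, 20, 10, 12],
}
df = pd.DataFrame(data)

# One row per country, one column per date; missing pairs become NaN.
wide = df.pivot(index='country', columns='dt', values='number')

# Put the newest date first, as in the layout asked for in the question.
wide = wide.sort_index(axis=1, ascending=False)

# Turn 'country' back into a regular column and drop the 'dt' columns label.
wide = wide.reset_index()
wide.columns.name = None

print(wide)
```

Note that `pivot` raises a `ValueError` if the same (country, dt) pair appears more than once; in that case `pivot_table` with an aggregation function is the usual alternative.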

Thanks for accepting! – jezrael