2016-04-20 97 views

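Selecting the top rows of a pandas DataFrame based on grouping

Related question: Reordering pandas dataframe based on multiple column and sum of one column

How can I use the sort column to select the top 2 countries in this dataframe? In this case, the top 2 countries would be Australia and Afghanistan: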

   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
2      Algeria  bus    827000.0    829351.0
3      Algeria  bus      2351.0    829351.0

EDIT:

I also want to keep the type column. In this case, the result should look like this:

   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0

Answer

UPDATE:

In [166]: df.loc[df.Country_FAO.isin(df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index)] 
Out[166]: 
   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
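
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question. As a small variation on the one-liner above, it selects the mean_area column before summing (my assumption about what is safest, not part of the original answer), so the string column type never enters the aggregation.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia", "Australia", "Australia",
                        "Afghanistan", "Afghanistan", "Algeria", "Algeria"],
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0, 18910501.0, 18910501.0,
                 2141000.0, 2141000.0, 829351.0, 829351.0],
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

# Total mean_area per country, then the labels of the two largest totals.
top_countries = df.groupby("Country_FAO")["mean_area"].sum().nlargest(2).index

# Keep every original row (all columns, including type) for those countries.
result = df.loc[df["Country_FAO"].isin(top_countries)]
print(result)
```

Because isin only builds a boolean mask, the rows keep their original order and all of their columns.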

I would do it this way:

In [153]: df.groupby('Country_FAO').sum() 
Out[153]: 
              mean_area
Country_FAO
Afghanistan   2141000.0
Algeria        829351.0
Australia    18910501.0

In [154]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area') 
Out[154]: 
              mean_area
Country_FAO
Australia    18910501.0
Afghanistan   2141000.0

In [155]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index 
Out[155]: Index(['Australia', 'Afghanistan'], dtype='object', name='Country_FAO') 
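
A note on the steps above: depending on the pandas version, a bare .sum() may also try to aggregate non-numeric columns such as type. A hedged sketch of one way around that is to pass numeric_only=True. The small DataFrame here is a cut-down copy of the question's data and is only for illustration.

```python
import pandas as pd

# Cut-down copy of the question's data (illustrative only).
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia", "Australia", "Afghanistan", "Algeria"],
        "type": ["car", "bus", "car", "bus"],
        "mean_area": [12141000.0, 293806.0, 2029000.0, 827000.0],
    }
)

# numeric_only=True restricts the aggregation to numeric columns,
# so the string column "type" is left out of the result.
totals = df.groupby("Country_FAO").sum(numeric_only=True)

# Two largest totals, ordered from largest to smallest.
top2 = totals.nlargest(2, "mean_area")
print(top2.index.tolist())
```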

Also, you may want to reset the index:

In [156]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').reset_index() 
Out[156]: 
   Country_FAO   mean_area
0    Australia  18910501.0
1  Afghanistan   2141000.0
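
Since the question asks about the sort column specifically: in the sample data, sort already holds each country's total, so the groupby step could be skipped. This is only a sketch under the assumption that sort is constant within each country, as it is in the table shown.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia", "Australia", "Australia",
                        "Afghanistan", "Afghanistan", "Algeria", "Algeria"],
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0, 18910501.0, 18910501.0,
                 2141000.0, 2141000.0, 829351.0, 829351.0],
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

# One row per country (assumes sort is identical within a country),
# then the two countries with the largest sort value.
top = df.drop_duplicates("Country_FAO").nlargest(2, "sort")["Country_FAO"]

# Filter the full frame so every column, including type, is kept.
result = df[df["Country_FAO"].isin(top)]
print(result)
```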

Thanks @MaxU, this solution drops the 'type' column. Is there a way to keep it? – user308827

@user308827, I have updated my answer, please check. – MaxU

Thanks @MaxU, this works! – user308827