2017-01-03 131 views
3

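How to select the row-wise maximum and minimum of selected columns

Let's say we have the following table [image], and I want to find the maximum and minimum of each row over a specific set of columns (say CENSUS2010POP, ESTIMATESBASE1010, POPESTIMATE2010). How can I do this with pandas?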

+1

To find the maximum value of a column, see http://stackoverflow.com/questions/15741759/find-maximum-value-of-a-column-and-return-the-corresponding-row-values-using-pan – Harsha

+0

@HarshaBiyani I know how to find the maximum within a column... but I need it per row, and only across a few of the columns in that row – YohanRoth

+1

Possible duplicate of [Python Pandas max value of selected columns](http://stackoverflow.com/questions/20033111/python-pandas-max-value-of-selected-columns) – RomanPerekhrest

Answers

3

I think you need min and max with axis=1, applied to a subset of the columns:

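# Index by county name, then keep only the columns of interest
df_subset = df.set_index('CTYNAME')[['CENSUS2010POP', 'ESTIMATESBASE1010', 'POPESTIMATE2010']]

# Row-wise minimum across the selected columns
df1 = df_subset.min(axis=1)
print(df1)

# Row-wise maximum across the selected columns
df2 = df_subset.max(axis=1)
print(df2)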
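
As a variation on the snippet above, the row-wise minimum and maximum can be attached to the original frame as new columns, which keeps CTYNAME as an ordinary column. This is only a minimal sketch: the two-row frame and its values are made up for illustration, and the names `cols`, `row_min` and `row_max` are hypothetical, not taken from the question.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
df = pd.DataFrame({
    'CTYNAME': ['Alabama', 'Autauga County'],
    'CENSUS2010POP': [4, 5],
    'ESTIMATESBASE1010': [7, 8],
    'POPESTIMATE2010': [1, 3],
})

# Columns to compare within each row (names as used in the question).
cols = ['CENSUS2010POP', 'ESTIMATESBASE1010', 'POPESTIMATE2010']

# axis=1 reduces across the columns, giving one value per row.
result = df.assign(
    row_min=df[cols].min(axis=1),
    row_max=df[cols].max(axis=1),
)
print(result[['CTYNAME', 'row_min', 'row_max']])
```

Because `assign` returns a new frame, the original `df` is left unchanged.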

EDIT:

import pandas as pd

df = pd.DataFrame({'CTYNAME': ['Alabama', 'Autauga County', 'Baldwin County', 'Barbour County'],
                   'CENSUS2010POP': [4, 5, 6, 2],
                   'ESTIMATESBASE1010': [7, 8, 9, 3],
                   'POPESTIMATE2010': [1, 3, 5, 5]})

print(df)
   CENSUS2010POP         CTYNAME  ESTIMATESBASE1010  POPESTIMATE2010
0              4         Alabama                  7                1
1              5  Autauga County                  8                3
2              6  Baldwin County                  9                5
3              2  Barbour County                  3                5

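# Keep CTYNAME as the index so it is carried through the calculation
df_subset = df.set_index('CTYNAME')[['CENSUS2010POP', 'ESTIMATESBASE1010', 'POPESTIMATE2010']]

# Difference between the row-wise maximum and minimum
df1 = df_subset.max(axis=1) - df_subset.min(axis=1)
print(df1)
CTYNAME
Alabama           6
Autauga County    5
Baldwin County    4
Barbour County    3
dtype: int64

# County with the largest difference
print(df1.nlargest(1).reset_index(name='top1'))
   CTYNAME  top1
0  Alabama     6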
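
If the goal is only the name of the county with the largest max-min difference, one possible alternative is to skip set_index and look the row up with idxmax. This is a sketch under the same toy data as above, not the answerer's method; `cols`, `spread` and `top_row` are hypothetical names introduced here.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({
    'CTYNAME': ['Alabama', 'Autauga County', 'Baldwin County', 'Barbour County'],
    'CENSUS2010POP': [4, 5, 6, 2],
    'ESTIMATESBASE1010': [7, 8, 9, 3],
    'POPESTIMATE2010': [1, 3, 5, 5],
})

cols = ['CENSUS2010POP', 'ESTIMATESBASE1010', 'POPESTIMATE2010']

# Per-row difference between the largest and smallest selected value.
spread = df[cols].max(axis=1) - df[cols].min(axis=1)

# idxmax returns the index label of the first row with the largest spread;
# df.loc then retrieves the whole row, including CTYNAME.
top_row = df.loc[spread.idxmax()]
print(top_row['CTYNAME'], spread.max())  # Alabama 6
```

Note that idxmax reports only the first match, so ties between counties would need separate handling.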
+0

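How can I do this while keeping the whole CTYNAME? I want to find the difference between the maximum and the minimum, and then select the CTYNAME with the largest difference... – YohanRoth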

+0

Also, how can I do this without creating a duplicate subset? – YohanRoth

+0

Do you mean the largest difference across 'CENSUS2010POP', 'ESTIMATESBASE1010', 'POPESTIMATE2010'? – jezrael