2017-10-18 95 views
1

Filtering pandas DataFrame values by a condition

Given the following pandas DataFrame:

import numpy as np
import pandas as pd

# Three index levels: A, B, and a numeric level called measure
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
          np.array([0.01, 0.2, 0.3, -0.5, 0.6, -0.7, -0.8, 0.9])]

tuples = list(zip(*arrays))
df_index = pd.MultiIndex.from_tuples(tuples, names=['A', 'B', 'measure'])

df = pd.DataFrame(np.random.randn(8, 4), index=df_index)
print(df)

How can I filter all rows where, for example, the measure level (which is part of the index) is greater than 0.2?

I have tried:

df.loc[:,:,0.1:0.9] 

(and other variations of this), but I get the error "IndexingError: Too many indexers".

Thanks, Gerard

Answers

5
In [3]: df.query("measure > 0.2") 
Out[3]: 
                        0         1         2         3
A   B   measure                                        
baz one 0.3      0.623507  0.602585 -0.792142  2.066095
foo one 0.6      0.138192 -0.159108 -1.796944  1.668463
qux two 0.9     -0.162210 -2.293951  0.602990  1.622783

In [6]: df.loc[pd.IndexSlice[:,:,0.200001:], :] 
Out[6]: 
                        0         1         2         3
A   B   measure                                        
baz one 0.3      0.623507  0.602585 -0.792142  2.066095
foo one 0.6      0.138192 -0.159108 -1.796944  1.668463
qux two 0.9     -0.162210 -2.293951  0.602990  1.622783
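
As a rough, self-contained sketch of the query approach: the frame below is rebuilt from the question's index, but the fixed seed and the variable name threshold are my own assumptions for illustration. It shows that query can refer to a named index level as if it were a column, and that the cutoff can be passed in from a local variable with the @ prefix.

```python
import numpy as np
import pandas as pd

# Rebuild the question's index; the seed is only here for repeatability.
index = pd.MultiIndex.from_arrays(
    [
        ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
        ["one", "two", "one", "two", "one", "two", "one", "two"],
        [0.01, 0.2, 0.3, -0.5, 0.6, -0.7, -0.8, 0.9],
    ],
    names=["A", "B", "measure"],
)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=index)

# query() resolves "measure" against the named index level.
by_query = df.query("measure > 0.2")

# Hypothetical variable name: a local value is referenced with "@".
threshold = 0.2
by_query_var = df.query("measure > @threshold")

# Inspect which measure values survived the filter.
kept = by_query.index.get_level_values("measure").tolist()
print(kept)
```

Because the comparison is strict, the row whose measure is exactly 0.2 is dropped, which is what the 0.200001 lower bound in the IndexSlice version above works around.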
4

Something like get_level_values:

df[df.index.get_level_values(2)>0.2] 
Out[35]: 
                        0         1         2         3
A   B   measure                                        
baz one 0.3     -0.235196  0.183122 -1.620810  0.912996
foo one 0.6     -1.456278 -1.144081 -0.872170  0.547008
qux two 0.9      0.942656 -0.435219 -0.161408 -0.451456
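
A slightly longer sketch of the same idea, in case it helps: the level is looked up by name instead of position, and two comparisons are combined to get a range, similar to the 0.1:0.9 slice the question was attempting. The integer data and the variable names measure, filtered, and in_range are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [
        ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
        ["one", "two", "one", "two", "one", "two", "one", "two"],
        [0.01, 0.2, 0.3, -0.5, 0.6, -0.7, -0.8, 0.9],
    ],
    names=["A", "B", "measure"],
)
# Deterministic placeholder data instead of random numbers.
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# Pull the level out once, by name, so the mask reads clearly.
measure = df.index.get_level_values("measure")

# Strictly greater than 0.2.
filtered = df[measure > 0.2]

# A closed range: combine two boolean masks with "&".
in_range = df[(measure >= 0.1) & (measure <= 0.9)]

print(filtered.index.get_level_values("measure").tolist())
print(in_range.index.get_level_values("measure").tolist())
```

The boolean mask does not depend on the index being sorted, so this works even if the rows have been shuffled.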
2

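This should do the trick (note that >= also keeps rows where measure is exactly 0.2; use > for a strict comparison):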

df.iloc[df.index.get_level_values(2) >= 0.2] 

Or, if you prefer:

df.iloc[df.index.get_level_values('measure') >= 0.2] 
0

To stay with your original approach, you can use IndexSlice:

df.sort_index().loc[pd.IndexSlice[:, :, 0.2:], :]
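
To illustrate why sort_index is in there, here is a small sketch under my own assumptions (placeholder integer data, rows reversed on purpose, and the names shuffled, raised, and measures are made up). As far as I know, slicing a MultiIndex level requires the index to be lexsorted, and pandas raises UnsortedIndexError otherwise. The row indexer is passed as a single IndexSlice tuple, with the column indexer as a separate second argument.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [
        ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
        ["one", "two", "one", "two", "one", "two", "one", "two"],
        [0.01, 0.2, 0.3, -0.5, 0.6, -0.7, -0.8, 0.9],
    ],
    names=["A", "B", "measure"],
)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)
idx = pd.IndexSlice

# Reverse the rows so the index is no longer lexsorted.
shuffled = df.iloc[::-1]

try:
    shuffled.loc[idx[:, :, 0.2:], :]
    raised = False
except pd.errors.UnsortedIndexError:
    # Slicing on a level needs a lexsorted index.
    raised = True

# Sorting first makes the same slice valid.
result = shuffled.sort_index().loc[idx[:, :, 0.2:], :]
measures = result.index.get_level_values("measure").tolist()
print(raised, measures)
```

Label slices include their endpoints, so 0.2: also keeps the row where measure equals 0.2. For a strictly-greater filter, a boolean mask or query is probably the simpler choice.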