2016-09-19 55 views

Inaccurate decimals in the index returned by a pandas DataFrame

I have a pandas DataFrame like this:

                     0         1         2         3         4         5  \
    event_at
    0.00      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
    0.01      0.975381  0.959061  0.979856  0.985625  0.986080  0.976601
    0.02      0.959103  0.932374  0.966486  0.976037  0.976791  0.961114
    0.03      0.946154  0.911362  0.955820  0.968362  0.969353  0.948785
    0.04      0.935378  0.894024  0.946924  0.961940  0.963129  0.938518
    0.05      0.926099  0.879201  0.939248  0.956385  0.957744  0.929672
    0.06      0.917608  0.865726  0.932212  0.951282  0.952796  0.921574
    ...
    0.96      0.072472  0.012264  0.117352  0.217737  0.228561  0.082670
    0.97      0.066553  0.010632  0.109468  0.207225  0.217870  0.076244
    0.98      0.060532  0.009069  0.101313  0.196119  0.206555  0.069677
    0.99      0.054657  0.007642  0.093212  0.184828  0.195031  0.063237
    1.00      0.019128  0.001314  0.039558  0.100442  0.108064  0.023328

I want to get all of the index values, but they come back with inaccurate decimals:

>>> df.index 
[0.0, 0.01, 0.02, 0.029999999999999999, 0.040000000000000001, 0.050000000000000003, 0.059999999999999998, 
... 
0.95999999999999996, 0.96999999999999997, 0.97999999999999998, 0.98999999999999999, 1.0] 


What I expect is something like:

    [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 
     ... 
     0.96, 0.97, 0.98, 0.99, 1.0] 

This floating-point issue gives me the following exception:

>>> df.loc[0.35].values 
Traceback (most recent call last): 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1395, in _has_valid_type 
    error() 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1390, in error 
    (key, self.obj._get_axis_name(axis))) 
KeyError: 'the label [0.35] is not in the [index]' 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "J:\Workspace\dataset_loader.py", line 171, in <module> 
    print(y_pred_cox_alldep.loc[0.35].values) 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1296, in __getitem__ 
    return self._getitem_axis(key, axis=0) 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1466, in _getitem_axis 
    self._has_valid_type(key, axis) 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1403, in _has_valid_type 
    error() 
    File "I:\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1390, in error 
    (key, self.obj._get_axis_name(axis))) 
KeyError: 'the label [0.35] is not in the [index]' 

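In general, indexing with floats, or testing them for equality, has this problem. It is easy to test integers against each other, but floats can only be tested for being "close". You might also want to look at a string index. – hpaulj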
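To illustrate the comment above, here is a minimal sketch in plain Python (not the asker's data) of why exact equality on floats is fragile while a closeness test is not. The variable names are only for illustration.

```python
import math

# 0.1 + 0.2 is stored as 0.30000000000000004, so exact equality fails.
computed = 0.1 + 0.2
exact_match = computed == 0.3

# A tolerance-based comparison treats the two values as the same number.
close_match = math.isclose(computed, 0.3, rel_tol=1e-9)

print(repr(computed), exact_match, close_match)
```

A label lookup on a float index is effectively an exact-equality test, which is why a key such as 0.35 can be missing even though the index prints a value that looks identical.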

Answer


You can do it this way (assuming we want to get the row with index 0.96, which is internally represented as 0.95999999999):

In [466]: df.index 
Out[466]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0], dtype='float64') 

In [467]: df.ix[df.index[np.abs(df.index - 0.96) < 1e-6]] 
Out[467]: 
             0        1        2        3        4       5
0.96 0.072472 0.012264 0.117352 0.217737 0.228561 0.08267
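The same tolerance-based selection can be sketched with a boolean mask and `.loc`, which also works on pandas versions where `.ix` is no longer available. The helper name `rows_near`, the column name, and the two-row sample frame are assumptions made for illustration; only the 1e-6 tolerance comes from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame; 0.96 is stored as 0.95999999999 as in the answer.
df = pd.DataFrame({"0": [0.072472, 0.066553]}, index=[0.95999999999, 0.97])

def rows_near(frame, label, tol=1e-6):
    """Return the rows whose float index lies within `tol` of `label`."""
    mask = np.isclose(frame.index.values, label, rtol=0.0, atol=tol)
    return frame.loc[mask]

row = rows_near(df, 0.96)
print(row)
```

Using an absolute tolerance only (`rtol=0.0`) keeps the behaviour close to the `np.abs(df.index - 0.96) < 1e-6` expression above. The result is always a DataFrame, which is empty when no label is close enough.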

Or, if you are able to change (round) the index:

In [430]: df.index = [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0] 

In [431]: df 
Out[431]: 
            0        1        2        3        4        5
0.00 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 
0.01 0.975381 0.959061 0.979856 0.985625 0.986080 0.976601 
0.02 0.959103 0.932374 0.966486 0.976037 0.976791 0.961114 
0.03 0.946154 0.911362 0.955820 0.968362 0.969353 0.948785 
0.04 0.935378 0.894024 0.946924 0.961940 0.963129 0.938518 
0.05 0.926099 0.879201 0.939248 0.956385 0.957744 0.929672 
0.06 0.917608 0.865726 0.932212 0.951282 0.952796 0.921574 
0.96 0.072472 0.012264 0.117352 0.217737 0.228561 0.082670 
0.97 0.066553 0.010632 0.109468 0.207225 0.217870 0.076244 
0.98 0.060532 0.009069 0.101313 0.196119 0.206555 0.069677 
0.99 0.054657 0.007642 0.093212 0.184828 0.195031 0.063237 
1.00 0.019128 0.001314 0.039558 0.100442 0.108064 0.023328 

In [432]: df.index 
Out[432]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.95999999999, 0.97, 0.98, 0.99, 1.0], dtype='float64') 

In [433]: df.ix[.96] 
... skipped ... 
KeyError: 0.96 

Now let's round the index:

In [434]: df.index = df.index.values.round(2) 

In [435]: df.index 
Out[435]: Float64Index([0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.96, 0.97, 0.98, 0.99, 1.0], dtype='float64') 

In [436]: df.ix[.96] 
Out[436]: 
0 0.072472 
1 0.012264 
2 0.117352 
3 0.217737 
4 0.228561 
5 0.082670 
Name: 0.96, dtype: float64 
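As a self-contained sketch of the rounding approach, the snippet below rebuilds a tiny frame (the column name `a` and the two rows are assumptions for illustration), rounds the index to two decimals, and looks the row up with `.loc`. It also checks that rounding has not merged two labels into one, a precaution the answer does not mention.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose first label is slightly off from 0.96.
df = pd.DataFrame({"a": [0.072472, 0.066553]}, index=[0.95999999999, 0.97])

# Round the index values to the precision the labels are meant to have.
df.index = np.round(df.index.values, 2)

# Rounding can collapse distinct labels, so confirm the index is still unique.
index_is_unique = df.index.is_unique

# The exact label lookup now succeeds.
value = df.loc[0.96, "a"]
print(index_is_unique, value)
```

Rounding is the simpler option when the labels are known to lie on a fixed grid such as 0.01 steps; the tolerance-based mask is safer when the index must be left unchanged.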

UPDATE: starting from Pandas 0.20.1, the .ix indexer is deprecated in favor of the more strict .iloc and .loc indexers. The label-based lookups above work the same way with .loc.