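It seems it only works together with the limit parameter; see the docs [In 47]:
Add a limit_direction keyword argument that works with limit to enable interpolate to fill NaN values forward, backward, or both (GH9218, GH10420, GH11115)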
import numpy as np
import pandas as pd

records = pd.DataFrame(
    {'name': {0: 'a', 1: 'a', 2: 'a', 3: 'a', 4: 'a', 5: 'a', 6: 'a', 7: 'a', 8: 'a', 9: 'a'},
     'days': {0: 0.0, 1: np.nan, 2: np.nan, 3: np.nan, 4: 4.0, 5: 5.0, 6: np.nan, 7: np.nan, 8: np.nan, 9: 9.0}},
    columns=['name', 'days'])
print(records)
  name  days
0    a   0.0
1    a   NaN
2    a   NaN
3    a   NaN
4    a   4.0
5    a   5.0
6    a   NaN
7    a   NaN
8    a   NaN
9    a   9.0
# by default, limit_direction='forward'
records['forw'] = records['days'].interpolate(method='linear',
                                              limit=1)
records['backw'] = records['days'].interpolate(method='linear',
                                               limit_direction='backward',
                                               limit=1)
records['both'] = records['days'].interpolate(method='linear',
                                              limit_direction='both',
                                              limit=1)
print(records)
  name  days  forw  backw  both
0    a   0.0   0.0    0.0   0.0
1    a   NaN   1.0    NaN   1.0
2    a   NaN   NaN    NaN   NaN
3    a   NaN   NaN    3.0   3.0
4    a   4.0   4.0    4.0   4.0
5    a   5.0   5.0    5.0   5.0
6    a   NaN   6.0    NaN   6.0
7    a   NaN   NaN    NaN   NaN
8    a   NaN   NaN    8.0   8.0
9    a   9.0   9.0    9.0   9.0
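Your example *(6 rows shown)* will not work (the values will stay the same as the last known value), because 'interpolate' needs to know the first valid value after the 'NaN's in order to measure the difference between the rows whose values are to be filled. So linear interpolation works best when you give it both a start point and an end point, so that it can smoothly fill the 'NaN' values it encounters in between.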
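A minimal sketch of the trailing-gap behaviour described in this comment. The series `s` is made up for illustration and is not the asker's data; it has one interior gap and one trailing gap with no valid value after it.

```python
import numpy as np
import pandas as pd

# Hypothetical series: an interior gap at index 1, a trailing gap at 3-4.
s = pd.Series([0.0, np.nan, 2.0, np.nan, np.nan])

# Default linear interpolation (limit_direction='forward').
filled = s.interpolate(method='linear')

# The interior gap is interpolated between 0.0 and 2.0; the trailing gap
# has no end point, so it simply repeats the last known value.
print(filled.tolist())
```

With recent pandas versions this should print a list in which index 1 is 1.0 and the last two entries repeat 2.0, which is the "stays the same as the last known value" effect.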
The current version of pandas (0.22) seems to do that with 'limit_direction='both''. However, the leading and trailing 'NaN' values are now **filled**.
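If filling the leading and trailing 'NaN' values is not wanted, one possible approach is the limit_area keyword, which as far as I know was added in pandas 0.23 (so it would not be available in 0.22). The sketch below uses a made-up series `s` and assumes a pandas version that supports limit_area.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a leading, an interior and a trailing gap.
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# limit_direction='both' fills every gap; the edge NaNs take the
# nearest valid value because there is nothing to interpolate towards.
both = s.interpolate(method='linear', limit_direction='both')

# limit_area='inside' restricts filling to NaNs that are surrounded by
# valid values, so the leading and trailing NaNs are left untouched.
inside = s.interpolate(method='linear', limit_direction='both',
                       limit_area='inside')

print(both.tolist())
print(inside.tolist())
```

Here `both` has no missing values left, while `inside` keeps 'NaN' at the first and last positions and only fills the middle one.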