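Replace outliers with NaN in a pandas dataframe after applying .groupby()
I want to remove outliers from a pandas dataframe, using the standard deviation of each column variable after applying the groupby function.
Here is my dataframe: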
ARI Flesch Kincaid Speaker Score
0 -2.090000 121.220000 -3.400000 NaN NaN
1 8.276460 64.478573 9.034156 William Dudley 1.670275
2 19.570911 27.362067 17.253580 Janet Yellen -0.604757
3 -2.090000 121.220000 -3.400000 NaN NaN
4 -2.090000 121.220000 -3.400000 NaN NaN
5 20.643483 17.069411 18.394178 Lael Brainard 0.215396
6 -2.090000 121.220000 -3.400000 NaN NaN
7 -2.090000 121.220000 -3.400000 NaN NaN
8 12.624198 52.220468 11.403157 Jerome H. Powell -1.350798
9 18.466305 35.186261 16.205693 Stanley Fischer 0.522121
10 -2.090000 121.220000 -3.400000 NaN NaN
11 16.953460 36.246573 15.323457 Lael Brainard -0.217779
12 -2.090000 121.220000 -3.400000 NaN NaN
13 -2.090000 121.220000 -3.400000 NaN NaN
14 17.066088 32.592551 16.108486 Stanley Fischer 0.642245
15 -2.090000 121.220000 -3.400000 NaN NaN
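I want to first group the dataframe by "Speaker", and then remove the "ARI", "Flesch" and "Kincaid" values that are outliers, defined as values more than 3 standard deviations away from the mean of that specific feature.
Please let me know if this is possible. Thanks!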
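Could you post a snippet of your data instead of attaching an image? That makes it easier for people to reproduce. – titipata
Is that better? Thanks! –
Perfect, thanks Graham. Someone will solve it soon :) – titipata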
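One possible way to do what the question describes is sketched below: compute each row's group mean and standard deviation with `groupby(...).transform`, then blank out values that are too far from the mean with `DataFrame.mask`. The helper name `mask_group_outliers`, its parameters, and the demo data are made up for illustration; only the column names come from the question. This is a rough sketch, not a tested solution for the real data.

```python
import numpy as np
import pandas as pd


def mask_group_outliers(df, group_col, cols, n_std=3.0):
    """Return a copy of df where values in `cols` lying more than `n_std`
    group standard deviations from their group mean are replaced by NaN.
    (Hypothetical helper, written for this illustration.)"""
    out = df.copy()
    grouped = out.groupby(group_col)[cols]
    # transform() broadcasts the per-group statistic back onto each row
    mean = grouped.transform("mean")
    std = grouped.transform("std")  # sample standard deviation (ddof=1)
    z = (out[cols] - mean) / std
    # Comparisons with NaN are False, so rows with a missing group key,
    # single-row groups and zero-variance groups are left untouched
    out[cols] = out[cols].mask(z.abs() > n_std)
    return out


# Made-up demo data that reuses the question's column names
demo = pd.DataFrame({
    "Speaker": ["A"] * 12 + ["B"] * 3 + [np.nan],
    "ARI": [10.0] * 11 + [100.0] + [1.0, 2.0, 3.0] + [-2.09],
    "Flesch": [50.0] * 12 + [40.0, 45.0, 50.0] + [121.22],
    "Kincaid": [9.0] * 11 + [-80.0] + [5.0, 6.0, 7.0] + [-3.4],
})

cleaned = mask_group_outliers(demo, "Speaker", ["ARI", "Flesch", "Kincaid"])
print(cleaned)
```

One caveat: with the sample standard deviation, a value in a group of n rows can be at most (n - 1) / sqrt(n) standard deviations from the group mean, so a speaker with fewer than 11 rows can never exceed the 3-standard-deviation threshold. With only a handful of speeches per speaker, a lower `n_std` or a different outlier rule may be worth considering.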