2017-06-01 103 views
0

How can I do a recursive calculation on a column of a data frame in R? For example, a recursive mean for this data frame:

X1 <- runif(50, 0, 1)
X2 <- runif(50, 0, 10)

df <- data.frame(X1, X2)

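For column 2, the mean should first be computed over row 1 only, then over rows 1-2, then over rows 1-3, and so on. The only function I have found is rapply, which works only on lists.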

Answers

1

Here are two possible options:

# Generic: not very efficient, but it works with any function, not only mean
df$recursiveMean <- sapply(seq_len(nrow(df)), function(i) mean(df$X2[1:i]))
# Very efficient, but it only computes the mean
df$recursiveMean <- cumsum(df$X2) / seq_len(nrow(df))
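
The first option generalizes to any summary function. A minimal sketch of that idea follows; the helper name `running_apply` and the small vector `x` are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper: apply `f` to the first i elements of `x`, for every i.
# vapply() is used instead of sapply() so the result is always a numeric vector.
running_apply <- function(x, f, ...) {
  vapply(seq_along(x), function(i) f(x[seq_len(i)], ...), numeric(1))
}

x <- c(2, 4, 6, 8)
running_mean   <- running_apply(x, mean)    # 2 3 4 5
running_median <- running_apply(x, median)  # 2 3 4 5
running_max    <- running_apply(x, max)     # 2 4 6 8

# The cumsum() shortcut gives the same running mean.
all.equal(running_mean, cumsum(x) / seq_along(x))
```

This recomputes each prefix from scratch, so the work grows quadratically with the number of rows. For the mean, sum, max, or min, the built-in cumulative functions (`cumsum`, `cummax`, `cummin`) are likely the better choice on long columns.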
+0

The first option is great, since I also want to use other functions. Thanks! – Julia

+0

@Julia If this is the solution you need, please accept it as the answer. – www

0

The answer is the column CumMean:

df$CumSum <- cumsum(df$X2) 
df$CumMean <- df$CumSum/1:nrow(df) 
0

I think this gets you what you want with dplyr.

library(dplyr)

X1 <- runif(50, 0, 1)
X2 <- runif(50, 0, 10)

# tibble() replaces the deprecated data_frame()
df <- tibble(X1, X2)

df %>%
  mutate(
    # row-wise sum of X1 and X2
    row_sum = X1 + X2,
    # cumulative sum of the row sums
    cum_sum = cumsum(row_sum),
    # the number of rows included in the recursive mean, times
    # the number of original columns included (X1 and X2)
    denom = seq_along(X1) * 2,
    # the final recursive mean
    rec_mean = cum_sum / denom
  )

#> # A tibble: 50 x 6
#>            X1       X2  row_sum   cum_sum denom  rec_mean
#>         <dbl>    <dbl>    <dbl>     <dbl> <dbl>     <dbl>
#>  1 0.74556627 1.218812 1.964378  1.964378     2 0.9821891
#>  2 0.52028772 2.118244 2.638532  4.602910     4 1.1507275
#>  3 0.82827969 5.964946 6.793226 11.396136     6 1.8993560
#>  4 0.23165987 7.785801 8.017461 19.413597     8 2.4266997
#>  5 0.94498383 4.913119 5.858103 25.271700    10 2.5271700
#>  6 0.97138884 2.455740 3.427128 28.698828    12 2.3915690
#>  7 0.65725366 6.619548 7.276801 35.975630    14 2.5696878
#>  8 0.58452486 8.555269 9.139794 45.115424    16 2.8197140
#>  9 0.06390008 1.881750 1.945650 47.061074    18 2.6145041
#> 10 0.95357395 7.336068 8.289642 55.350716    20 2.7675358
#> # ... with 40 more rows
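
Note that the pipeline above averages X1 and X2 together. If only the running mean of X2 is wanted, as the question describes, a minimal sketch could look like this. The column name `rec_mean_X2` and the fixed seed are assumptions added for illustration only.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the example is reproducible
df <- tibble(X1 = runif(50, 0, 1), X2 = runif(50, 0, 10))

res <- df %>%
  # cumulative sum of X2 divided by the number of rows seen so far
  mutate(rec_mean_X2 = cumsum(X2) / row_number())

head(res, 3)
```

Here `row_number()` is used as the denominator in place of `seq_along(X1)`, which should give the same 1, 2, 3, ... sequence without referring to another column.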