Compute the gap to the minimum of similar cases (observations) in R

I have a dataset describing the results of applying 3 algorithms to several cases. For each combination of algorithm and case, there is one result.

df = data.frame(
    c("case1", "case1", "case1", "case2", "case2", "case2"), 
    c("algo1", "algo2", "algo3", "algo1", "algo2", "algo3"), 
    c(10, 11, 12, 22, 23, 20) 
); 
names(df) <- c("case", "algorithm", "result"); 
df

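The algorithms aim to minimize the result value. So for each algorithm and case, I want to compute the gap to the lowest result achieved by any algorithm for the same case.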

gap <- function(caseId, result) { 
    filtered = subset(df, case==caseId) 
    return (result - min(filtered[,'result'])); 
}

When I apply the function manually, I get the expected results.

gap("case1", 10) # prints 0, since 10 is the best value for case1 
gap("case1", 11) # prints 1, since 11-10=1 
gap("case1", 12) # prints 2, since 12-10=2

gap("case2", 22) # prints 2, since 22-20=2 
gap("case2", 23) # prints 3, since 23-20=3 
gap("case2", 20) # prints 0, since 20 is the best value for case2

However, when I try to compute a new column for the whole dataset, I get wrong results for case2.

df$gap <- gap(df$case, df$result) 
df

This yields

   case algorithm result gap
1 case1     algo1     10   0
2 case1     algo2     11   1
3 case1     algo3     12   2
4 case2     algo1     22  12
5 case2     algo2     23  13
6 case2     algo3     20  10

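It seems that the gap function now works with the overall minimum of the whole data frame, whereas it should only consider rows with the same case. Maybe the subset filtering in the gap function doesn't work properly?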
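A likely explanation, sketched here rather than taken from the answers: when `gap` receives the whole column, `case == caseId` compares `df$case` with itself element by element, so every row passes the filter and `min()` runs over the entire data frame. One possible workaround, assuming the original `gap` function is kept as it is, is to call it once per row with `mapply`. The `stringsAsFactors = FALSE` argument is an addition so that `case` is a character column in any R version.

```r
# Same data as in the question; case is kept as character (an assumption).
df <- data.frame(
    case = c("case1", "case1", "case1", "case2", "case2", "case2"),
    algorithm = c("algo1", "algo2", "algo3", "algo1", "algo2", "algo3"),
    result = c(10, 11, 12, 22, 23, 20),
    stringsAsFactors = FALSE
)

gap <- function(caseId, result) {
    filtered <- subset(df, case == caseId)
    result - min(filtered[, "result"])
}

# With a vector argument the filter keeps every row of df.
n_kept <- nrow(subset(df, case == df$case))
n_kept

# Calling gap() once per row keeps caseId a single value.
df$gap <- mapply(gap, df$case, df$result, USE.NAMES = FALSE)
df
```

This keeps the function unchanged but filters the data frame once per row, so a grouped computation is usually the better fit for larger data.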

Source

2017-07-26 George Blackburn

We can use dplyr to subtract the per-group minimum:

library(dplyr) 
df %>% 
    group_by(case) %>% 
    mutate(result = result - min(result)) 
# A tibble: 6 x 3 
# Groups: case [2] 
# case algorithm result 
# <fctr> <fctr> <dbl> 
#1 case1  algo1  0 
#2 case1  algo2  1 
#3 case1  algo3  2 
#4 case2  algo1  2 
#5 case2  algo2  3 
#6 case2  algo3  0
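The pipeline above overwrites `result`. A small variation, assuming the goal is to keep `result` and store the difference in a new `gap` column as the question describes, could look like this. The name `df_gap` is only illustrative.

```r
library(dplyr)

# Same data as in the question.
df <- data.frame(
    case = c("case1", "case1", "case1", "case2", "case2", "case2"),
    algorithm = c("algo1", "algo2", "algo3", "algo1", "algo2", "algo3"),
    result = c(10, 11, 12, 22, 23, 20),
    stringsAsFactors = FALSE
)

df_gap <- df %>%
    group_by(case) %>%                        # one group per case
    mutate(gap = result - min(result)) %>%    # distance to the best result in the group
    ungroup()                                 # drop the grouping again

df_gap
```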

Source

2017-07-26 02:06:40 akrun

Use ave to get the minimum value for each group and subtract it from result

df$result - ave(df$result, df$case, FUN = min) 
#[1] 0 1 2 2 3 0
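To make the intermediate step visible, here is a minimal sketch on the question's data. `ave` returns a vector as long as its input, with each element replaced by the minimum of its group, so the subtraction lines up row by row. The variable name `group_min` is only illustrative.

```r
# Same data as in the question.
df <- data.frame(
    case = c("case1", "case1", "case1", "case2", "case2", "case2"),
    algorithm = c("algo1", "algo2", "algo3", "algo1", "algo2", "algo3"),
    result = c(10, 11, 12, 22, 23, 20),
    stringsAsFactors = FALSE
)

# Per-row minimum of the row's own case: 10 10 10 20 20 20
group_min <- ave(df$result, df$case, FUN = min)
group_min

# Gap of each row to the best result in its case.
df$gap <- df$result - group_min
df
```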

Source

2017-07-26 00:10:58

This one works too. To add the column "gap" that I wanted, I need to use `df$gap <- df$result - ave(df$result, df$case, FUN = min)`.
