2016-11-14

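How to use top_n inside an R function

For a data.frame like the one below, I want to write a function that returns the 5 largest observations of a selected variable: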

df1 <- structure(list(Yta = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L 
), Rad = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), Planta = c(1L, 
2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L), Sortnr = c(8213L, 513L, 
8060L, 8093L, 2131L, 8200L, 2378L, 8135L, 8156L, 8256L), Dia12 = c(53L, 
29L, NA, NA, 53L, 6L, 20L, NA, 13L, 20L), Dia34 = c(177L, 39L, 
NA, NA, 0L, 77L, 101L, NA, 77L, 95L), Vit34 = c(2L, 1L, NA, NA, 
2L, 1L, 2L, NA, 1L, 1L), Ska1 = c(NA, 542L, NA, NA, 634L, NA, 
NA, NA, NA, NA), Ska2 = c(NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_), Dia34_2 = c(NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_), block1 = c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), block = c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), x = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2), y = c(1, 
2, 3, 4, 5, 6, 1, 2, 3, 4), id = c("1:1:1", "1:1:2", "1:1:3", 
"1:1:4", "1:1:5", "1:1:6", "1:2:1", "1:2:2", "1:2:3", "1:2:4" 
)), .Names = c("Yta", "Rad", "Planta", "Sortnr", "Dia12", "Dia34", 
"Vit34", "Ska1", "Ska2", "Dia34_2", "block1", "block", "x", "y", 
"id"), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame" 
)) 

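I want to use dplyr because I like it! I tried the function below, but it fails with an error. I think the problem is that the function's third argument is not recognized. I would appreciate a tip on how to get around this.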

prMval <- function(df, sort, varia){ 
    df %>% #filter(Sortnr == sort) %>% 
    #filter(!(Rad %in% c(min(Rad), max(Rad))) & !(Planta %in% c(min(Planta), max(Planta)))) %>% 
    top_n(5, varia) 
} 

prMval(df1, 2, Dia34) 


Error: object 'varia' not found 

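The second argument passed to top_n() here has to be a numeric vector, right? So it must be df$Dia34, or you can use df[[varia]] inside the function and then call prMval(df, 2, "Dia34").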

I don't get any error when I run the code; it just returns an empty tibble for me.

Try it now. I had too few observations in the data frame, so the filters removed all of them. – Mateusz1981
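The first comment's idea (pass the column name as a string and index with `[[`) can be sketched in base R, without dplyr. This is only an illustration: `top_n_base` and the toy data frame are hypothetical names, not part of the question or of any package. Unlike `top_n()`, this sketch returns the rows sorted by value and does not keep extra rows when there are ties at the cut-off.

```r
# Hypothetical helper (not from dplyr): return the n rows with the
# largest values in the column whose name is given as a string.
top_n_base <- function(df, varia, n = 5) {
  values <- df[[varia]]                       # look the column up by name
  # na.last = NA drops missing values from the ordering entirely
  ord <- order(values, decreasing = TRUE, na.last = NA)
  df[head(ord, n), , drop = FALSE]            # keep at most n rows
}

# Toy data, made up for illustration
toy <- data.frame(id = 1:6, v = c(3, NA, 10, 7, 1, 8))
res_base <- top_n_base(toy, "v", 3)
print(res_base)
```

With the data from the question, the call would be `top_n_base(df1, "Dia34")`.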

Answer

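When top_n is called inside the function, it looks for a column literally named "varia"; it does not evaluate varia to find out which column you passed. With the lazyeval package we can make sure varia is substituted before top_n is evaluated: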

library(dplyr)  # provides %>% and top_n
library(lazyeval) 
prMval <- function(df, sort, varia){ 
    tmp <- df #%>% filter(Sortnr == sort) %>% 
    #filter(!(Rad %in% c(min(Rad), max(Rad))) & !(Planta %in% c(min(Planta), max(Planta)))) 

    lazy_eval(interp(~top_n(tmp, 5, varia), varia = as.name(varia))) 
    # Replace varia with the input and then interpret the resulting call 
} 

prMval(df1, 2, "Dia34") # Make sure to pass a character string as varia 

This returns:

# A tibble: 5 x 15
    Yta   Rad Planta Sortnr Dia12 Dia34 Vit34  Ska1  Ska2 Dia34_2 block1 block     x     y    id
  <int> <int>  <int>  <int> <int> <int> <int> <int> <int>   <int>  <int> <int> <dbl> <dbl> <chr>
1     1     1      1   8213    53   177     2    NA    NA      NA      1     1     1     1 1:1:1
2     1     1      6   8200     6    77     1    NA    NA      NA      1     1     1     6 1:1:6
3     1     2      1   2378    20   101     2    NA    NA      NA      1     1     2     1 1:2:1
4     1     2      3   8156    13    77     1    NA    NA      NA      1     1     2     3 1:2:3
5     1     2      4   8256    20    95     1    NA    NA      NA      1     1     2     4 1:2:4

I haven't figured out how to do this inside the pipe, so I had to split it into separate steps.
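For what it's worth, one way this might be kept inside the pipe is the `.data` pronoun, which lets a column be looked up by a string. This is a sketch under assumptions: it needs a dplyr version that supports `.data` (0.7.0 or later, as far as I know), `prMval_pipe` and the toy data are hypothetical names, and the `sort` argument is left out because the filters are commented out in the original function.

```r
library(dplyr)

# Hypothetical variant of prMval: `varia` is a column name given as a string.
prMval_pipe <- function(df, varia, n = 5) {
  df %>%
    # .data[[varia]] looks up the column named by the string in `varia`
    top_n(n, .data[[varia]])
}

# Toy data, made up for illustration
toy <- data.frame(id = 1:6, v = c(3, NA, 10, 7, 1, 8))
res_pipe <- prMval_pipe(toy, "v", 3)
print(res_pipe)
```

With the data from the question, the call would be `prMval_pipe(df1, "Dia34")`. As with any `top_n()` call, rows keep their original order and tied values at the cut-off are all kept.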