2017-04-18

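Apply rep() within groups using dplyr

I have been trying to repeat an alternating 1 and 2 output within groups. I want to use rep with dplyr, but I can't work out how to apply rep within a group. So far I have only managed it by splitting the groups manually and specifying the correct range for each one. How can I apply rep() within groups using dplyr?

Here is some sample data.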

df <- data.frame(date = c("2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02", "2017-01-02"), 
       loc =c("AB", "AB", "AB", "AB", "AB", "AB", "AB", "AB", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD", "CD"), 
       cat = c("a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c", "c", "c", "d", "d", "d", "d", "d")) 

This is essentially the code I run on each group, applied here to the whole dataset.

df$type <- rep(1:2,nrow(df)/2) 

As you can see, the output ignores the cat column: cat b and d should start at 1.

  date loc cat type 
1 2017-01-01 AB a 1 
2 2017-01-01 AB a 2 
3 2017-01-01 AB a 1 
4 2017-01-01 AB b 2 
5 2017-01-01 AB b 1 
6 2017-01-01 AB b 2 
7 2017-01-01 AB b 1 
8 2017-01-02 AB b 2 
9 2017-01-02 CD c 1 
10 2017-01-02 CD c 2 
11 2017-01-02 CD c 1 
12 2017-01-02 CD c 2 
13 2017-01-02 CD c 1 
14 2017-01-02 CD d 2 
15 2017-01-02 CD d 1 
16 2017-01-02 CD d 2 
17 2017-01-02 CD d 1 

Update: here is the desired output.

 date loc cat type 
1 2017-01-01 AB a 1 
2 2017-01-01 AB a 2 
3 2017-01-01 AB a 1 
4 2017-01-01 AB b 1 
5 2017-01-01 AB b 2 
6 2017-01-01 AB b 1 
7 2017-01-01 AB b 2 
8 2017-01-02 AB b 1 
9 2017-01-02 CD c 1 
10 2017-01-02 CD c 2 
11 2017-01-02 CD c 1 
12 2017-01-02 CD c 2 
13 2017-01-02 CD c 1 
14 2017-01-02 CD d 1 
15 2017-01-02 CD d 2 
16 2017-01-02 CD d 1 
17 2017-01-02 CD d 2 
+1

In base, 'df$type <- ave(seq(nrow(df)), df$cat, FUN = function(x){rep(1:2, length.out = length(x))})' or, if you sort by 'cat' first, 'unlist(lapply(table(df$cat), function(x){rep(1:2, length.out = x)}))' – alistaire

Answers
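
For anyone who wants to try the base R idea from the comment above, here is a minimal runnable sketch. It rebuilds the 18-row sample data compactly with rep() (same values as in the question) and uses seq_len() in place of seq(), which is just a defensive habit and not something the comment requires.

```r
# Same 18 rows as the sample data, built compactly with rep().
df <- data.frame(
  date = rep(c("2017-01-01", "2017-01-02"), times = c(7, 11)),
  loc  = rep(c("AB", "CD"), times = c(8, 10)),
  cat  = rep(c("a", "b", "c", "d"), times = c(3, 5, 5, 5))
)

# ave() splits the row indices by cat, applies FUN to each piece,
# and writes the results back in the original row order.
df$type <- ave(
  seq_len(nrow(df)), df$cat,
  FUN = function(x) rep(1:2, length.out = length(x))
)

df$type
# 1 2 1 1 2 1 2 1 1 2 1 2 1 1 2 1 2 1
```

Because ave() restores the original row order, this version does not need the data to be sorted by cat first.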

1

Assuming cat is the only relevant grouping variable here (not date or loc), you can do:

library(dplyr) 
df = df %>% 
    group_by(cat) %>% 
    mutate(type = rep(1:2, length.out = length(cat))) 
# Output: 
     date loc cat type 
     <fctr> <fctr> <fctr> <int> 
1 2017-01-01  AB  a  1 
2 2017-01-01  AB  a  2 
3 2017-01-01  AB  a  1 
4 2017-01-01  AB  b  1 
5 2017-01-01  AB  b  2 
6 2017-01-01  AB  b  1 
7 2017-01-01  AB  b  2 
8 2017-01-02  AB  b  1 
9 2017-01-02  CD  c  1 
10 2017-01-02  CD  c  2 
11 2017-01-02  CD  c  1 
12 2017-01-02  CD  c  2 
13 2017-01-02  CD  c  1 
14 2017-01-02  CD  d  1 
15 2017-01-02  CD  d  2 
16 2017-01-02  CD  d  1 
17 2017-01-02  CD  d  2 
18 2017-01-02  CD  d  1 
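
The code above groups by cat only. If date and loc were also meant to define the groups (the question does not say so, this is only an assumption), a sketch would pass all three columns to group_by(). The data frame is rebuilt here with rep() so the snippet runs on its own.

```r
library(dplyr)

# Same 18 rows as the sample data, built compactly with rep().
df <- data.frame(
  date = rep(c("2017-01-01", "2017-01-02"), times = c(7, 11)),
  loc  = rep(c("AB", "CD"), times = c(8, 10)),
  cat  = rep(c("a", "b", "c", "d"), times = c(3, 5, 5, 5))
)

df <- df %>%
  group_by(date, loc, cat) %>%                       # every date/loc/cat combination is its own group
  mutate(type = rep(1:2, length.out = n())) %>%      # restart the 1, 2 pattern in each group
  ungroup()

df$type
# 1 2 1 1 2 1 2 1 1 2 1 2 1 1 2 1 2 1
```

On this sample data the result happens to match the cat-only grouping, because row 8 is the only row that lands in a different group and it receives a 1 either way. With other data the two groupings can differ.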
+0

Thanks @Marius, that solved the problem. – JnrfL

+2

You could use 'length.out = n()' – alistaire
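
Spelled out, the n() suggestion might look like the sketch below. n() returns the number of rows in the current group, so it does not depend on any particular column name. The trailing ungroup() is my own addition, so that later steps are not silently applied per group.

```r
library(dplyr)

# Same 18 rows as the sample data, built compactly with rep().
df <- data.frame(
  date = rep(c("2017-01-01", "2017-01-02"), times = c(7, 11)),
  loc  = rep(c("AB", "CD"), times = c(8, 10)),
  cat  = rep(c("a", "b", "c", "d"), times = c(3, 5, 5, 5))
)

df <- df %>%
  group_by(cat) %>%
  mutate(type = rep(1:2, length.out = n())) %>%  # n() = rows in the current group
  ungroup()                                      # drop the grouping afterwards

df$type
# 1 2 1 1 2 1 2 1 1 2 1 2 1 1 2 1 2 1
```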