2016-02-13 56 views
0

Suppose I have the following data.frame:

df <- data.frame(color = c("G","G","G","R","R","R","R","R","R","R","G","G"), 
      trial = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4)) 

How would I pull the color from the previous trial? The end goal is a data.frame that looks like this:

   color trial prevcolor
1      G     1      <NA>
2      G     1      <NA>
3      G     1      <NA>
4      R     2         G
5      R     2         G
6      R     2         G
7      R     3         R
8      R     3         R
9      R     3         R
10     R     3         R
11     G     4         R
12     G     4         R
+1

`within(df, prevcolor <- color[match(trial - 1, trial)])` works for your example, not sure about generality – rawr

Answers

1

Here is a solution that uses a for loop over the trial numbers:

df <- data.frame(color = c("G","G","G","R","R","R","R","R","R","R","G","G"), 
       trial = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4)) 

# iterate through trial numbers 
for (trial in unique(df$trial)) { 
    # select color of previous trial number 
    prev_color <- as.character(df$color[df$trial == trial - 1])[1] 

    # assign previous color to current trial number 
    df$prevcolor[df$trial == trial] <- prev_color 
} 
df 

##    color trial prevcolor
## 1      G     1      <NA>
## 2      G     1      <NA>
## 3      G     1      <NA>
## 4      R     2         G
## 5      R     2         G
## 6      R     2         G
## 7      R     3         R
## 8      R     3         R
## 9      R     3         R
## 10     R     3         R
## 11     G     4         R
## 12     G     4         R
+0

Unfortunately, loops are slow in R. – Nick
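For comparison, the loop can be avoided with a single vectorized lookup, along the lines of rawr's comment on the question. This is only a sketch under the same assumptions as the loop: `trial` is numeric, the previous trial is always `trial - 1`, and each trial has a single color. `match()` returns the position of the first row of the previous trial, or `NA` when there is none.

```r
df <- data.frame(color = c("G","G","G","R","R","R","R","R","R","R","G","G"),
                 trial = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4),
                 stringsAsFactors = FALSE)

# position of the first row belonging to the previous trial (NA if none)
prev_row <- match(df$trial - 1, df$trial)

# look up the color found at that position
df$prevcolor <- df$color[prev_row]
df
```

Because `match()` only reports the first hit, rows of the previous trial beyond the first are never consulted, which is why the single-color-per-trial assumption matters.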

1

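We can use lag from dplyr (assuming 'trial' is ordered). The offset is the number of rows in the first trial, so this reproduces the expected output for this data, but it is only guaranteed to be correct when every trial has the same number of rows:

df$prevcolor <- with(df, dplyr::lag(color, n=sum(trial==trial[1L]))) 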
df 
#    color trial prevcolor
# 1      G     1      <NA>
# 2      G     1      <NA>
# 3      G     1      <NA>
# 4      R     2         G
# 5      R     2         G
# 6      R     2         G
# 7      R     3         R
# 8      R     3         R
# 9      R     3         R
# 10     R     3         R
# 11     G     4         R
# 12     G     4         R
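Since a fixed row offset depends on the trial sizes, one possible alternative is to shift by whole trials instead of by rows. The sketch below uses base R's `rle()` to find consecutive runs of the same trial value. The helper name `prev_by_run` is hypothetical, and it assumes the rows of each trial are contiguous and that each trial has one color (only the first row of each run is read).

```r
# Hypothetical helper: for each row, return the value seen in the previous run of `group`
prev_by_run <- function(value, group) {
  runs <- rle(as.character(group))                 # consecutive runs of equal group values
  start <- cumsum(c(1, head(runs$lengths, -1)))    # first row index of each run
  run_value <- as.character(value[start])          # one value per run
  prev_value <- c(NA_character_, head(run_value, -1))  # shift by one run
  rep(prev_value, times = runs$lengths)            # expand back to one value per row
}

df <- data.frame(color = c("G","G","G","R","R","R","R","R","R","R","G","G"),
                 trial = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4),
                 stringsAsFactors = FALSE)
df$prevcolor <- prev_by_run(df$color, df$trial)

# trials of unequal size (1, 3 and 2 rows)
uneven <- prev_by_run(c("G", "R", "R", "R", "G", "G"), c(1, 2, 2, 2, 3, 3))
```

Here `uneven` shifts correctly even though the first trial has a single row, a case where a row-based lag of 1 would pick up colors from within the same trial.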

A variation of @rawr's solution from the comments (for when 'trial' is not a numeric column):

Un1 <- unique(df$trial) 
with(df, color[match(factor(trial, levels= Un1, labels = c(NA, head(Un1,-1))), trial)]) 
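To make the non-numeric case concrete, here is a step-by-step sketch of the same idea on a made-up data frame whose `trial` column holds letters. The data and the names `df_chr`, `trials`, and `prev_trial` are illustrative assumptions, not part of the original question. The "previous" trial is taken to be the one that appears just before in order of first appearance.

```r
# Illustrative data: trial labels are characters, not numbers
df_chr <- data.frame(color = c("G", "G", "R", "R", "G", "G"),
                     trial = c("a", "a", "b", "b", "c", "c"),
                     stringsAsFactors = FALSE)

# distinct trial labels in order of first appearance
trials <- unique(df_chr$trial)

# label of the previous trial for every row (NA for the first trial)
prev_trial <- c(NA, head(trials, -1))[match(df_chr$trial, trials)]

# color at the first row of that previous trial
df_chr$prevcolor <- df_chr$color[match(prev_trial, df_chr$trial)]
df_chr
```

As with the numeric version, this reads only the first row of each trial, so it assumes one color per trial.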

With dplyr, we can get the group indices. Note that group_indices_() has been deprecated since dplyr 0.7:

library(dplyr) 
df %>% 
    mutate(prev_color = color[match(group_indices_(.,.dots = 'trial')-1, trial)]) 
#    color trial prev_color
# 1      G     1       <NA>
# 2      G     1       <NA>
# 3      G     1       <NA>
# 4      R     2          G
# 5      R     2          G
# 6      R     2          G
# 7      R     3          R
# 8      R     3          R
# 9      R     3          R
# 10     R     3          R
# 11     G     4          R
# 12     G     4          R
0

Here is another solution that uses R's simple merge function. Your data frame:

df <- data.frame(color = c("G","G","G","R","R","R","R","R","R","R","G","G"), 
       trial = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4)) 

Now use the merge function. It simply merges data frames. So:

df2 <- merge(data.frame(prevtrial = df$trial - 1), unique(df), by.x = "prevtrial", by.y = "trial", all.x = TRUE) 

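Now create a new data frame for the output (this relies on df already being sorted by trial, because merge sorts its result by the key column):

newdf <- data.frame(color = df$color, trial = df$trial, prevtrial = df2$prevtrial, prevcolor = df2$color) 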

which gives:

> newdf
   color trial prevtrial prevcolor
1      G     1         0      <NA>
2      G     1         0      <NA>
3      G     1         0      <NA>
4      R     2         1         G
5      R     2         1         G
6      R     2         1         G
7      R     3         2         R
8      R     3         2         R
9      R     3         2         R
10     R     3         2         R
11     G     4         3         R
12     G     4         3         R
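If the rows are not already sorted by trial, the merge result will no longer line up with the original data frame. One way around that, sketched here, is to carry a row index through the merge and restore the original order afterwards. The function name `add_prevcolor_merge` and the helper column `row` are hypothetical. The sketch still assumes numeric trials where the previous trial is `trial - 1`, and one color per trial.

```r
# Hypothetical helper: merge-based lookup that keeps the input row order
add_prevcolor_merge <- function(df) {
  # one row per (trial, color) pair, renamed to act as the lookup table
  lookup <- unique(df[, c("trial", "color")])
  names(lookup) <- c("prevtrial", "prevcolor")

  df$row <- seq_len(nrow(df))      # remember the original row order
  df$prevtrial <- df$trial - 1     # key to look up in the table

  # left join: rows without a previous trial get NA in prevcolor
  out <- merge(df, lookup, by = "prevtrial", all.x = TRUE)

  # restore the original order and column layout
  out <- out[order(out$row), c("color", "trial", "prevtrial", "prevcolor")]
  rownames(out) <- NULL
  out
}

df <- data.frame(color = c("G","G","G","R","R","R","R","R","R","R","G","G"),
                 trial = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4),
                 stringsAsFactors = FALSE)
newdf <- add_prevcolor_merge(df)
newdf
```

Merging the full data frame against the lookup table, rather than building the output from two separately ordered objects, keeps each row's color, trial, and previous color together.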