2017-04-11

Assign a value to a new column if another column contains an exact string

I have a df that looks like this:

Date Winner 
4/12 Tom 
4/13 Abe 
4/14 George 
4/15 Tom 

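I want to add new columns that hold 1 if the name appears in the Winner column and 0 if it does not, and the reverse for the matching lose columns. Ideally the resulting df would look like this: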

Date Winner Tom_Win Tom_Lose Abe_Win Abe_Lose George_Win George_Lose
4/12 Tom    1       0        0       1        0          1
4/13 Abe    0       1        1       0        0          1
4/14 George 0       1        0       1        1          0
4/15 Tom    1       0        0       1        0          1

Is there a simple way to achieve this?

This works perfectly! One of the many great answers I received.

Answers

2

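This is quite simple with the model.matrix function: it creates one dummy column per name, set to 1 in the rows where that name appears and 0 where it does not (exactly what you asked for). The code is as follows (assuming your data is called db):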

> winners <- model.matrix(~Winner - 1, data=db) 
> winners 

    WinnerAbe WinnerGeorge WinnerTom 
1   0   0   1 
2   1   0   0 
3   0   1   0 
4   0   0   1 

This bit computes the columns with the lose values:

winners <- as.data.frame(winners)
# naturally you have to do this for every column you need
winners$loserAbe <- as.numeric(!winners$WinnerAbe)
winners

    WinnerAbe WinnerGeorge WinnerTom loserAbe
1   0   0   1  1
2   1   0   0  0
3   0   1   0  1
4   0   0   1  1

winners$Date <- db$Date # this last bit so you don't lose the date.
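
The per-column step above can probably be done for all names at once, since every lose column is just the complement of its win column. A minimal sketch, assuming the data frame is called db as in the answer and that the question's `_Win` / `_Lose` naming is wanted (the example data and the names `win`, `lose`, and `result` are made up for illustration):

```r
# Hypothetical example data, named db as in the answer above
db <- data.frame(Date = c("4/12", "4/13", "4/14", "4/15"),
                 Winner = c("Tom", "Abe", "George", "Tom"))

# One 0/1 indicator column per distinct winner (the "- 1" drops the intercept)
win <- model.matrix(~ Winner - 1, data = db)
# model.matrix prefixes each column with the variable name; swap that for a _Win suffix
colnames(win) <- paste0(sub("^Winner", "", colnames(win)), "_Win")

# The lose indicators are the complement of the win indicators
lose <- 1 - win
colnames(lose) <- sub("_Win$", "_Lose", colnames(win))

# Keep Date and Winner alongside the new columns
result <- cbind(db, win, lose)
```

The columns come out grouped (all `_Win` columns, then all `_Lose` columns) rather than interleaved as in the question; reorder them afterwards if the exact layout matters.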

This works perfectly! One of the many great answers I received.

1

I'm sure there are better ways than this, but this works in base R and is quite simple.

If your data looks like this:

df <- data.frame(Date = c("4/12","4/13","4/14","4/15"),Winner = c("Tom","Abe","George","Tom")) 

Append the extra columns like this:

xcols <- c(paste0(unique(df$Winner), '_Win'), paste0(unique(df$Winner), '_Lose')) 
df[ , xcols] <- 0 

Now build a character vector of instructions that give each player a point in the rows they won.

evl <- unlist(lapply(unique(df$Winner), function(x){paste0('df[', which(df$Winner == x), ',', which(names(df) == paste0(x, '_Win')), '] <- 1')}))

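And execute the code. The instructions only set the _Win columns, so the _Lose columns are then filled in as their complement:

eval(parse(text = evl))
df[ , paste0(unique(df$Winner), '_Lose')] <- 1 - df[ , paste0(unique(df$Winner), '_Win')]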
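
The same result can likely be reached without building and evaluating code strings. A minimal sketch using a plain loop over the distinct names, assuming the same example df as above (the loop variable `name` and the helper `is_winner` are illustrative names, not part of the original answer):

```r
df <- data.frame(Date = c("4/12", "4/13", "4/14", "4/15"),
                 Winner = c("Tom", "Abe", "George", "Tom"))

# For each distinct name, add a 0/1 win column and its complement
for (name in unique(as.character(df$Winner))) {
  is_winner <- df$Winner == name                       # logical vector, one entry per row
  df[[paste0(name, "_Win")]]  <- as.integer(is_winner)
  df[[paste0(name, "_Lose")]] <- as.integer(!is_winner)
}
```

Because each pair of columns is added in order of first appearance, this yields the interleaved Tom, Abe, George layout shown in the question.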

2

Using mtabulate from the qdapTools package, we can do this in three steps:

library(qdapTools) 

d1 <- mtabulate(d3$Winner) 

d2 <- setNames(data.frame(sapply(d1, function(i) ifelse(i == 1, 0, 1))), 
                 paste0(names(d1), '_Lose')) 

cbind(d3$Date, d1, d2) 

# d3$Date Abe George Tom Abe_Lose George_Lose Tom_Lose 
#1 4/12 0  0 1  1   1  0 
#2 4/13 1  0 0  0   1  1 
#3 4/14 0  1 0  1   0  1 
#4 4/15 0  0 1  1   1  0 

DATA

str(d3) 
'data.frame': 4 obs. of 2 variables: 
$ Date : Factor w/ 4 levels "4/12","4/13",..: 1 2 3 4 
$ Winner: Factor w/ 3 levels "Abe","George",..: 3 1 2 3 
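
If installing qdapTools is not an option, the same three steps can probably be approximated in base R by cross-tabulating the row index against Winner. This is a sketch under that assumption, not a claim about how mtabulate works internally; d3 is rebuilt here to match the str() output above, and `out` is an illustrative name:

```r
# Example data matching the structure shown above (both columns as factors)
d3 <- data.frame(Date = c("4/12", "4/13", "4/14", "4/15"),
                 Winner = c("Tom", "Abe", "George", "Tom"),
                 stringsAsFactors = TRUE)

# Step 1: one row per game, one 0/1 column per name
d1 <- as.data.frame.matrix(table(seq_len(nrow(d3)), d3$Winner))

# Step 2: lose columns are the complement of the win columns
d2 <- setNames(1 - d1, paste0(names(d1), "_Lose"))

# Step 3: put the date back alongside the indicators
out <- cbind(Date = d3$Date, d1, d2)
```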

This works perfectly! One of the many great answers I received.

1

df <- data.frame(
    Date = c("4/12", "4/13", "4/14", "4/15"),
    Winner = c("Tom", "Abe", "George", "Tom"),
    stringsAsFactors = TRUE  # levels() below needs a factor; no longer the default since R 4.0
)

# Build a win/lose pair of 0/1 columns for each level of Winner, then bind them together
df2 <- do.call(cbind,
    lapply(seq_along(levels(df$Winner)), function(x) {
        win <- ifelse(df$Winner == levels(df$Winner)[x], 1, 0)
        lose <- ifelse(df$Winner == levels(df$Winner)[x], 0, 1)

        dat <- cbind(win, lose)
        colnames(dat) <- c(paste(levels(df$Winner)[x], "win", sep = "_"), paste(levels(df$Winner)[x], "lose", sep = "_"))

        dat
    })
)


cbind(df, df2) 


> cbind(df, df2) 
    Date Winner Abe_win Abe_lose George_win George_lose Tom_win Tom_lose 
1 4/12 Tom  0  1   0   1  1  0 
2 4/13 Abe  1  0   0   1  0  1 
3 4/14 George  0  1   1   0  0  1 
4 4/15 Tom  0  1   0   1  1  0 