How do I merge multiple variables to create a new factor variable in R?

I have data from a survey. It comes from a question that looked like this:

Did you do any of the following activities during your PhD?

          Yes, paid by my school. Yes, paid by me. No. 

Attended an international conference?
Bought textbooks?

The data were saved automatically in a spreadsheet like this:

id  conf.1  conf.2  conf.3  text.1  text.2  text.3
1   1                               1
2           1               1
3                   1       1
4                   1                       1
5

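This means that participant 1 attended a conference paid for by her university, participant 2 attended a conference that he paid for himself, and participant 3 did not attend one.

I want to merge conf.1, conf.2 and conf.3 into a single variable, and likewise text.1, text.2 and text.3: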

id  new.conf  new.text
1   1         2
2   2         1
3   3         1
4   3         3

where the number now represents the category chosen for each survey question.

Thanks for your help

Source

2012-07-22 Bartolome Salom

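This is a reshape, not a merge. Try `reshape` (base R), `reshapeasy` (taRifx package), or the `reshape2` package. – 2012-07-22 21:57:28

You don't say whether each group of questions can have more than one answer. If it can, this approach may not work for you, and I'd suggest making the question more reproducible before going further. With that caveat out of the way, give this a whirl: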

library(reshape2)

# Recreate your data
dat <- data.frame(id = 1:5,
                  conf.1 = c(1, rep(NA, 4)),
                  conf.2 = c(NA, 1, rep(NA, 3)),
                  conf.3 = c(NA, NA, 1, 1, NA),
                  text.1 = c(NA, 1, 1, NA, NA),
                  text.2 = c(1, rep(NA, 4)),
                  text.3 = c(rep(NA, 3), 1, NA))

# Melt into long format
dat.m <- melt(dat, id.vars = "id")
# Split the column names on the "." into question name and option number
dat.m[, c("variable", "val")] <- with(dat.m, colsplit(variable, "\\.", c("variable", "val")))
# Keep only the complete cases, i.e. the options that were actually ticked
dat.m <- dat.m[complete.cases(dat.m), ]
# Cast back into wide format
dcast(dat.m, id ~ variable, value.var = "val")
#----- 
  id conf text
1  1    1    2
2  2    2    1
3  3    3    1
4  4    3    3
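
The caveat at the top of this answer (more than one option ticked per question) can be checked before reshaping. The following is only a minimal sketch in base R: the helper name `n_answers` is made up for illustration, and it assumes the option columns are named `<question>.<number>` as in the example data.

```r
# Recreate the example data used in this answer
dat <- data.frame(id = 1:5,
                  conf.1 = c(1, rep(NA, 4)),
                  conf.2 = c(NA, 1, rep(NA, 3)),
                  conf.3 = c(NA, NA, 1, 1, NA),
                  text.1 = c(NA, 1, 1, NA, NA),
                  text.2 = c(1, rep(NA, 4)),
                  text.3 = c(rep(NA, 3), 1, NA))

# Hypothetical helper: number of ticked options per respondent for one question
n_answers <- function(data, prefix) {
  cols <- grep(paste0("^", prefix, "\\."), names(data), value = TRUE)
  rowSums(!is.na(data[cols]))
}

conf_n <- n_answers(dat, "conf")
text_n <- n_answers(dat, "text")

# ids of respondents who ticked more than one option for either question
multi_ids <- dat$id[conf_n > 1 | text_n > 1]
multi_ids
```

If `multi_ids` is empty, each question has at most one answer per respondent and the reshape gives one code per cell. Otherwise `dcast` has to aggregate the duplicates, and the result is no longer a single category code.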

Source

2012-07-22 22:23:41 Chase

Thank you all for your answers. – 2012-07-23 20:30:26

Here is a base R approach that also handles the missing values:

confvars <- c("conf.1", "conf.2", "conf.3")
textvars <- c("text.1", "text.2", "text.3")

# For each row, return the position of the ticked column, or NA if none is ticked
which.sub <- function(x) {
  maxsub <- apply(dat[x], 1, which.max)
  maxsub[(lapply(maxsub, length) == 0)] <- NA
  return(unlist(maxsub))
}

data.frame(
  id = dat$id,
  conf = which.sub(confvars),
  text = which.sub(textvars)
)

Result:

  id conf text
1  1    1    2
2  2    2    1
3  3    3    1
4  4    3    3
5  5   NA   NA
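
Since the question asks for a factor variable, the integer codes can then be turned into labelled factors. This is a minimal sketch, not part of the original answer: the data frame `res` simply retypes the result shown above, and the labels are taken from the survey question, assuming that option 1 is "Yes, paid by my school", option 2 is "Yes, paid by me" and option 3 is "No".

```r
# Integer codes as shown in the result above (respondent 5 gave no answer)
res <- data.frame(id = 1:5,
                  conf = c(1, 2, 3, 3, NA),
                  text = c(2, 1, 1, 3, NA))

# Labels assumed from the wording of the survey question
lab <- c("Yes, paid by my school", "Yes, paid by me", "No")

# Map codes 1, 2, 3 onto the labels; NA stays NA
res$conf <- factor(res$conf, levels = 1:3, labels = lab)
res$text <- factor(res$text, levels = 1:3, labels = lab)

# Frequency table by label, including the missing answers
table(res$conf, useNA = "ifany")
```

Fixing `levels = 1:3` explicitly keeps all three categories in the factor even if one of them never occurs in the data.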

Source

2012-07-22 22:50:46 thelatemail

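Thanks. I have one more question: is it possible to convert the reshaped table into a LaTeX table that shows the name of each level (e.g. 1 = sponsored by my institution; 2 = sponsored by a different institution; 3 = no)? – 2012-07-23 21:21:16

The following solution is very simple, and I use it a lot. Let's use the same data frame as in Chase's answer above.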

dat <- data.frame(id = 1:5,
                  conf.1 = c(1, rep(NA, 4)),
                  conf.2 = c(NA, 1, rep(NA, 3)),
                  conf.3 = c(NA, NA, 1, 1, NA),
                  text.1 = c(NA, 1, 1, NA, NA),
                  text.2 = c(1, rep(NA, 4)),
                  text.3 = c(rep(NA, 3), 1, NA))

We start by replacing the NAs with zeros.

dat[is.na(dat)] <- 0

Multiplying each column by a different number (its option number) lets us compute the new variables with a simple sum.

dat <- transform(dat, conf = conf.1 + 2*conf.2 + 3*conf.3,
                      text = text.1 + 2*text.2 + 3*text.3)

Finally, recode the zeros in the new variables (or, as here, in the whole dataset) back to NA, and we're done.

dat[dat == 0] <- NA 

> dat
  id conf.1 conf.2 conf.3 text.1 text.2 text.3 conf text
1  1      1     NA     NA     NA      1     NA    1    2
2  2     NA      1     NA      1     NA     NA    2    1
3  3     NA     NA      1      1     NA     NA    3    1
4  4     NA     NA      1     NA     NA      1    3    3
5  5     NA     NA     NA     NA     NA     NA   NA   NA
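
The weighted sum can be written once for any number of options as a matrix product. The sketch below is an illustration rather than part of the answer: the helper name `code_from_dummies` is made up, and it assumes the columns are named `<question>.<number>`, appear in option order, and that at most one option is ticked per respondent. If two options were ticked, the sum would be ambiguous (options 1 and 2 together give 3).

```r
# Recreate the example data
dat <- data.frame(id = 1:5,
                  conf.1 = c(1, rep(NA, 4)),
                  conf.2 = c(NA, 1, rep(NA, 3)),
                  conf.3 = c(NA, NA, 1, 1, NA),
                  text.1 = c(NA, 1, 1, NA, NA),
                  text.2 = c(1, rep(NA, 4)),
                  text.3 = c(rep(NA, 3), 1, NA))

# Hypothetical helper: collapse 0/1 indicator columns into one option code
code_from_dummies <- function(data, prefix) {
  cols <- grep(paste0("^", prefix, "\\."), names(data), value = TRUE)
  m <- as.matrix(data[cols])
  m[is.na(m)] <- 0                         # NA means "not ticked"
  code <- as.vector(m %*% seq_along(cols)) # 1*col1 + 2*col2 + 3*col3 + ...
  code[code == 0] <- NA                    # no option ticked at all
  code
}

conf <- code_from_dummies(dat, "conf")
text <- code_from_dummies(dat, "text")
data.frame(id = dat$id, conf = conf, text = text)
```

Working on a copy of the indicator columns also leaves the NAs in `dat` untouched, so no recoding back from zero to NA is needed on the original data.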

Source

2013-12-04 21:02:49
