Recode multiple columns into a single variable

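I have some qualitative data that I have coded into categories, and I would like to produce summaries for subgroups. The RQDA package is great for coding interviews, but I have been trying to create summaries for open-ended survey responses. I managed to export the coded file to HTML and copy/paste it into Excel. I now have 500 rows with all the categories in different columns, but the same code can appear in different columns. For example, some data: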

a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA") 
b <- c("ResponseD", "ResponseC", "NA", "NA","NA") 
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA") 
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA") 
df <- data.frame (a,b,c,d)

I would like to be able to run something like:

df$ResponseA <- recode (df$a | df$b | df$c, " 
         'ResponseA' = '1'; 
         else='0' ") 
df$ResponseB <- recode (df$a | df$b | df$c, " 
         'ResponseB' = '1'; 
         else='0' ")

In short, I want to scan 9 columns and recode them into a single binary variable for each code.

Asked 2014-03-26 by Greg

Could you show a small sample of the output you want? That is, what do you expect in the 'ResponseA' and 'ResponseB' columns of the 'df' data.frame? – hrbrmstr

If I understand the question correctly, you could try something like this:

## Convert your data into a long format first 
dfL <- cbind(id = sequence(nrow(df)), stack(lapply(df, as.character))) 

## The next three lines are mostly cleanup 
dfL$id <- factor(dfL$id, sequence(nrow(df))) 
dfL$values[dfL$values == "NA"] <- NA 
dfL <- dfL[complete.cases(dfL), ] 

## `table` is the real workhorse here 
cbind(df, (table(dfL[1:2]) > 0) * 1) 
#           a         b         c         d ResponseA ResponseB ResponseC ResponseD ResponseE
# 1 ResponseA ResponseD ResponseB ResponseC         1         1         1         1         0
# 2 ResponseB ResponseC ResponseA ResponseB         1         1         1         0         0
# 3 ResponseC        NA ResponseE ResponseA         1         0         1         0         1
# 4 ResponseD        NA        NA        NA         0         0         0         1         0
# 5        NA        NA        NA        NA         0         0         0         0         0

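You could also try the following. Note that the literal string "NA" is not cleaned up here, so it gets its own column: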

(table(rep(1:nrow(df), ncol(df)), unlist(df)) > 0) * 1L 
#
#     NA ResponseA ResponseB ResponseC ResponseD ResponseE
#   1  0         1         1         1         1         0
#   2  0         1         1         1         0         0
#   3  1         1         0         1         0         1
#   4  1         0         0         0         1         0
#   5  1         0         0         0         0         0
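
For comparison, here is a rough sketch that stays closer to the "one binary column per code" idea in the question. It is not part of the answer above. It assumes the placeholder string "NA" should be ignored, and `codes`, `m`, `flags` and `result` are just names made up for this illustration:

```r
## Rebuild the example data from the question
a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA")
b <- c("ResponseD", "ResponseC", "NA", "NA", "NA")
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA")
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA")
df <- data.frame(a, b, c, d, stringsAsFactors = FALSE)

## All distinct codes, dropping the "NA" placeholder string (assumption)
codes <- sort(setdiff(unique(unlist(df)), "NA"))

## For each code, flag the rows where it appears in any column
m <- as.matrix(df)
flags <- sapply(codes, function(code) {
  as.integer(rowSums(m == code, na.rm = TRUE) > 0)
})

## One 0/1 column per code, named after the code
result <- cbind(df, flags)
```

On this example `flags` should hold the same 0/1 pattern as the `table` approach. Looping over the codes is slower on large data, but each step is easy to inspect.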

Answered 2014-03-26 16:33:34 by A5C1D2H2I1M1N2O1R2T1

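Brilliant! You have saved me several days of manual recoding. – Greg

@Greg, glad to hear it. If the answer solved your problem, please consider upvoting or accepting it. Thanks! – A5C1D2H2I1M1N2O1R2T1

My apologies, I am still fairly new to this site. – Greg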
