2014-03-26

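Recode multiple columns into a single variable

I have some qualitative data that I have coded into categories, and I would like to produce summaries for subgroups. The RQDA package works well for coding interviews, but I have been struggling to create summaries for open-ended survey responses. I managed to export the coded file as HTML and copy/paste it into Excel. I now have 500 rows with all the categories spread across different columns, but the same code may appear in different columns. For example, some data: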

a <- c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA") 
b <- c("ResponseD", "ResponseC", "NA", "NA","NA") 
c <- c("ResponseB", "ResponseA", "ResponseE", "NA", "NA") 
d <- c("ResponseC", "ResponseB", "ResponseA", "NA", "NA") 
df <- data.frame (a,b,c,d) 

I would like to be able to run something like

df$ResponseA <- recode (df$a | df$b | df$c, " 
         'ResponseA' = '1'; 
         else='0' ") 
df$ResponseB <- recode (df$a | df$b | df$c, " 
         'ResponseB' = '1'; 
         else='0' ") 

In total, I want to scan 9 columns and recode them into a single binary variable per code.

Can you show a small sample of your desired output? That is, what do you expect to see in the 'ResponseA' and 'ResponseB' columns of the 'df' data.frame? – hrbrmstr

Answers

If I understand the question correctly, perhaps you can try something like this:

## Convert your data into a long format first 
dfL <- cbind(id = sequence(nrow(df)), stack(lapply(df, as.character))) 

## The next three lines are mostly cleanup 
dfL$id <- factor(dfL$id, sequence(nrow(df))) 
dfL$values[dfL$values == "NA"] <- NA 
dfL <- dfL[complete.cases(dfL), ] 

## `table` is the real workhorse here 
cbind(df, (table(dfL[1:2]) > 0) * 1) 
#           a         b         c         d ResponseA ResponseB ResponseC ResponseD ResponseE
# 1 ResponseA ResponseD ResponseB ResponseC         1         1         1         1         0
# 2 ResponseB ResponseC ResponseA ResponseB         1         1         1         0         0
# 3 ResponseC        NA ResponseE ResponseA         1         0         1         0         1
# 4 ResponseD        NA        NA        NA         0         0         0         1         0
# 5        NA        NA        NA        NA         0         0         0         0         0
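
A note on the cleanup step, as a minimal sketch: converting `id` to a factor with one level per row is what keeps row 5 in the result even though all of its values were dropped, because `table` counts every factor level, including empty ones. The toy vectors below (`id`, `values`, `id_f`) are made up for illustration and are not part of the original data.

```r
# Hypothetical toy data: three rows in total, but row 3 has no values left
id <- c(1, 1, 2)
values <- c("ResponseA", "ResponseB", "ResponseA")

# Without explicit factor levels, row 3 is silently missing from the table
tab_plain <- table(id, values)
nrow(tab_plain)

# With one level per row, row 3 is kept as a row of zeros
id_f <- factor(id, levels = 1:3)
tab <- table(id_f, values)
nrow(tab)
```

This is why the result can be bound back onto `df` with `cbind`: the table has exactly one row per row of `df`, in the same order.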

You could also try the following:

(table(rep(1:nrow(df), ncol(df)), unlist(df)) > 0) * 1L 
#  
#   NA ResponseA ResponseB ResponseC ResponseD ResponseE
# 1  0         1         1         1         1         0
# 2  0         1         1         1         0         0
# 3  1         1         0         1         0         1
# 4  1         0         0         0         1         0
# 5  1         0         0         0         0         0
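
If you would rather stay close to the "one new column per code" idea from the question, a third option is to compare the whole data frame against each code and check whether any column in the row matches. This is only a sketch, not the approach used above: the names `codes` and `flags` are my own, and it assumes that missing entries are stored as the string "NA", as in the example data.

```r
# Same example data as in the question ("NA" is stored as a string there)
df <- data.frame(
  a = c("ResponseA", "ResponseB", "ResponseC", "ResponseD", "NA"),
  b = c("ResponseD", "ResponseC", "NA", "NA", "NA"),
  c = c("ResponseB", "ResponseA", "ResponseE", "NA", "NA"),
  d = c("ResponseC", "ResponseB", "ResponseA", "NA", "NA"),
  stringsAsFactors = FALSE
)

# All distinct codes, ignoring the "NA" placeholder string
codes <- setdiff(sort(unique(unlist(df))), "NA")

# For each code: 1 if it occurs in any column of the row, otherwise 0.
# na.rm = TRUE also covers real NA values, should the data contain any.
flags <- sapply(codes, function(code) {
  as.integer(rowSums(df == code, na.rm = TRUE) > 0)
})

cbind(df, flags)
```

For the example data this should give the same indicator columns as the first `table` approach, and restricting `codes` to a subset is an easy way to recode only the codes you care about.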

Brilliant! You saved me days of manual recoding. – Greg

@Greg, glad to hear it. If the answer solved your problem, please consider upvoting or accepting it. Thanks! – A5C1D2H2I1M1N2O1R2T1

My apologies - I'm still fairly new to this site. – Greg