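Converting a list to a sparse matrix

I want to convert a matrix that contains lists (with a variable number of elements) into a sparse matrix. Here is a toy example: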
mOrig = matrix(
c(rep(c('a_b', 'X'), 3),
rep(c('a_b_e', 'X'), 2),
rep(c('a_b_f', 'X'), 1),
rep(c('c_d', 'Y'), 3),
rep(c('c_d_e', 'Y'), 2),
rep(c('c_d_f', 'Y'), 1)),
ncol=2, byrow=TRUE
)
colnames(mOrig) = c('in', 'out')
mOrig
in out
[1,] "a_b" "X"
[2,] "a_b" "X"
[3,] "a_b" "X"
[4,] "a_b_e" "X"
[5,] "a_b_e" "X"
[6,] "a_b_f" "X"
[7,] "c_d" "Y"
[8,] "c_d" "Y"
[9,] "c_d" "Y"
[10,] "c_d_e" "Y"
[11,] "c_d_e" "Y"
[12,] "c_d_f" "Y"
The output matrix should look like this:
a b c d e f X Y
[1,] 1 1 0 0 0 0 1 0
[2,] 1 1 0 0 0 0 1 0
[3,] 1 1 0 0 0 0 1 0
[4,] 1 1 0 0 1 0 1 0
[5,] 1 1 0 0 1 0 1 0
[6,] 1 1 0 0 0 1 1 0
[7,] 0 0 1 1 0 0 0 1
[8,] 0 0 1 1 0 0 0 1
[9,] 0 0 1 1 0 0 0 1
[10,] 0 0 1 1 1 0 0 1
[11,] 0 0 1 1 1 0 0 1
[12,] 0 0 1 1 0 1 0 1
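I got close to a solution, but it looks completely inefficient: it relies on unique(unlist(strsplit())), for loops, and so on. Does anyone know of an efficient solution, for example one that makes use of sparseMatrix (or sparse.model.matrix) from the Matrix package?
Thanks a lot!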
Try 'library(qdapTools); cbind(mtabulate(strsplit(mOrig[,1], "_")), X = rep(c(1,0), c(6,6)), Y = rep(c(0,1), c(6,6)))' – akrun
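For the sparseMatrix route asked about in the question, here is a minimal sketch, assuming the Matrix package is installed. The variable names (tokens, labels, lev, mSparse) are made up for illustration and do not come from the question. The idea is to build the row and column indices as vectors in one pass, with no for loop, and to hand them to sparseMatrix. It also assumes that the 'in' tokens never collide with the 'out' values and that no token occurs twice within one row (sparseMatrix would add such duplicates up, giving 2 instead of 1).

```r
library(Matrix)

# Toy input from the question
mOrig <- matrix(
  c(rep(c('a_b', 'X'), 3),
    rep(c('a_b_e', 'X'), 2),
    rep(c('a_b_f', 'X'), 1),
    rep(c('c_d', 'Y'), 3),
    rep(c('c_d_e', 'Y'), 2),
    rep(c('c_d_f', 'Y'), 1)),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c('in', 'out'))
)

# Split each 'in' string into tokens; list elements have variable length
tokens <- strsplit(mOrig[, 'in'], '_', fixed = TRUE)

# Row index of every token occurrence (row k repeated once per token),
# followed by one extra entry per row for the 'out' value
i <- c(rep(seq_along(tokens), lengths(tokens)), seq_len(nrow(mOrig)))
labels <- c(unlist(tokens), mOrig[, 'out'])

# Column order: sorted 'in' tokens first, then sorted 'out' values
# (assumption: the two sets do not overlap)
lev <- c(sort(unique(unlist(tokens))), sort(unique(mOrig[, 'out'])))
j <- match(labels, lev)

# x = 1 is recycled over all (i, j) pairs
mSparse <- sparseMatrix(i = i, j = j, x = 1,
                        dims = c(nrow(mOrig), length(lev)),
                        dimnames = list(NULL, lev))
mSparse
```

Calling as.matrix(mSparse) gives a dense matrix for comparison with the desired output on small inputs. On large data, the object is best kept sparse.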