2011-05-13

Creating a table with different row sizes

Say I have a table like this:

data <- c(1,2,3,6,5,6,9,"LC","LC","HC","HC","LC","HC","ALL") 
attr(data,"dim") <- c(7,2) 
data 
    [,1] [,2] 
[1,] "1" "LC" 
[2,] "2" "LC" 
[3,] "3" "HC" 
[4,] "6" "HC" 
[5,] "5" "LC" 
[6,] "6" "HC" 
[7,] "9" "ALL" 

Now I want to reshape the data so that it looks like this:

 [,"LC"] [,"HC"] [,"ALL"] 
[1,] "1"  "3"  "9" 
[2,] "2"  "6" 
[3,] "5"  "6" 

Is there a way to do this in R, or is it simply impossible, so that I should try some other way of accessing my data?


'data.frame' and 'matrix' (and 'array') have a predetermined shape ('N * M'). Just stating that more explicitly. – 2011-05-13 09:21:01

Answers

3

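You can get very close by using split. It returns a list containing the values you want, one element per label, and you can then use lapply or any other list-manipulation function on it: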

split(data[, 1], data[, 2]) 

$ALL 
[1] "9" 

$HC 
[1] "3" "6" "6" 

$LC 
[1] "1" "2" "5" 
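As a small sketch of the lapply step mentioned above (the names groups, groups_num and group_sums are made up for illustration): because the matrix stores everything as character, each group is converted with as.numeric before anything is computed on it.

```r
# Rebuild the character matrix from the question.
data <- c(1, 2, 3, 6, 5, 6, 9, "LC", "LC", "HC", "HC", "LC", "HC", "ALL")
dim(data) <- c(7, 2)

# Split the first column by the labels in the second column.
groups <- split(data[, 1], data[, 2])

# The values are strings, so convert each group to numeric first.
groups_num <- lapply(groups, as.numeric)

# Any list function now works per group, for example a sum per label.
group_sums <- sapply(groups_num, sum)
print(group_sums)
```

With the question's data this gives 9 for ALL, 15 for HC and 8 for LC.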

If you must have the output in matrix format, then I suggest padding the shorter vectors with NA:

x <- split(data[, 1], data[, 2]) 
n <- max(sapply(x, length)) 

pad_with_na <- function(x, n, padding=NA){ 
    c(x, rep(padding, n-length(x))) 
} 

sapply(x, pad_with_na, n) 

This results in:

 ALL HC LC 
[1,] "9" "3" "1" 
[2,] NA "6" "2" 
[3,] NA "6" "5" 
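The columns come out in alphabetical order of the labels rather than in the LC, HC, ALL order shown in the question. One possible way to get that order, sketched here with the made-up names padded and wanted, is to index the padded matrix by column name:

```r
# Rebuild the character matrix from the question.
data <- c(1, 2, 3, 6, 5, 6, 9, "LC", "LC", "HC", "HC", "LC", "HC", "ALL")
dim(data) <- c(7, 2)

# Split by label and find the length of the longest group.
x <- split(data[, 1], data[, 2])
n <- max(lengths(x))

# Pad every group with NA up to length n; sapply then returns a matrix.
padded <- sapply(x, function(v) c(v, rep(NA, n - length(v))))

# Pick the columns by name in the order the question asked for.
wanted <- padded[, c("LC", "HC", "ALL")]
print(wanted)
```

Selecting by name rather than by position keeps this working if the labels happen to sort differently.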

The first approach, with split, is exactly what I was looking for. Thank you very much – 2011-05-13 08:50:48

0

Example DATA

I prefer to read the data into a data.frame, because it checks that the vectors have equal length.

data <- data.frame(X=c(1,2,3,6,5,6,9), 
        Y=c("LC","LC","HC","HC","LC","HC","ALL")) 
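To illustrate the length check, here is a minimal sketch with deliberately broken input (seven values but only six labels); the name result is made up for the example, and tryCatch is used only so the script keeps running after the error:

```r
# Seven values but only six labels: data.frame() refuses to build this.
result <- tryCatch(
  data.frame(X = c(1, 2, 3, 6, 5, 6, 9),
             Y = c("LC", "LC", "HC", "HC", "LC", "HC")),
  error = function(e) conditionMessage(e)  # keep the error message as a string
)
print(result)
```

Note that data.frame recycles a shorter vector when its length divides the longest one evenly, so the check only catches lengths that do not fit a whole number of times.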

CODE

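data <- unstack(data, form = X ~ Y)  # easier to read than split; gives a list here because the groups differ in length
Nmax <- max(lengths(data))           # length of the longest group
sapply(data, "[", seq_len(Nmax))     # indexing past the end pads with NA; "borrowed" from another answer on SO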
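The last line relies on the fact that indexing a vector beyond its length returns NA for the missing positions. A short sketch of that behaviour, followed by the same steps end to end (the names short, df, groups and result are made up so that data is not overwritten):

```r
# Indexing past the end of a vector yields NA for the extra positions.
short <- c(3, 6)
padded <- short[seq_len(4)]
print(padded)

# The same idea applied to every group of the example data.
df <- data.frame(X = c(1, 2, 3, 6, 5, 6, 9),
                 Y = c("LC", "LC", "HC", "HC", "LC", "HC", "ALL"))
groups <- unstack(df, form = X ~ Y)   # list, since group lengths differ
Nmax <- max(lengths(groups))          # longest group has 3 values
result <- sapply(groups, "[", seq_len(Nmax))
print(result)
```

Because X is numeric in the data.frame, the resulting matrix is numeric too, unlike the character matrix produced from the question's original data.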