2011-05-17 67 views
36

Merge or combine by rownames

In the example below, I have two datasets (z and t). I want to merge or combine them by ILMN number. Where there is no match, fill in NA.

z <- matrix(c(0,0,1,1,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,1,0,0,0,"RND1","WDR", "PLAC8","TYBSA","GRA","TAF"), nrow=6, 
    dimnames=list(c("ILMN_1651838","ILMN_1652371","ILMN_1652464","ILMN_1652952","ILMN_1653026","ILMN_1653103"),c("A","B","C","D","symbol"))) 

t<-matrix(c("GO:0002009", 8, 342, 1, 0.07, 0.679, 0, 0, 1, 0, 
     "GO:0030334", 6, 343, 1, 0.07, 0.065, 0, 0, 1, 0, 
     "GO:0015674", 7, 350, 1, 0.07, 0.065, 1, 0, 0, 0), nrow=10, dimnames= list(c("GO.ID","LEVEL","Annotated","Significant","Expected","resultFisher","ILMN_1652464","ILMN_1651838","ILMN_1711311","ILMN_1653026"))) 

The result would look like this:

   [,1]   [,2]   [,3]   [,4] 
GO.ID  "GO:0002009" "GO:0030334" "GO:0015674" NA 
LEVEL  "8"   "6"   "7"   NA 
Annotated "342"  "343"  "350"   NA 
Significant "1"   "1"   "1"   NA 
Expected  "0.07"  "0.07"  "0.07"  NA 
resultFisher "0.679"  "0.065"  "0.065"  NA 
ILMN_1652464 "0"   "0"   "1"   PLAC8 
ILMN_1651838 "0"   "0"   "0"   RND1 
ILMN_1711311 "1"   "1"   "0"   NA 
ILMN_1653026 "0"   "0"   "0"   GRA 

Answers

34

Use match to return the vector you need, then cbind it to your matrix:

cbind(t, z[, "symbol"][match(rownames(t), rownames(z))]) 

      [,1]   [,2]   [,3]   [,4] 
GO.ID  "GO:0002009" "GO:0030334" "GO:0015674" NA  
LEVEL  "8"   "6"   "7"   NA  
Annotated "342"  "343"  "350"  NA  
Significant "1"   "1"   "1"   NA  
Expected  "0.07"  "0.07"  "0.07"  NA  
resultFisher "0.679"  "0.065"  "0.065"  NA  
ILMN_1652464 "0"   "0"   "1"   "PLAC8" 
ILMN_1651838 "0"   "0"   "0"   "RND1" 
ILMN_1711311 "1"   "1"   "0"   NA  
ILMN_1653026 "0"   "0"   "0"   "GRA" 
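
To make the two steps explicit, here is a minimal sketch on toy data. The objects lookup and target and their row names are hypothetical, chosen only for illustration; they are not the matrices from the question.

```r
# Toy data (hypothetical): a lookup matrix and a target matrix keyed by row names.
lookup <- matrix(c("RND1", "PLAC8", "GRA"), ncol = 1,
                 dimnames = list(c("id1", "id2", "id3"), "symbol"))
target <- matrix(c("0", "1", "0", "1"), ncol = 1,
                 dimnames = list(c("id2", "idX", "id1", "id3"), "value"))

# match() gives, for each row name of target, its position in lookup
# (NA where the row name does not occur in lookup).
idx <- match(rownames(target), rownames(lookup))

# Indexing with NA yields NA, so unmatched rows are filled automatically.
symbol <- unname(lookup[, "symbol"][idx])

# cbind() keeps the row names of target and appends the looked-up column.
result <- cbind(target, symbol = symbol)
print(result)
```

Because the row set comes from the first matrix only, this behaves like a left join: rows that exist only in the lookup matrix are not added.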

PS. Be warned that t is the base R function for transposing a matrix. Creating a variable named t can make your downstream code confusing.

+0

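Your answer is very helpful, thank you. The only problem is that my code does not give the correct output. If I look only at z[, "symbol"][match(rownames(t), rownames(z))], a factor is created containing NA and the symbols, but when I run the cbind, the symbols are replaced by seemingly random values. Does anyone know what is going wrong? Thanks – Lisann 2011-05-17 11:19:03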
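
One plausible explanation for the behaviour described in this comment, assuming the symbol column is stored as a factor: cbind discards the factor class and binds the internal integer codes instead of the labels. The sketch below uses hypothetical toy values; converting with as.character first is one way to keep the labels.

```r
# Hypothetical reproduction: the symbol column is stored as a factor.
sym <- factor(c("PLAC8", "RND1", NA, "GRA"))
m <- matrix(c("0", "0", "1", "0"), ncol = 1)

# Binding the factor directly uses its internal integer codes, not its labels.
codes_bound <- cbind(m, sym)

# Converting to character first keeps the labels.
labels_bound <- cbind(m, as.character(sym))

print(codes_bound)
print(labels_bound)
```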

+2

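Please correct the error in your PS. You do not overwrite the t function. You are causing confusion for the user, but data and functions are stored in different places. Go ahead, test it: t <- matrix(1:4, 2, 2); t(t) ... it works. – 2011-05-17 12:13:42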

+0

Does this solution work for an outer join? – 2016-09-09 13:22:43

3

Not perfect, but close:

newcol<-sapply(rownames(t), function(rn){z[match(rn, rownames(z)), 5]}) 
cbind(data.frame(t), newcol)

40

Use merge, and rename your object t to tt (see Andrie's PS):

merge(tt,z,by="row.names",all.x=TRUE)[,-(5:8)] 

Now, if you work with data frames instead of matrices, this becomes even easier:

z <- as.data.frame(z) 
tt <- as.data.frame(tt) 
merge(tt,z["symbol"],by="row.names",all.x=TRUE) 
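
As a rough illustration of the merge approach, the sketch below uses two hypothetical toy data frames (left and right are made-up names, not objects from the question). It shows a left join with all.x = TRUE, a full outer join with all = TRUE, and one way to restore the original row order, since merge sorts by the key by default and moves the row names into a Row.names column.

```r
# Toy data frames (hypothetical) keyed by row names.
left  <- data.frame(score = c(0.1, 0.2, 0.3),
                    row.names = c("id2", "idX", "id1"))
right <- data.frame(symbol = c("RND1", "PLAC8", "GRA"),
                    row.names = c("id1", "id2", "id3"),
                    stringsAsFactors = FALSE)

# Left join: keep every row of `left`, fill unmatched symbols with NA.
left_join <- merge(left, right, by = "row.names", all.x = TRUE)

# Full outer join: keep rows that appear on either side.
outer_join <- merge(left, right, by = "row.names", all = TRUE)

# Restore the original order and row names of `left` if they matter.
keys <- as.character(left_join$Row.names)
left_join <- left_join[match(rownames(left), keys), ]
rownames(left_join) <- as.character(left_join$Row.names)
left_join$Row.names <- NULL

print(left_join)
print(outer_join)
```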

1

cbind.fill <- function(x, y){ 
    xrn <- rownames(x) 
    yrn <- rownames(y) 
    rn <- union(xrn, yrn) 
    xcn <- colnames(x) 
    ycn <- colnames(y) 
    if (is.null(xrn) || is.null(yrn) || is.null(xcn) || is.null(ycn))
        stop("NULL rownames or colnames")
    z <- matrix(NA, nrow=length(rn), ncol=length(xcn)+length(ycn)) 
    rownames(z) <- rn 
    colnames(z) <- c(xcn, ycn) 
    idx <- match(rn, xrn) 
    z[!is.na(idx), 1:length(xcn)] <- x[na.omit(idx),] 
    idy <- match(rn, yrn) 
    z[!is.na(idy), length(xcn)+(1:length(ycn))] <- y[na.omit(idy),] 
    return(z) 
} 
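
A small usage sketch may help show what cbind.fill returns. The function body is repeated from the answer so that the block runs on its own; the matrices x and y are hypothetical toy inputs. The result contains the union of the row names, so this behaves like a full outer join.

```r
# Copy of cbind.fill from the answer, repeated so the sketch is standalone.
cbind.fill <- function(x, y) {
  xrn <- rownames(x)
  yrn <- rownames(y)
  rn <- union(xrn, yrn)
  xcn <- colnames(x)
  ycn <- colnames(y)
  if (is.null(xrn) || is.null(yrn) || is.null(xcn) || is.null(ycn))
    stop("NULL rownames or colnames")
  z <- matrix(NA, nrow = length(rn), ncol = length(xcn) + length(ycn))
  rownames(z) <- rn
  colnames(z) <- c(xcn, ycn)
  idx <- match(rn, xrn)
  z[!is.na(idx), 1:length(xcn)] <- x[na.omit(idx), ]
  idy <- match(rn, yrn)
  z[!is.na(idy), length(xcn) + (1:length(ycn))] <- y[na.omit(idy), ]
  z
}

# Toy inputs (hypothetical): both need row names and column names.
x <- matrix(c("0", "1"), ncol = 1, dimnames = list(c("a", "b"), "value"))
y <- matrix(c("PLAC8", "GRA"), ncol = 1, dimnames = list(c("b", "c"), "symbol"))

filled <- cbind.fill(x, y)
print(filled)
```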

1

You can turn @Andrie's answer into a generic function:

mbind<-function(...){ 
Reduce(function(x,y){cbind(x,y[match(row.names(x),row.names(y)),])}, list(...)) 
} 

With this, you can bind multiple frames using the rownames as the key.
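
The following sketch shows how such a function might be called. It is a variation rather than the answer's exact code: the name mbind2 and the toy matrices a, b and c3 are hypothetical, and drop = FALSE is added as an assumption so that one-column inputs stay matrices and keep their column names.

```r
# Variation on mbind: drop = FALSE keeps one-column inputs as matrices,
# so their column names survive the cbind().
mbind2 <- function(...) {
  Reduce(function(x, y) {
    cbind(x, y[match(rownames(x), rownames(y)), , drop = FALSE])
  }, list(...))
}

# Toy inputs (hypothetical) with partly overlapping row names.
a  <- matrix(1:3, ncol = 1, dimnames = list(c("r1", "r2", "r3"), "a"))
b  <- matrix(c(10, 30), ncol = 1, dimnames = list(c("r1", "r3"), "b"))
c3 <- matrix(c(200, 100), ncol = 1, dimnames = list(c("r2", "r1"), "c"))

combined <- mbind2(a, b, c3)
print(combined)
```

The row set is fixed by the first argument, so every later input is left-joined onto it; rows that appear only in later inputs are dropped.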