2012-04-03 61 views
2

R: reorganize a list into a matrix

I have a list structured like this: my.list[[file.id]][[value.id]] <- a value (1 or 0). The same value.id can exist under different file.ids.

I need a matrix whose rownames are all the value.ids, whose colnames are the file.ids, and in which each cell is my.list[[file.id]][[value.id]].

Is there a quick way to do this without iterating like crazy?

Sample data:

List:

$`Zhou_et_al_2004` 
    CDC42:P60953 CDK2D:NONAME MAPK12:P53778 E2F3:NONAME GRB2:P62424 GRB2:P62993  RFA:NONAME 
      "up"   "up"   "down"   "down"   "down"   "down"   "down" 
    CDK9:P50750 JUP/DP3:NONAME MEK1:NONAME RFC38:NONAME  DP2:NONAME RFC37:NONAME GADD45:NONAME 
     "down"   "down"   "down"   "down"   "down"   "down"   "down" 

$`Zhou_et_al_2006` 
    CTTN:Q14247 GTSE1:Q9NYZ3  CHST11:Q9N  CHST11:PF2 TNRC6A:Q8NDV7 MMP9:P14780  NRIP3:Q9N 
      "up"   "up"   "up"   "up"   "up"   "up"   "up" 
    NRIP3:Q35 EGFR:P00533 GFPT2:NONAME TPCN2:Q8NHX9  BBP:NONAME SQLE:Q14534 DISP2:NONAME 
      "up"   "up"   "up"   "up"   "up"   "up"   "up" 
    PAPPA:Q13219 BMP2:P12643 PCM1:Q15154 SUCLG2:Q96I99 ASAH1:Q13510 UQCRC2:P22695 MTUS1:NONAME 
      "up"   "up"   "down"   "down"   "down"   "down"   "down" 
    MUC20:NONAME FRAT2:NONAME PLA2G4A:P47712 
     "down"   "down"   "down" 

$`Zhou_et_al_2007` 
    CTTN:Q14247 GTSE1:Q9NYZ3  CHST11:Q9N  CHST11:PF2 TNRC6A:Q8NDV7  NRIP3:Q9N 
      "up"   "up"   "up"   "up"   "up"   "up" 
     NRIP3:Q35 USP32:Q8NFA0 PPFIBP1:Q86W92 MALAT1:NONAME TRA2A:NONAME MGC17624:NONAME 
      "up"   "up"   "up"   "up"   "up"   "up" 
    SLC6A2:P23975 USP42:Q9H9J4 RASEF:NONAME SEMA3C:Q99985  NDE1:Q9NXR1  TRA1:NONAME 
      "up"   "up"   "up"   "up"   "up"   "up" 
    PPFIA1:Q13136 PPFIA1:Q16787 ITGA9:Q13797 ITGA9:Q14469  LMO2:P25791 NR2F2:P24468 
      "up"   "up"   "down"   "down"   "down"   "down" 
KIAA0882:NONAME  PCM1:Q15154  CYB5:NONAME  IDH1:NONAME MYLIP:Q8WY64 ASAH1:Q13510 
     "down"   "down"   "down"   "down"   "down"   "down" 
    HADHSC:NONAME FAM84B:Q96KN1  ADH5:P11766  NTN4:Q9HB63  AK3:Q9UIJ7 MTUS1:NONAME 
     "down"   "down"   "down"   "down"   "down"   "down" 
KIAA1815:NONAME 
     "down" 

MATRIX:

   Zhou2004 Zhou2006 Zhou2007 
CDC42:P60953 "up"  NA  NA  
CDK2D:NONAME "up"  NA  NA  
MAPK12:P53778 "down" NA  NA  
E2F3:NONAME  "down" NA  NA  
GRB2:P62424  "down" NA  NA  
GRB2:P62993  "down" NA  NA  
RFA:NONAME  "down" NA  NA  
CDK9:P50750  "down" NA  NA  
JUP/DP3:NONAME "down" NA  NA  
MEK1:NONAME  "down" NA  NA  
RFC38:NONAME "down" NA  NA  
DP2:NONAME  "down" NA  NA  
RFC37:NONAME "down" NA  NA  
GADD45:NONAME "down" NA  NA  
CTTN:Q14247  NA  "up"  "up"  
GTSE1:Q9NYZ3 NA  "up"  "up"  
CHST11:Q9N  NA  "up"  "up"  
CHST11:PF2  NA  "up"  "up"  

etc. (there will be more rows)

+0

Please add some sample data as input, along with the expected output. – Chase 2012-04-03 15:15:23

+1

Could you `dput` the sample data so that it is easier to paste in? – James 2012-04-03 16:05:28

Answers

2

Starting with @flodel's sample data,

my.list <- list() 
my.list[["Zhou_et_al_2004"]]["CDC42:P60953"] <- 1 
my.list[["Zhou_et_al_2004"]]["CDK2D:NONAME"] <- 2 
my.list[["Zhou_et_al_2006"]]["CTTN:Q14247"] <- 3 
my.list[["Zhou_et_al_2006"]]["GTSE1:Q9NYZ3"] <- 4 
my.list[["Zhou_et_al_2006"]]["CHST11:Q9N"] <- 5 
my.list[["Zhou_et_al_2009"]]["CTTN:Q14247"] <- 6 

turn each element of the list into a data frame,

a <- lapply(seq_along(my.list), function(i) { 
    x <- my.list[[i]] 
    out <- data.frame(name=names(x), out=x) 
    names(out)[2] <- names(my.list)[[i]] 
    out 
}) 

merge all the data frames together (all=TRUE keeps every value.id and fills the gaps with NA),

out <- Reduce(function(x,y) { merge(x, y, all=TRUE) }, a) 

and fix the rownames.

rownames(out) <- out[,1] 
out <- out[,-1] 

The result looks like this!

> out 
      Zhou_et_al_2004 Zhou_et_al_2006 Zhou_et_al_2009 
CDC42:P60953    1    NA    NA 
CDK2D:NONAME    2    NA    NA 
CHST11:Q9N    NA    5    NA 
CTTN:Q14247    NA    3    6 
GTSE1:Q9NYZ3    NA    4    NA 
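
For comparison, here is a minimal base-R sketch that skips the merge step and fills the matrix by name lookup instead. The names `all.ids` and `mat` are illustrative and not part of the answer above; the sketch assumes every element of `my.list` is a named vector and that there is more than one distinct value.id (with a single id, `sapply` would return a plain vector rather than a matrix).

```r
# Same sample data as above, written as a literal list.
my.list <- list(
  Zhou_et_al_2004 = c("CDC42:P60953" = 1, "CDK2D:NONAME" = 2),
  Zhou_et_al_2006 = c("CTTN:Q14247" = 3, "GTSE1:Q9NYZ3" = 4, "CHST11:Q9N" = 5),
  Zhou_et_al_2009 = c("CTTN:Q14247" = 6)
)

# Every value.id that occurs in any file, in order of first appearance.
all.ids <- unique(unlist(lapply(my.list, names), use.names = FALSE))

# Indexing a named vector by a name it lacks yields NA, so each file
# becomes one column aligned to all.ids.
mat <- sapply(my.list, function(x) unname(x[all.ids]))
rownames(mat) <- all.ids
```

Because the lookup does not depend on the value type, the same sketch should also work when the values are character strings such as "up" and "down", and the original value.ids are kept as rownames unchanged.
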
+0

Thanks! This works perfectly – JoshDG 2012-04-03 19:35:23

4

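`ldply` from the `plyr` package is particularly useful for this kind of task. From the doc:

The most unambiguous behaviour is achieved when .fun returns a data frame - in that case pieces will be combined with rbind.fill.

where `rbind.fill` is a convenient function that combines data.frames and fills the missing data with NA.

So the key here is to apply a function that turns each of your list elements into a data.frame: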

my.list <- list() 
my.list[["Zhou_et_al_2004"]]["CDC42:P60953"] <- 1 
my.list[["Zhou_et_al_2004"]]["CDK2D:NONAME"] <- 2 
my.list[["Zhou_et_al_2006"]]["CTTN:Q14247"] <- 3 
my.list[["Zhou_et_al_2006"]]["GTSE1:Q9NYZ3"] <- 4 
my.list[["Zhou_et_al_2006"]]["CHST11:Q9N"] <- 5 

library(plyr) 
ldply(my.list, .fun = function(x)as.data.frame(as.list(x))) 
#    .id CDC42.P60953 CDK2D.NONAME CTTN.Q14247 GTSE1.Q9NYZ3 CHST11.Q9N 
# 1 Zhou_et_al_2004   1   2   NA   NA   NA 
# 2 Zhou_et_al_2006   NA   NA   3   4   5 

I'm sure you will know how to convert it to the final format.
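
One possible way to do that last conversion, as a sketch: drop the `.id` column, transpose, and label the columns with the file ids. To keep the example runnable without plyr, `res` below is a hand-built copy of the `ldply` output printed above; `res` and `mat` are illustrative names.

```r
# Hand-built copy of the ldply() result shown above (no plyr needed here).
res <- data.frame(
  .id          = c("Zhou_et_al_2004", "Zhou_et_al_2006"),
  CDC42.P60953 = c(1, NA),
  CDK2D.NONAME = c(2, NA),
  CTTN.Q14247  = c(NA, 3),
  GTSE1.Q9NYZ3 = c(NA, 4),
  CHST11.Q9N   = c(NA, 5)
)

# Drop the .id column and transpose so that value.ids become rows.
mat <- t(as.matrix(res[, -1]))

# Use the file ids as column names.
colnames(mat) <- as.character(res$.id)
```

Note that the rownames come out as "CDC42.P60953" rather than "CDC42:P60953", because `as.data.frame` converts the names into syntactically valid column names. Passing `optional = TRUE` to `as.data.frame` inside `.fun` should keep the original names, but that variant has not been checked against `ldply` here.
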