2017-04-25

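Manually building a SIMPER contrast matrix from a data frame in R

I am using the simper function from the vegan package. In short, simper compares a set of groups and calculates which variables contribute most to the dissimilarity between them, and how much, in a column named cumsum that gives the cumulative contribution. The output is a nested list with one element of results per between-group contrast. For example: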

library(vegan) 
library(data.table) 
library(tidyr) 

data(dune) 
data(dune.env) 
sim <- with(dune.env, simper(dune, Management)) 
simsum<-summary(sim) 

#(short version of output) 

$SF_BF
             average          sd     ratio       ava       avb     cumsum
Agrostol 0.061373875 0.034193273 1.7949108 4.6666667 0.0000000 0.09824271
Alopgeni 0.052667124 0.036475863 1.4438897 4.3333333 0.6666667 0.18254830
$SF_HF
             average          sd     ratio       ava avb     cumsum
Agrostol 0.047380081 0.031272715 1.5150613 4.6666667 1.4 0.08350879
Alopgeni 0.046433015 0.032896891 1.4114712 4.3333333 1.6 0.16534834
$SF_NM
             average          sd     ratio       ava       avb    cumsum
Poatriv  0.078284148 0.040947182 1.9118324 4.6666667 0.0000000 0.1013601
Alopgeni 0.071219425 0.046958337 1.5166513 4.3333333 0.0000000 0.1935731

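From this output, I am interested in 1) the name of each nested list element (i.e., which groups are being contrasted), 2) the rownames (i.e., which variables contribute to the dissimilarity), and 3) the cumsum column (i.e., how much they contribute).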

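I would like to turn this into a contrast matrix that shows the top 3 contributing variables for each between-group contrast, so that it is easier to read and does not take up too much space. Here is an example I made in Excel:

[image: example contrast table made in Excel]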

I suspect this will be tricky, but this is what I have so far:

top3 <- lapply(simsum, `[`, 1:3, )   # top 3 contributors per contrast
cuss <- lapply(top3, `[`, 6)         # sixth column: cumsum

rows <- lapply(top3, rownames)       # variable names per contrast
# NOTE: the cumsum column is already cumulative, so cumsum() accumulates it a
# second time (row 2 of SF_BF becomes 0.2808 rather than 0.1825)
rows2 <- lapply(cuss, cumsum)

rowsdf <- do.call(rbind, lapply(rows, data.frame, stringsAsFactors = FALSE))    # names into a df
cusumdf <- do.call(rbind, lapply(rows2, data.frame, stringsAsFactors = FALSE))  # values into a df

simperdf <- cbind(rowsdf, cusumdf)         # combine into one df
colnames(simperdf) <- c('name', 'cusum')   # rename columns
setDT(simperdf, keep.rownames = TRUE)[]    # convert rownames to a column (rn)

# split the contrast label (e.g. "SF_BF.1") into its two groups, then drop the row index
simperdf <- separate(data = simperdf, col = rn, into = c("left", "right"), sep = "\\_")
simperdf <- separate(data = simperdf, col = right, into = c("right", "delete"), sep = "\\.")
simperdf$delete <- NULL

Which gives a tidy little data frame like this:

left right  name  cusum 
1: SF BF Agrostol 0.09824271 
2: SF BF Alopgeni 0.28079100 
3: SF BF Lolipere 0.54036058 
4: SF HF Agrostol 0.08350879 
5: SF HF Alopgeni 0.24885713 
6: SF HF Lolipere 0.48820643 
7: SF NM Poatriv 0.10136013 
8: SF NM Alopgeni 0.29493318 
9: SF NM Agrostol 0.56167145 
10: BF HF Rumeacet 0.08163219 
11: BF HF Poatriv 0.23357016 
12: BF HF Planlanc 0.45275349 
13: BF NM Lolipere 0.12427183 
14: BF NM Poatriv 0.32348443 
15: BF NM Poaprat 0.59466001 
16: HF NM Poatriv 0.09913221 
17: HF NM Lolipere 0.27381681 
18: HF NM Rumeacet 0.51298871 
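
The cumsum column returned by summary() is already cumulative, so the same long table can be built by taking that column as it is. The following is only a minimal base-R sketch of that idea: the simsum list here is a small mock standing in for summary(sim) (two rows for two contrasts, values copied from the output earlier in the question), and top_contributors is a hypothetical helper name, not part of vegan. It also assumes group names contain no underscore, since the contrast label is split on "_".

```r
# Mock of two elements of summary(sim); values copied from the question's output.
simsum <- list(
  SF_BF = data.frame(
    average = c(0.061373875, 0.052667124),
    cumsum  = c(0.09824271, 0.18254830),
    row.names = c("Agrostol", "Alopgeni")
  ),
  SF_NM = data.frame(
    average = c(0.078284148, 0.071219425),
    cumsum  = c(0.1013601, 0.1935731),
    row.names = c("Poatriv", "Alopgeni")
  )
)

# Hypothetical helper: one long data frame holding the first n rows of each contrast.
top_contributors <- function(x, n = 3) {
  pieces <- lapply(names(x), function(nm) {
    tab <- head(x[[nm]], n)                      # rows are already sorted by contribution
    grp <- strsplit(nm, "_", fixed = TRUE)[[1]]  # "SF_BF" -> c("SF", "BF")
    data.frame(left = grp[1], right = grp[2],
               name = rownames(tab), cumsum = tab$cumsum,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

long <- top_contributors(simsum, n = 2)
print(long)
```

With the real object, top_contributors(simsum, n = 3) should give one row per contrast and variable, without the data.table and tidyr steps.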

But I do not know where to go from here. I see that contrasts(dune.env$Management) gives the skeleton of the matrix:

HF NM SF 
BF 0 0 0 
HF 1 0 0 
NM 0 1 0 
SF 0 0 1 

But I do not know how to fill it in manually. Any help would be greatly appreciated.
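
One way such a skeleton could be filled by hand is sketched below, under some assumptions: the long data frame is a cut-down stand-in for the table of top contributors (two rows for each of three contrasts, values copied from the summary output in the question), the group labels are typed in rather than read from dune.env, and each cell is stored as a single text string because a matrix cell can hold only one value.

```r
# Stand-in for the long table of top contributors (values from the question).
long <- data.frame(
  left   = c("SF", "SF", "SF", "SF", "SF", "SF"),
  right  = c("BF", "BF", "HF", "HF", "NM", "NM"),
  name   = c("Agrostol", "Alopgeni", "Agrostol", "Alopgeni", "Poatriv", "Alopgeni"),
  cumsum = c(0.09824271, 0.18254830, 0.08350879, 0.16534834, 0.1013601, 0.1935731),
  stringsAsFactors = FALSE
)

# Empty group-by-group character matrix (assumed to match the Management levels).
groups <- c("BF", "HF", "NM", "SF")
mat <- matrix("", nrow = length(groups), ncol = length(groups),
              dimnames = list(groups, groups))

# Fill one cell per contrast with "name (cumsum)" entries.
key <- paste(long$left, long$right)
for (k in unique(key)) {
  rows <- long[key == k, ]
  mat[rows$left[1], rows$right[1]] <-
    paste(sprintf("%s (%.2f)", rows$name, rows$cumsum), collapse = "; ")
}

print(noquote(mat))
```

Only the cells for contrasts present in long are filled; the rest stay empty, which gives the triangular layout of a contrast table.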


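You [mostly] can't do double headers in R, so I'm not sure it's possible to build exactly what you want. You could make a nice array instead: 'xtabs(cusum ~ left + right + name, df)'... but it's sparse. – alistaire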
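
To illustrate the sparsity the comment mentions, here is a small sketch using a hypothetical four-row df (values copied from the question's summary output; the column name cusum follows the question's data frame). Every combination of left, right and name that does not occur in the data becomes a zero cell.

```r
# Hypothetical four-row stand-in for the question's data frame.
df <- data.frame(
  left  = c("SF", "SF", "SF", "SF"),
  right = c("BF", "BF", "NM", "NM"),
  name  = c("Agrostol", "Alopgeni", "Poatriv", "Alopgeni"),
  cusum = c(0.09824271, 0.18254830, 0.1013601, 0.1935731)
)

# Three-way array: left x right x name, with cusum as the cell value.
arr <- xtabs(cusum ~ left + right + name, df)

dim(arr)         # one left level, two right levels, three names
sum(arr == 0)    # cells with no matching row in df

# Flatten to a two-dimensional display with left and right as row variables.
ftable(arr, row.vars = c("left", "right"))
```

With all six contrasts and their differing top variables, the share of empty cells would be much larger, which is the drawback the comment points out.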

Answers


This is not exactly what you are looking for, but I think it is a step in the right direction:

require(tables) 
test <- data.frame(left = c("SF", "SF", "BF", "BF"), 
        right = c("BF","BF", "SF", "SF"), 
        name = c("Agrostol", "Alopgeni","Agrostol", "Alopgeni2"), 
        cumv = c(1,2,3,4)) 
tabular(right * name ~ left * cumv * mean, data = test) 

Which gives the output:

    left  
       BF SF 
       cumv cumv 
right name  mean mean 
BF Agrostol NaN 1 
     Alopgeni NaN 2 
     Alopgeni2 NaN NaN 
SF Agrostol 3 NaN 
     Alopgeni NaN NaN 
     Alopgeni2 4 NaN