R - Create a presence/absence DF

I have a data.frame similar to the one below:

Species <- c("a", "b", "c", "d")
Samples <- c(1, 2, 3, 4, 5, 6)

species <- sample(Species, 20, replace = TRUE)
samples <- sample(Samples, 20, replace = TRUE)

df <- data.frame(samples, species)

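I would like to transform it into a data.frame in which each species appears as a column and each sample occupies one row. The values (0 and 1) indicate absence or presence. My original data.frame has about 600k rows, 60k samples, and 20 variables (species).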

2014-11-05 Gil33

You may be looking for `model.matrix` or `table`. – A5C1D2H2I1M1N2O1R2T1 2014-11-05 17:46:59
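As a rough illustration of the `model.matrix` suggestion, here is a minimal sketch on a small hand-made data set (the values are made up for illustration, not the asker's data). It assumes that summing the per-row indicator columns within each sample with `rowsum` and then testing for `> 0` is an acceptable way to get presence/absence; the comment itself does not spell out a method.

```r
# Small hypothetical data set: one row per observation
df_small <- data.frame(
  samples = c(1, 1, 2, 3, 3, 3),
  species = c("a", "b", "a", "c", "c", "d")
)

# One indicator column per species (no intercept), one row per observation
mm <- model.matrix(~ species - 1, data = df_small)

# Sum the indicators within each sample, then reduce counts to 0/1
counts <- rowsum(mm, group = df_small$samples)
pa_mm <- (counts > 0) + 0L
pa_mm
```

The resulting columns are named `speciesa`, `speciesb`, and so on, because `model.matrix` prefixes each level with the variable name.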

How about:

> reshape2::dcast(df, formula = samples ~ species)

  samples a b c d
1       1 0 0 1 3
2       2 0 3 1 0
3       3 2 1 0 1
4       4 0 1 0 1
5       5 1 1 2 1
6       6 0 0 0 1

2014-11-05 18:12:19

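Those don't look like 1s and 0s .... – A5C1D2H2I1M1N2O1R2T1 2014-11-05 18:35:14

@Dominic Comtois It works, thank you. Ananda, I think this would complete it, right? `df <- 1 * (df > 0)` – Gil33 2014-11-05 18:40:39

Np! I missed the part about only 0s and 1s. If you use the `df > 0` trick, just watch out for the samples column, because its values will also turn into 1! – 2014-11-05 19:49:59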
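To make the caveat about the samples column concrete, here is a minimal sketch that converts only the species columns to 0/1 after casting. It assumes the reshape2 package is installed; the seed, the explicit `fun.aggregate = length` and `value.var = "species"` arguments, and the helper name `species_cols` are choices made for this illustration rather than part of the original answer.

```r
library(reshape2)  # assumption: reshape2 is installed

set.seed(1)  # arbitrary seed, only so the sketch is repeatable
df <- data.frame(
  samples = sample(1:6, 20, replace = TRUE),
  species = sample(c("a", "b", "c", "d"), 20, replace = TRUE)
)

# Count occurrences of each species per sample
wide <- dcast(df, samples ~ species,
              fun.aggregate = length, value.var = "species")

# Convert counts to 0/1 in the species columns only,
# leaving the `samples` identifier column untouched
species_cols <- setdiff(names(wide), "samples")
wide[species_cols] <- lapply(wide[species_cols],
                             function(x) as.integer(x > 0))
wide
```

Applying `1 * (wide > 0)` to the whole data frame instead would also overwrite the sample identifiers, which is the pitfall described above.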

As Ananda already mentioned in the comments, you can use `table`, for example:

as.data.frame(with(df, table(samples, species)) > 0L) + 0L
#   a b c d
# 1 1 0 1 1
# 2 1 1 0 1
# 3 1 0 1 0
# 4 1 1 1 1
# 5 1 0 0 1
# 6 0 1 1 0

The data I used here is:

Species <- c("a","b","c","d") 
Samples <- 1:6 
set.seed(99) 
df <- data.frame(samples = sample(Samples, 20, replace = TRUE),
                 species = sample(Species, 20, replace = TRUE))

2014-11-05 19:20:51

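I think you could simplify it to `as.data.frame((table(df) > 0L) + 0L)` – 2014-11-05 19:22:22

@DavidArenburg, that is true for the sample data, but if the real data.frame contains more than 2 columns, you need to name the columns explicitly. – 2014-11-05 19:30:16

Both ways work for me. The only difference is that if the columns are not named explicitly, the samples do not become a variable, whereas with the method proposed by @beginnerR they do. Thank you very much. – Gil33 2014-11-06 11:35:21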
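The point about data frames with more than two columns can be sketched as follows. The data and the extra `site` column are hypothetical, and `as.data.frame.matrix` is used here (rather than plain `as.data.frame`) as an assumption, to force the wide layout explicitly.

```r
# Hypothetical data with an extra column (`site`) besides samples and species
df3 <- data.frame(
  samples = c(1, 1, 2, 3, 3, 3),
  species = c("a", "b", "a", "c", "c", "d"),
  site    = c("x", "y", "x", "y", "x", "y")
)

# table(df3) would cross-tabulate all three columns into a 3-D array,
# so the two columns of interest are named explicitly instead
tab <- with(df3, table(samples, species))

# Counts -> logical -> 0/1 integers, one row per sample
pa <- as.data.frame.matrix((tab > 0L) + 0L)
pa
```

Here the sample identifiers end up as row names rather than as a column of their own.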
