2011-05-18

R: filter by column

I have a question about the following data frame:

genes <- matrix(c("chr1","chr2","chr2","chr2","chr2","chr2", 
       "uc001upw.2","uc001upw.2","uc001upw.2","uc001upx.1","uc001upy.1","uc001upz.1", 
       "188001308","188001308","188001308","188037202","188037202","188037202", 
       "188021266","188021266","188021266","188086618","188127464","188127464", 
       "-","-","-","-","-","-", 
       "CARCRL","CALCRL","CALCRL","TFPI","TFPI","TFPI", 
       "uc001upx.1","uc00upy.1","uc001upz.1","uc001upw.2","uc001upw.2","uc001upw.2", 
       "188037202","188037202","188037202","188001308","188001308","188001308", 
       "188086618","188127464","188127464","188021266","188021266","188021266", 
       "-","-","-","-","-","-", 
       "TFPI","TFPI","TFPI","CALCRL","CALCRL","CALCRL", 
       "35894","35894","35894","35894","35894","35894"), nrow=6) 

colnames(genes)<- c("chr","names.x","start.x","stop.x","strand.x","alias.x","name.y","start.y","stop.y","strand.y", "alias.y", "distance_startsite") 
genes<-as.data.frame(genes) 

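As you can see, the first three rows of the data frame are unique by names.x and name.y. Rows 4, 5 and 6 are not unique: they show the same pairs in reverse order. My question is: is there a way to filter these out?

Thank you! Samantha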

Please generalize this question so that it serves a population larger than n = 1, where n = you. – Chase 2011-05-18 14:56:03

Answers

Not the prettiest way to do this, I'm sure, but it gets the job done:

genes[!duplicated(t(apply(genes[,c('names.x','name.y')],1,sort))),] 
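The one-liner above can be unpacked into three steps. What follows is only a minimal sketch on a small made-up data frame: `pairs_df` and its values are illustrative and not taken from the question; only the column names `names.x` and `name.y` come from the original.

```r
# Toy data (hypothetical values): rows 3 and 4 repeat rows 1 and 2 in reverse order.
pairs_df <- data.frame(
  names.x = c("a", "b", "c", "d"),
  name.y  = c("c", "d", "a", "b"),
  stringsAsFactors = FALSE
)

# Step 1: sort the two identifiers within each row, so that ("c", "a")
# and ("a", "c") both become ("a", "c"). apply() returns the sorted
# pairs as columns, hence the transpose back to one pair per row.
sorted_pairs <- t(apply(pairs_df[, c("names.x", "name.y")], 1, sort))

# Step 2: duplicated() on a matrix compares whole rows and flags every
# repeat after the first occurrence.
is_repeat <- duplicated(sorted_pairs)

# Step 3: keep only the first occurrence of each unordered pair.
unique_pairs <- pairs_df[!is_repeat, ]
```

Because the comparison is an exact string match on the sorted pair, two rows count as a reversed pair only if both identifiers are spelled identically.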

Thanks for the answer, but when I run the code I get a data frame with 4 rows. Rows 2 and 4 are the same: – samantha 2011-05-19 06:33:14

   chr    names.x   start.x    stop.x strand.x alias.x     name.y   start.y
1 chr1 uc001upw.2 188001308 188021266        -  CARCRL uc001upx.1 188037202
2 chr2 uc001upw.2 188001308 188021266        -  CALCRL  uc00upy.1 188037202
3 chr2 uc001upw.2 188001308 188021266        -  CALCRL uc001upz.1 188037202
5 chr2 uc001upy.1 188037202 188127464        -    TFPI uc001upw.2 188001308
     stop.y strand.y alias.y distance_startsite
1 188086618        -    TFPI              35894
2 188127464        -    TFPI              35894
3 188127464        -    TFPI              35894
5 188021266        -  CALCRL              35894
– samantha 2011-05-19 06:34:31

It works on your sample dataset. – 2011-05-19 08:14:31
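A possible explanation for the four-row result, judging only from the sample data as posted: row 2 has `uc00upy.1` in `name.y`, while row 5 has `uc001upy.1` in `names.x`. If that is a typo in the sample rather than a real identifier, the two rows are not a reversed pair as far as R is concerned, so both are kept. The sketch below copies just the two identifier columns from the question; the "fixed" spelling is an assumption, not something the thread confirms.

```r
# The two identifier columns copied from the question's sample data.
ids <- data.frame(
  names.x = c("uc001upw.2", "uc001upw.2", "uc001upw.2",
              "uc001upx.1", "uc001upy.1", "uc001upz.1"),
  name.y  = c("uc001upx.1", "uc00upy.1", "uc001upz.1",
              "uc001upw.2", "uc001upw.2", "uc001upw.2"),
  stringsAsFactors = FALSE
)

# Row 2 and row 5 only look like a reversed pair: the identifiers differ.
same_id <- ids$name.y[2] == ids$names.x[5]   # "uc00upy.1" vs "uc001upy.1"

# Same filter as in the answer, applied to the data as posted.
kept <- ids[!duplicated(t(apply(ids, 1, sort))), ]

# Assumption: "uc00upy.1" was meant to be "uc001upy.1".
fixed <- ids
fixed$name.y[2] <- "uc001upy.1"
kept_fixed <- fixed[!duplicated(t(apply(fixed, 1, sort))), ]
```

With the data as posted, `kept` holds rows 1, 2, 3 and 5, matching the output in the comment above; under the assumed correction, `kept_fixed` holds rows 1 to 3.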