2017-03-03 58 views
0

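Select elements from one data frame that have the same characteristics as rows in another data frame?

I have a data frame in R, which I subset into two: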

p<-c(3.14,3.56,7.45,8.33,5.44,3.12,3.78,7.62,9.12,4.34,6.78,8.65,6.99) 
n<-c("mQTL","mQTL","null","null","null","null","null","null","null","null","null","null","null") 
s<-c(2,2,1,2,1,1,2,2,2,1,2,1,2) 
g<-c("female","male","female","male","female","female","male","female","female","male","female","female","female") 
df<-data.frame(n,g,s,p) 
df 


mQTL<-subset(df,df$n=='mQTL') 

mQTL

     n      g s    p
1 mQTL female 2 3.14
2 mQTL   male 2 3.56


null<-subset(df,df$n=="null") 

      n      g s    p
3  null female 1 7.45
4  null   male 2 8.33
5  null female 1 5.44
6  null female 1 3.12
7  null   male 2 3.78
8  null female 2 7.62
9  null female 2 9.12
10 null   male 1 4.34
11 null female 2 6.78
12 null female 1 8.65
13 null female 2 6.99

I want to randomly draw two rows from null, each of which matches one of the two mQTL rows on gender (df$g) and number (df$s).

For example, I would like the first random draw to look something like this:

   n      g s    p
null female 2 7.62
null   male 2 3.78

Second random draw:

   n      g s    p
null female 2 9.12
null   male 2 3.78

I want to draw randomly like this 5 times, for example, to get 5 different combinations.

I tried

null[which((mQTL$g==null$g)& (mQTL$s==null$s)),] 

but it gave me a single data frame with all of the matches, not combinations of two rows:

      n      g s    p
4  null   male 2 8.33
9  null female 2 9.12
11 null female 2 6.78
13 null female 2 6.99
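
A likely explanation for this result, offered as a sketch rather than a definitive diagnosis: `==` compares the two vectors element by element and recycles the shorter one, so `mQTL$g` (length 2) is repeated along `null$g` (length 11) instead of being matched as a set. The snippet below rebuilds the example data and reproduces the four rows shown above; `stringsAsFactors = FALSE` is an added assumption so that it behaves the same across R versions.

```r
# Rebuild the example data from the question
p <- c(3.14,3.56,7.45,8.33,5.44,3.12,3.78,7.62,9.12,4.34,6.78,8.65,6.99)
n <- c(rep("mQTL", 2), rep("null", 11))
s <- c(2,2,1,2,1,1,2,2,2,1,2,1,2)
g <- c("female","male","female","male","female","female","male",
       "female","female","male","female","female","female")
df <- data.frame(n, g, s, p, stringsAsFactors = FALSE)
mQTL <- subset(df, n == "mQTL")
null <- subset(df, n == "null")

# The length-2 vectors are recycled to length 11 (R warns about this),
# so row i of null is compared with mQTL row 1 or 2 by position only.
hits <- suppressWarnings(which((mQTL$g == null$g) & (mQTL$s == null$s)))
null[hits, ]
```

This is also why the male row with p = 8.33 appears: it happens to sit at an even position, where the recycled value is "male".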
+1

I don't understand. Why would 8.33 be used for the male row? – Crt

+0

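I made up some data, so you don't need to interpret the actual values. My actual data frame is much bigger than this. In fact, I have 4000 mQTL rows and need to sample from null (10000 rows). I want each sampled row to have the same features as its mQTL row, based on 'gender' and 'number' (the s column). But I want to randomly select the 4000 from null; they just need to have the same features (criteria)! – dizue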

+1

You may want to read http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 Reproducing your example is a bit of a pain as you have it here. – Frank

Answers

0
mQTL = subset(df,df$n=='mQTL') 
null = subset(df,df$n=='null') 

# Check if the combination of null$g and null$s matches with that of mQTL$g and mQTL$s 
null$match = paste(null$g, null$s) %in% paste(mQTL$g, mQTL$s) 

# Random sample of two of the matched rows 
null[sample(which(null$match), 2),] 

# > null[sample(which(null$match), 2),] 
#  n  g s p match 
# 13 null female 2 6.99 TRUE 
# 4 null male 2 8.33 TRUE 

To draw 5 times, run a for loop and store the draws in a list:

draws = list() 
for(ii in 1:5){ 
    draws[[ii]] = null[sample(which(null$match), 2),] 
} 

# > draws 
# [[1]] 
#  n  g s p match 
# 4 null male 2 8.33 TRUE 
# 13 null female 2 6.99 TRUE 
# 
# [[2]] 
#  n  g s p match 
# 11 null female 2 6.78 TRUE 
# 9 null female 2 9.12 TRUE 
# 
# [[3]] 
#  n  g s p match 
# 9 null female 2 9.12 TRUE 
# 8 null female 2 7.62 TRUE 
# 
# [[4]] 
#  n  g s p match 
# 13 null female 2 6.99 TRUE 
# 4 null male 2 8.33 TRUE 
# 
# [[5]] 
#  n  g s p match 
# 7 null male 2 3.78 TRUE 
# 8 null female 2 7.62 TRUE 
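
Note that sampling two rows from all matched rows does not guarantee one row per mQTL entry (draw [[2]] above contains two female rows). If one null row per mQTL row is wanted, the following is a minimal sketch rather than part of the original answer. `draw_matched` is a hypothetical helper name, and the sketch assumes that the same null row may be reused for different mQTL rows within one draw.

```r
# Rebuild the example data from the question
p <- c(3.14,3.56,7.45,8.33,5.44,3.12,3.78,7.62,9.12,4.34,6.78,8.65,6.99)
n <- c(rep("mQTL", 2), rep("null", 11))
s <- c(2,2,1,2,1,1,2,2,2,1,2,1,2)
g <- c("female","male","female","male","female","female","male",
       "female","female","male","female","female","female")
df <- data.frame(n, g, s, p, stringsAsFactors = FALSE)
mQTL <- subset(df, n == "mQTL")
null <- subset(df, n == "null")

# Hypothetical helper: for each row of `targets`, pick one random row of
# `pool` that has the same g and s values.
draw_matched <- function(targets, pool) {
  picks <- vapply(seq_len(nrow(targets)), function(i) {
    candidates <- which(pool$g == targets$g[i] & pool$s == targets$s[i])
    if (length(candidates) == 0) stop("no matching row for target ", i)
    # Index via sample.int so a single candidate is handled correctly
    candidates[sample.int(length(candidates), 1)]
  }, integer(1))
  pool[picks, ]
}

set.seed(1)  # only to make the example repeatable
draws <- lapply(1:5, function(k) draw_matched(mQTL, null))
```

Each element of `draws` then has as many rows as `mQTL`, in the same order, so row i of a draw pairs with row i of `mQTL`.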
+0

Thank you very much! This is exactly what I need! – dizue

+0

@dizue If you think this answers your question, please accept it so that others can see. – useR

+0

Thank you very much! – dizue

0

Try using the merge() function:

merge(mQTL, null, by.x = c("g","s"), by.y = c("g","s")) 

But you may want to rename the columns to make things clearer.
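
One way to handle the renaming, sketched here as an illustration rather than as the answerer's own code, is the `suffixes` argument of `merge()`, which labels the non-key columns by their source. The suffix strings `.mQTL` and `.null` and the result name `pairs` are arbitrary choices for this example.

```r
# Rebuild the example data from the question
p <- c(3.14,3.56,7.45,8.33,5.44,3.12,3.78,7.62,9.12,4.34,6.78,8.65,6.99)
n <- c(rep("mQTL", 2), rep("null", 11))
s <- c(2,2,1,2,1,1,2,2,2,1,2,1,2)
g <- c("female","male","female","male","female","female","male",
       "female","female","male","female","female","female")
df <- data.frame(n, g, s, p, stringsAsFactors = FALSE)
mQTL <- subset(df, n == "mQTL")
null <- subset(df, n == "null")

# Every mQTL row is paired with every null row that shares its g and s;
# the remaining columns get a suffix showing where they came from.
pairs <- merge(mQTL, null, by = c("g", "s"), suffixes = c(".mQTL", ".null"))
pairs
```

This lists all candidate pairs rather than a random selection, so a sampling step would still be needed on top of it.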

+0

Sorry, I updated my question. In fact, I need to do a permutation rather than a merge. – dizue
