2017-08-07

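I have a data frame organized into groups, and every group has the same number of observations. I have randomly assigned each group a value of 1 or 0. For all observations in a group assigned 1, I want to fill a variable ysp with a given proportion of random 1s and 0s. For groups assigned 0, I want the same variable ysp to be all 0s. How can I get a random assignment of 1s and 0s for a subset of my data, with the rest set to 0, in R?

This is the code I have so far: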

rm(list=ls(all=TRUE)) 

set.seed(1984) 
ngroup <- 50 # Number of groups 
obs <- 50  # Number of observations per group 
pgroup <- 0.5 # (1 - p) probability of groups with at least 1 non zero obs (only works if the answer is a round number) 
p <- 0.5 # Once chosen the number of groups I want to have with at least one non zero obs, I want p% of 1s in those groups. 

constantdata <- data.frame(id=1:ngroup) 

dummies <- c(0,1) 
dummies[sample(1:nrow(constantdata), nrow(constantdata), FALSE)] <- rep(dummies, c(pgroup*ngroup,(1-pgroup)*ngroup)) 
constantdata["probgr"] <- dummies 

fulldata <- constantdata[rep(1:ngroup, each=obs),] 

fulldata$ys <- rnorm(ngroup*obs) 

#This is how I try to do it 

if(fulldata$probgr=1){ 
fulldata$ysp[fulldata$ys > quantile(fulldata$ys, 1 - p)] <- 1 
fulldata$ysp[fulldata$ys <= quantile(fulldata$ys, 1 - p)] <- 0 
}else{ 
fulldata$ysp=0} 

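Of course, this does not work. I want 50% (pgroup%) of the groups, chosen at random, to have ysp equal to 0 everywhere, and the other 50% (1 - pgroup%) of the groups to have ysp filled with p% randomly allocated 1s and the rest 0s.

Answer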


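Where you wrote if(fulldata$probgr=1) you probably meant if(fulldata$probgr==1) (an equality test, not an assignment). Also, if is not a vectorized operation: it tests a single condition, not one condition per row. One way to get what you want is to set everything in ysp to 0 and then change the entries where probgr == 1 at random, like this: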

fulldata$ysp = 0 
fulldata$ysp[fulldata$probgr == 1] = sample(0:1, sum(fulldata$probgr == 1), replace=TRUE) 
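
For comparison, the quantile-based attempt from the question can probably be made to work by replacing if with the vectorized ifelse(). The following is only a sketch, not part of the original answer. It rebuilds a small version of the question's data so that it runs on its own, and it assumes the cutoff should be computed over the rows of the flagged groups only (the question took the quantile over all rows).

```r
set.seed(1984)
ngroup <- 50; obs <- 50; pgroup <- 0.5; p <- 0.5

# Rebuild the question's data: one probgr flag per group, repeated per observation
probgr <- sample(rep(c(0, 1), c(pgroup * ngroup, (1 - pgroup) * ngroup)))
fulldata <- data.frame(id = rep(1:ngroup, each = obs),
                       probgr = rep(probgr, each = obs))
fulldata$ys <- rnorm(ngroup * obs)

# Assumption: the threshold is taken over rows in flagged groups only
cutoff <- quantile(fulldata$ys[fulldata$probgr == 1], 1 - p)

# ifelse() evaluates the condition element-wise, unlike if()
fulldata$ysp <- ifelse(fulldata$probgr == 1 & fulldata$ys > cutoff, 1, 0)
```

Here the 1s are tied to the largest values of ys rather than drawn independently, which is what the code in the question appeared to be aiming for.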

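This is a very elegant solution. There is just one problem. I added the probabilities at the end of the sample() function (otherwise it defaults to .5): fulldata$ysp = 0; fulldata$ysp[fulldata$probgr == 1] = sample(0:1, sum(fulldata$probgr == 1), replace=TRUE, prob=c(1-p, p)). But it does not give exactly 50% 1s and 0s; the share changes on every run (because of the probability argument, I think). – Quixo1986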
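
The varying share described in the comment is expected, because sample() with prob draws each value independently. One possible way to get an exact share is to build a vector with a fixed number of 1s and 0s and shuffle it. This is a minimal sketch rather than something taken from the thread; n_flagged and n_ones are names introduced here for illustration, and the data is rebuilt from the question's parameters so the block runs on its own.

```r
set.seed(1984)
ngroup <- 50; obs <- 50; pgroup <- 0.5; p <- 0.5

# Rebuild the group flags from the question's parameters
probgr <- sample(rep(c(0, 1), c(pgroup * ngroup, (1 - pgroup) * ngroup)))
fulldata <- data.frame(id = rep(1:ngroup, each = obs),
                       probgr = rep(probgr, each = obs))

n_flagged <- sum(fulldata$probgr == 1)  # rows that may receive a 1 (hypothetical name)
n_ones <- round(p * n_flagged)          # exact number of 1s wanted (hypothetical name)

fulldata$ysp <- 0
# Shuffle a fixed count of 1s and 0s instead of drawing each value independently
fulldata$ysp[fulldata$probgr == 1] <- sample(rep(c(1, 0), c(n_ones, n_flagged - n_ones)))
```

Note that this fixes the share over all flagged rows taken together, not within each group. If the share has to hold inside every group, the same shuffle could be applied per id, for example with ave().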
