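R: computing the p-value for a random distribution

I want to obtain a p-value for two randomly distributed samples x and y, for example: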
> set.seed(0)
> x <- rnorm(1000, 3, 2)
> y <- rnorm(2000, 4, 3)
or:
> set.seed(0)
> x <- rexp(50, 10)
> y <- rexp(100, 11)
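Let T be my test statistic, T = mean(x) - mean(y), and let H0 be the hypothesis that the true difference in means is 0. The p-value is then defined as: p-value = P[T > T_observed | H0 holds].
I tried doing this: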
> z <- c(x,y) # if H0 holds then x and y are distributed with the same distribution
> f <- function(x) ecdf(z) # this will get the distribution of z (x and y)
Then, to compute the p-value, I tried this:
> T <- replicate(10000, mean(sample(z,1000,TRUE))-mean(sample(z,2000,TRUE))) # this is supposed to get the null distribution of mean(x) - mean(y)
> f(quantile(T,0.05)) # calculating the p-value for a significance of 5%
Clearly this does not seem to work. What am I missing?
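
For reference, here is a minimal sketch of how the definition p-value = P[T > T_observed | H0 holds] could be estimated by resampling. It reuses the first example data set and the same pooled resampling with replacement as the attempt above. The names t_obs, t_null, p_upper and p_two_sided are illustrative only, and the two-sided variant is an addition that is not part of the definition in the question. Sampling without replacement (a permutation test) is a common alternative that is not shown here.

```r
set.seed(0)
x <- rnorm(1000, 3, 2)
y <- rnorm(2000, 4, 3)

# Observed value of the test statistic T = mean(x) - mean(y)
t_obs <- mean(x) - mean(y)

# Under H0 both samples share one distribution, so pool them
z <- c(x, y)

# Approximate the null distribution of T by resampling from the pooled
# data, keeping the original sample sizes (assumption: bootstrap-style
# resampling with replacement, as in the attempt above)
t_null <- replicate(10000,
  mean(sample(z, length(x), replace = TRUE)) -
    mean(sample(z, length(y), replace = TRUE)))

# One-sided p-value as defined in the question: P[T > T_observed | H0]
p_upper <- mean(t_null > t_obs)

# Two-sided variant (illustrative): how often the resampled statistic is
# at least as far from 0 as the observed one
p_two_sided <- mean(abs(t_null) >= abs(t_obs))
```

The key difference from the attempt above is that the observed statistic is compared against the simulated null distribution of T. In the original code, f ignores its argument and returns the ecdf object of the pooled data z, so f(quantile(T,0.05)) yields a function rather than a probability. The sketch also avoids the name T, which masks R's shorthand for TRUE.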