2014-09-02 52 views
0

sapply with multiple arguments in R: I want to compute a weighted mean by factor with the code below

factor <- factor(cut(var1, quantile(var1, seq(0,1,0.1)))) 
var2_split = split(vat2, factor) 
weight_split = split(weight, factor) 
sapply(var2_split, weighted.mean, weight_split) 

I get the following error

Error in FUN(X[[1L]], ...) : 'x' and 'w' must have the same length 

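How should I format my vector and weights for sapply?

As an example

Suppose I have three columns x, y, z of a matrix M, where x is a set of target values, y is a set of weights, and z is a set of values by which I want to bucket weighted.mean(x,y). Specifically, I want weighted.mean(x,y) computed within the quartiles of z.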

# Code that doesn't work 

x <- c(1,2,3,4,5,6) 
y <- c(6,3,4,2,3,4) 
z <- c(1,1,2,3,3,4) 
m <- matrix(c(x,y,z), nrow=6, ncol=3) 
# bucket z by quartile. 
z.factor <- cut(z, quantile(z, seq(0,1,0.25)), include.lowest=TRUE) 
x.split = split(x, z.factor) 
y.split = split(y, z.factor) 
# want to bucket weighted.mean(x,y) on quartiles of z 
sapply(x.split, weighted.mean, y.split) 
+2

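sapply can only iterate over one vector/list at a time. If you want to iterate along var2_split and weight_split simultaneously, try `mapply` or `Map`. If you provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with your question, it will be easier to give a more specific answer. – MrFlick 2014-09-02 19:21:00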

+0

Does the example above work with mapply? – user196711 2014-09-03 18:50:10

+1

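The example above is not [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Provide some sample input data (i.e. 'var1', 'vat2', 'weight', etc.) so that the code can be run and tested on data that looks like your actual input. – MrFlick 2014-09-03 18:52:31

Answer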

0

With your specific sample, try

#first, note the include.lowest=TRUE to get all values 
z.factor <- factor(cut(z, quantile(z, seq(0,1,0.25)), include.lowest=TRUE)) 

#same 
x.split = split(x, z.factor) 
y.split = split(y, z.factor) 

# here we use mapply 
mapply(weighted.mean, x.split, y.split) 

This gives

  [1,1.25] (1.25,2.5]    (2.5,3]      (3,4] 
  1.333333   3.000000   4.600000   6.000000 

This looks correct for your sample input.
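As a possible complement, here is a minimal sketch of why the original `sapply` call fails and of two alternatives. `sapply` passes any extra arguments unchanged to every call, so each bucket of `x.split` was paired with the whole list `y.split` (length 4) as the weights, hence the length mismatch. The names `idx.split`, `res` and `res.map` are made up for this illustration; the data is the sample from the question.

```r
x <- c(1, 2, 3, 4, 5, 6)   # target values
y <- c(6, 3, 4, 2, 3, 4)   # weights
z <- c(1, 1, 2, 3, 3, 4)   # variable whose quartiles define the buckets

z.factor <- cut(z, quantile(z, seq(0, 1, 0.25)), include.lowest = TRUE)

# Alternative 1: split the row indices once, then subset x and y together
# inside a single function, so sapply only has to iterate over one list.
idx.split <- split(seq_along(x), z.factor)
res <- sapply(idx.split, function(i) weighted.mean(x[i], y[i]))
res

# Alternative 2: Map() walks both lists in parallel, like mapply(),
# but always returns a list; unlist() flattens it to a named vector.
res.map <- unlist(Map(weighted.mean, split(x, z.factor), split(y, z.factor)))
res.map
```

Both should agree with the `mapply` result above. The index-based version may be convenient when more than two columns have to stay aligned within each bucket.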

+0

Great, thanks. – user196711 2014-09-03 20:13:26