2012-01-03 124 views
1

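Fitting a distribution to data in R

I am using the fitdist function from the fitdistrplus package in R. I have the following data (which I read in using read.table):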

A <- structure(list(V1 = c(-0.00707717, -0.000947418, -0.00189753, 
-0.000474947, -0.00190205, -0.000476077, 0.00237812, 0.000949668, 
0.000474496, 0.00284226, -0.000473149, -0.000473373, 0, 0, 0.00283688, 
-0.0037843, -0.0047506, -0.00238379, -0.00286807, 0.000478583, 
0.000478354, -0.00143575, 0.00143575, 0.00238835, 0.0042847, 
0.00237248, -0.00142281, -0.00142484, 0, 0.00142484, 0.000948767, 
0.00378609, -0.000472478, 0.000472478, -0.0014181, 0, -0.000946522, 
-0.00284495, 0, 0.00331832, 0.00283554, 0.00141476, -0.00141476, 
-0.00188947, 0.00141743, -0.00236351, 0.00236351, 0.00235794, 
0.00235239, -0.000940292, -0.0014121, -0.00283019, 0.000472255, 
0.000472032, 0.000471809, -0.0014161, 0.0014161, -0.000943842, 
0.000472032, -0.000944287, -0.00094518, -0.00189304, -0.000473821, 
-0.000474046, 0.00331361, -0.000472701, -0.000946074, 0.00141878, 
-0.000945627, -0.00189394, -0.00189753, -0.0057143, -0.00143369, 
-0.00383326, 0.00143919, 0.000479272, -0.00191847, -0.000480192, 
0.000960154, 0.000479731, 0, 0.000479501, 0.000958313, -0.00383878, 
-0.00240674, 0.000963391, 0.000962464, -0.00192586, 0.000481812, 
-0.00241138, -0.00144963)), .Names = "V1", row.names = c(NA, 
-91L), class = "data.frame") 

I run the following command:

fitdist(A$V1,"norm",method="mge",gof="CvM") 

and it produces the following output:

Fitting of the distribution ' norm ' by maximum goodness-of-fit 
Parameters: 
    estimate 
1  NA 
2  NA 
Warning message: 
In pnorm(q, mean, sd, lower.tail, log.p) : NaNs produced 

Given the warning message above, I ran the following:

> mu=mean(A$V1) 
> sigma=sd(A$V1) 
> mu 
[1] -0.0003091273 
> sigma 
[1] 0.002051825 
> pnorm(A$V1,mu,sigma) 
[1] 0.0004859313 0.3778682282 0.2194235651 0.4677942525 0.2187728328 
[6] 0.4675752645 0.9048490462 0.7302272325 0.6487379052 0.9377179215 
[11] 0.4681427154 0.4680993016 0.5598779146 0.5598779146 0.9373956798 
[16] 0.0451612910 0.0152074342 0.1559769817 0.1061704134 0.6494763806 
[21] 0.6494350178 0.2914741494 0.8024493726 0.9056899734 0.9874187360 
[26] 0.9043830715 0.2936417791 0.2933...    0.5598779146 0.8009684336 
[31] 0.7300820807 0.9770270687 0.4682727654 0.6483730677 0.2944326177 
[36] 0.5598779146 0.3780342225 0.1082503682 0.5598779146 0.9614622560 
[41] 0.9373152170 0.7995942319 0.2949940199 0.2205866970 0.7999587855 
[46] 0.1583537921 0.9036385181 0.9031740418 0.9027096003 0.3791890228 
[51] 0.2954414771 0.1095934742 0.6483327428 0.6482924162 0.6482520879 
[56] 0.2947687275 0.7997772412 0.3785308577 0.6482924162 0.3784483801 
[61] 0.3782828856 0.2200710780 0.4680124750 0.4679688685 0.9612699580 
[66] 0.4682295443 0.3781172281 0.8001429585 0.3782000541 0.2199411992 
[71] 0.2194235651 0.0042152418 0.2918187280 0.0429384302 0.8029149383 
[76] 0.6496008197 0.2164182554 0.4667778828 0.7319136560 0.6496837100 
[81] 0.5598779146 0.6496421754 0.7316179594 0.0426934572 0.1533157552 
[86] 0.7324331764 0.7322844499 0.2153633562 0.6500594259 0.1527813896 
[91] 0.2891573876 

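So now I am confused as to why I get the NaN warning above. Does anyone have any suggestions as to what the cause might be and how to resolve it?

For the Cauchy distribution, I have tried the following: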

> fitdist(A$V1*10^9,"cauchy",method="mle") 
Error in fitdist(A$V1 * 10^9, "cauchy", method = "mle") : 
    the function mle failed to estimate the parameters, 
       with the error code 100 
In addition: Warning message: 
In dcauchy(x, location, scale, log) : NaNs produced 
> fitdist(A$V1*10^5,"cauchy",method="mle") 
Error in fitdist(A$V1 * 10^5, "cauchy", method = "mle") : 
    the function mle failed to estimate the parameters, 
       with the error code 100 
In addition: Warning message: 
In dcauchy(x, location, scale, log) : NaNs produced 
> fitdist(A$V1*10^5,"cauchy",method="mge",gof="CvM") 
Fitting of the distribution ' cauchy ' by maximum goodness-of-fit 
Parameters: 
    estimate 
1  NA 
2  NA 
Warning message: 
In pcauchy(q, location, scale, lower.tail, log.p) : NaNs produced 
> fitdist(A$V1*10^5,"cauchy",method="mge",gof="AD") 
Fitting of the distribution ' cauchy ' by maximum goodness-of-fit 
Parameters: 
    estimate 
1  NA 
2  NA 
Warning message: 
In pcauchy(q, location, scale, lower.tail, log.p) : NaNs produced 
> fitdist(A$V1*10^9,"cauchy",method="mge",gof="AD") 
Fitting of the distribution ' cauchy ' by maximum goodness-of-fit 
Parameters: 
    estimate 
1  NA 
2  NA 
Warning message: 
In pcauchy(q, location, scale, lower.tail, log.p) : NaNs produced 
> fitdist(A$V1+10^3,"cauchy",method="mle") 
Error in fitdist(A$V1 + 10^3, "cauchy", method = "mle") : 
    the function mle failed to estimate the parameters, 
       with the error code 100 
In addition: Warning message: 
In dcauchy(x, location, scale, log) : NaNs produced 

Any suggestions on resolving this would be appreciated... thanks!

+0

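If you are going to [cross-post](https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20120102/4f318a8a/attachment.pl), please have the courtesy to say so explicitly. Otherwise, the answers to your question may end up scattered across multiple sites. – 2012-01-03 03:36:22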

+0

No problem, I'm happy to do that. Whatever helps in getting to a solution :-) – itcplpl 2012-01-03 03:41:37

+0

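You are free to roll back my edit, but I did not change your question in any substantive way. I made it easier for others to reproduce your results. – 2012-01-03 03:52:15

Answers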

5

The answer is below.

library(fitdistrplus) 


A <- structure(list(V1 = c(-0.00707717, -0.000947418, -0.00189753, 
-0.000474947, -0.00190205, -0.000476077, 0.00237812, 0.000949668, 
0.000474496, 0.00284226, -0.000473149, -0.000473373, 0, 0, 0.00283688, 
-0.0037843, -0.0047506, -0.00238379, -0.00286807, 0.000478583, 
0.000478354, -0.00143575, 0.00143575, 0.00238835, 0.0042847, 
0.00237248, -0.00142281, -0.00142484, 0, 0.00142484, 0.000948767, 
0.00378609, -0.000472478, 0.000472478, -0.0014181, 0, -0.000946522, 
-0.00284495, 0, 0.00331832, 0.00283554, 0.00141476, -0.00141476, 
-0.00188947, 0.00141743, -0.00236351, 0.00236351, 0.00235794, 
0.00235239, -0.000940292, -0.0014121, -0.00283019, 0.000472255, 
0.000472032, 0.000471809, -0.0014161, 0.0014161, -0.000943842, 
0.000472032, -0.000944287, -0.00094518, -0.00189304, -0.000473821, 
-0.000474046, 0.00331361, -0.000472701, -0.000946074, 0.00141878, 
-0.000945627, -0.00189394, -0.00189753, -0.0057143, -0.00143369, 
-0.00383326, 0.00143919, 0.000479272, -0.00191847, -0.000480192, 
0.000960154, 0.000479731, 0, 0.000479501, 0.000958313, -0.00383878, 
-0.00240674, 0.000963391, 0.000962464, -0.00192586, 0.000481812, 
-0.00241138, -0.00144963)), .Names = "V1", row.names = c(NA, 
-91L), class = "data.frame") 

#your data are very small 
summary(A$V1) 

#fitdist does not converge with these parameters 
fitdist(A$V1,"norm",method="mge",gof="CvM") 

#arguments are correctly specified 
?fitdist 

#equivalent call of mgedist -> same problem 
mgedist(A$V1,"norm",gof="CvM") 

#with uniform distribution it works 
fitdist(A$V1,"unif",method="mge") 

#as well as with mme and mle 
fitdist(A$V1,"norm",method="mme") 
fitdist(A$V1,"norm",method="mle") 

#so the problem comes from the mean or the sd parameter of the normal distribution. 
#as fixing sd returns a result, sd is the problem 
mgedist(A$V1,"norm",gof="CvM", fix.arg=list(sd=sd(A$V1)), start=list(mean=0)) 

#fixing a lower bound for sd returns a result 
mgedist(A$V1,"norm",gof="CvM", lower=c(-1, .01)) 

#but the appropriate answer to your problem is to rescale your data. 
#it works perfectly. 
mgedist(1000*A$V1,"norm",gof="CvM", lower=c(-1, 1e-3)) 
#we don't even need to use lower bounds. 
mgedist(1000*A$V1,"norm",gof="CvM") 


#looking at the source code of mgedist, one can see, that the distance 
#of Cramer von Mises is defined as follows. 
fnobj <- function(par, fix.arg, obs, pdistnam) { 
       n <- length(obs) 
       s <- sort(obs) 
       theop <- do.call(pdistnam, c(list(q = s), as.list(par), 
        as.list(fix.arg))) 
       1/(12 * n) + sum((theop - (2 * seq_len(n) - 1)/(2 * 
        n))^2) 
      } 

#a NaN is produced with negative sd    
fnobj(c(1,1), NULL, A$V1, pnorm) 
fnobj(c(mean=1,sd=1), NULL, A$V1, pnorm) 
fnobj(c(mean=0,sd=0), NULL, A$V1, pnorm) 
fnobj(c(mean=0,sd=-1), NULL, A$V1, pnorm) 
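
The two ideas above (rescale the data, and keep sd away from non-positive values) can be combined in a small base-R sketch. It does not use fitdistrplus at all: it minimizes the same Cramer-von Mises distance directly with optim, and it optimizes over log(sd) so that sd can never become negative. The simulated vector x is only a stand-in for A$V1, and the names cvm_logsd, k and est are made up for illustration.

```r
# Simulated stand-in for A$V1 (assumption: similar size, mean and spread)
set.seed(42)
x <- rnorm(91, mean = -3e-4, sd = 2e-3)

# pnorm() returns NaN (with a warning) as soon as sd is negative
nan_check <- suppressWarnings(pnorm(0, mean = 0, sd = -1))

# Cramer-von Mises distance, parametrized by (mean, log(sd)):
# sd = exp(par[2]) is positive whatever value the optimizer tries
cvm_logsd <- function(par, obs) {
  n <- length(obs)
  s <- sort(obs)
  p <- pnorm(s, mean = par[1], sd = exp(par[2]))
  1 / (12 * n) + sum((p - (2 * seq_len(n) - 1) / (2 * n))^2)
}

k <- 1000                 # rescaling factor, as in the mgedist calls above
y <- k * x
fit <- optim(c(mean(y), log(sd(y))), cvm_logsd, obs = y)

# Transform the estimates back to the original scale of the data
est <- c(mean = fit$par[1] / k, sd = exp(fit$par[2]) / k)
print(est)
```

With this parametrization no bounds are needed, at the cost of having to back-transform the estimates by hand.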
+0

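Thanks Christophe. I tried to run cauchy on a slightly different data set and tried different combinations, as shown in my post, but had no luck. Could you let me know the fix? Thanks! I have updated my original post with that information. – itcplpl 2012-01-03 17:40:01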

4

It looks to me like a bug inside the function mgedist, which is called by fitdist. Look at these lines:

if (!cens) 
    opttryerror <- try(opt <- optim(par = vstart, fn = fnobj, 
     fix.arg = fix.arg, obs = data, pdistnam = pdistname, hessian = TRUE, 
     method = meth, lower = lower, upper = upper, ...), silent = TRUE) 
else 
    stop("Maximum goodness-of-fit estimation is not yet available for censored data.") 

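In fact, the error is raised because the method argument is passed twice: once as a named argument and a second time in .... The error is caught, and what you receive as output is just a "default" return.

Talk to the maintainers to get it fixed.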
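
The mechanism described here can be reproduced in isolation with base R. This is only a sketch of the general pattern, not the package source: fit_wrapper is a hypothetical function that sets method explicitly, forwards ... to optim, and wraps the call in try(..., silent = TRUE), so a duplicated argument fails quietly and the caller just sees NA.

```r
# Hypothetical wrapper (NOT the fitdistrplus code): 'method' is set
# explicitly and may also arrive a second time through '...'
fit_wrapper <- function(start, ...) {
  res <- try(
    optim(par = start, fn = function(p) sum((p - 1)^2),
          method = "BFGS", ...),
    silent = TRUE
  )
  if (inherits(res, "try-error")) {
    # The error is swallowed; only a default value is returned
    return(list(par = NA_real_,
                message = conditionMessage(attr(res, "condition"))))
  }
  list(par = res$par, message = NULL)
}

ok  <- fit_wrapper(0)                          # optimizer runs normally
bad <- fit_wrapper(0, method = "Nelder-Mead")  # 'method' supplied twice

print(ok$par)
print(bad$par)      # NA: the failure is hidden from the caller
print(bad$message)  # the error that try() caught
```

Dropping silent = TRUE, or printing the caught condition as in the last line, is usually the quickest way to see what actually went wrong inside such a wrapper.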

+0

That's interesting.... so it is an issue with mge, since mme and mle work fine – itcplpl 2012-01-03 04:03:48

+0

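Any ideas on the Cauchy problem, since I tried: 'fitdist(A$V1,"cauchy") Error in fitdist(A$V1, "cauchy") : the function mle failed to estimate the parameters, with the error code 100 In addition: Warning message: In dcauchy(x, location, scale, log) : NaNs produced ' – itcplpl 2012-01-03 05:09:04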
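
The thread does not settle the Cauchy question. One possible reading, by analogy with the negative sd case in the first answer, is that the optimizer tries non-positive scale values, where dcauchy returns NaN; that is an assumption, not something confirmed here. Under that assumption, a base-R sketch that avoids the problem is to rescale the data and optimize over log(scale). The simulated x stands in for A$V1, and nll, start and est are illustrative names.

```r
# Simulated stand-in for A$V1 (assumption: similar size and spread)
set.seed(1)
x <- rnorm(91, mean = -3e-4, sd = 2e-3)
k <- 1000
y <- k * x                      # rescale so the parameters are of order 1

# Negative log-likelihood of the Cauchy distribution in (location, log(scale))
nll <- function(par, obs) {
  -sum(dcauchy(obs, location = par[1], scale = exp(par[2]), log = TRUE))
}

# Robust starting values: the median and half the interquartile range
start <- c(median(y), log(IQR(y) / 2))
fit <- optim(start, nll, obs = y)

# Back-transform to the original scale of the data
est <- c(location = fit$par[1] / k, scale = exp(fit$par[2]) / k)
print(est)
```

Whether fitdist itself succeeds with explicit start values on rescaled data would need to be checked against the installed package version.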