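The caret package provides a function called trainControl that lets us run various kinds of cross-validation. To perform 10-fold cross-validation (here repeated 10 times), I use trainControl as follows: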
# 10-fold cross-validation, repeated 10 times
fitControl <- trainControl(method = "repeatedcv", number = 10, repeats = 10)
fitJ48_10_fold <- train(x = x, y = y, method = "J48", trControl = fitControl)
And to fit on the training set without any resampling, I use:
# No resampling: fit a single model on the full training set
fitControl <- trainControl(method = "none")
fitJ48train <- train(x = x, y = y, method = "J48", trControl = fitControl)
However, the confusion matrices for the two models are identical, for both the 10-fold model and the training-set model.
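# Predictions from the model trained with 10-fold cross-validation
Activity <- predict(fitJ48_10_fold, newdata = Train)
confusionMatrix(Activity, Train$Activity)
# Predictions from the model trained without resampling
Activity <- predict(fitJ48train, newdata = Train)
confusionMatrix(Activity, Train$Activity)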
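I ran the same analysis in the WEKA classifier GUI, and there the 10-fold cross-validation performance of J48 is indeed lower than the training-set performance. Is trainControl in caret not working, or am I passing the arguments incorrectly?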
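For reference, here is a minimal sketch of where caret keeps the cross-validated estimates, as opposed to the training-set (resubstitution) estimate. As far as I understand, predict() on a train object uses the final model, which is refit on all of the training rows, so predicting those same rows gives a training-set confusion matrix whichever resampling method was chosen. The built-in iris data and the names cvControl, fitCV and resub are stand-ins assumed for illustration; they are not the data or objects from the question.

```r
library(caret)  # method = "J48" also needs the RWeka package (and Java)

set.seed(1)
# iris is only a stand-in for the real data (assumption)
x <- iris[, 1:4]
y <- iris$Species

cvControl <- trainControl(method = "repeatedcv", number = 10, repeats = 10)
fitCV <- train(x = x, y = y, method = "J48", trControl = cvControl)

# Cross-validated estimates, computed on the held-out folds
print(fitCV$results)           # one row per tuning-parameter combination
print(head(fitCV$resample))    # per-fold Accuracy and Kappa for the selected model
print(confusionMatrix(fitCV))  # resampled confusion matrix (cell percentages)

# Resubstitution estimate: predict the same rows the final model was fit on
resub <- confusionMatrix(predict(fitCV, newdata = x), y)
print(resub$overall["Accuracy"])
```

If this reading is right, the figures comparable to WEKA's 10-fold output would be the ones in fitCV$results and confusionMatrix(fitCV), not the confusion matrix built from predict() on the training data.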
Could you provide some reproducible data? – cdeterman 2015-02-11 14:17:54
Yes, thank you for the prompt reply, and thanks to the R community. The data can be accessed at the following link: https://github.com/Rnewbie/LikitMorganFP/blob/master/cdetermanrequest.csv – 2015-02-11 16:13:00