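prSummary in the R caret package for imbalanced data
I have an imbalanced dataset, and I want to do stratified cross-validation with precision-recall AUC as my evaluation metric.
I am using prSummary from the R package caret with stratified indices, and I get an error when the performance is computed.
Below is a reproducible example. I found that only ten samples are used when the P-R AUC is computed and, because of the imbalance, they contain only one class, so the P-R AUC cannot be computed. (I know that only 10 samples are used because I modified prSummary to force it to print the data it receives.)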
library(randomForest)
library(mlbench)
library(caret)
# Load Dataset
data(Sonar)
dataset <- Sonar
x <- dataset[,1:60]
y <- dataset[,61]
# make the data highly imbalanced: keep only the first 3 rows as class "R"
y[4:length(y)] <- "M"
y <- as.factor(y)
dataset$Class <- y
# create index and indexOut
seed <- 1
set.seed(seed)
folds <- 2
idxAll <- 1:nrow(x)
cvIndex <- createFolds(factor(y), folds, returnTrain = TRUE)
# held-out rows are the complement of each training index
cvIndexOut <- lapply(1:length(cvIndex), function(i){
  idxAll[-cvIndex[[i]]]
})
names(cvIndexOut) <- names(cvIndex)
# set index, indexOut and the summary function (prSummary)
control <- trainControl(index = cvIndex, indexOut = cvIndexOut,
                        method = "cv", summaryFunction = prSummary, classProbs = TRUE)
metric <- "AUC"
set.seed(seed)
mtry <- sqrt(ncol(x))
tunegrid <- expand.grid(.mtry=mtry)
rf_default <- train(Class~., data=dataset, method="rf", metric=metric, tuneGrid=tunegrid, trControl=control)
Here is the error message:
Error in ROCR::prediction(y_pred, y_true) :
Number of classes is not equal to 2.
ROCR currently supports only evaluation of binary classification tasks.
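As a sanity check on the folds themselves, the class counts in each held-out fold can be tabulated with a minimal sketch like the one below. It repeats the relabelling and seed from the example above and only inspects the indices. It does not call train(). The name foldCounts is introduced here purely for illustration.

```r
library(mlbench)
library(caret)

# Rebuild the imbalanced outcome exactly as in the example above
data(Sonar)
y <- Sonar$Class
y[4:length(y)] <- "M"
y <- factor(y)

# Stratified folds; held-out rows are the complement of each training index
set.seed(1)
cvIndex <- createFolds(y, k = 2, returnTrain = TRUE)
cvIndexOut <- lapply(cvIndex, function(idx) setdiff(seq_along(y), idx))

# Class counts per held-out fold (rows = class levels, columns = folds).
# A zero anywhere would mean that fold contains a single class.
foldCounts <- sapply(cvIndexOut, function(idx) table(y[idx]))
print(foldCounts)
```

If every held-out fold shows both classes here, the single-class subset of ten samples is presumably not coming from cvIndexOut, which would point to some other step inside train() as the place where prSummary receives it.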