Setting the threshold

I am trying to set a threshold for the confusion matrix that I compute after fitting a decision tree model.

library(rpart)  # rpart()
library(caret)  # confusionMatrix()

# tree model
tree <- rpart(LoanStatus_B ~ ., data = train, method = "class")
# confusion matrix
pdata <- predict(tree, newdata = test, type = "class")
confusionMatrix(data = pdata, reference = test$LoanStatus_B, positive = "1")

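How do I set the threshold for my confusion matrix? Say I want to treat a probability above 0.2 as the positive class instead of using the default. The outcome is binary.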

Source

2017-09-04 Pumpkin C

Take a look at '?predict.rpart' to see what you can specify for 'type'. – coffeinjunky

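There are a few things to note here. First, make sure you get class probabilities when you predict. With type = "class" you only get discrete class labels, so there is nothing left to apply a threshold to. What you want is type = "p", as below.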

library(rpart) 
data(iris) 

iris$Y <- ifelse(iris$Species=="setosa",1,0) 

# tree model 
tree <- rpart(Y ~Sepal.Width,data=iris, method='class') 

# predictions 
pdata <- as.data.frame(predict(tree, newdata = iris, type = "p")) 
head(pdata) 

# confusion matrix 
table(iris$Y, pdata$`1` > .5)

Next, note that .5 here is just an arbitrary value. You can change it to whatever you want.
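As a minimal sketch of the case raised in the question, the same table can be built with a cutoff of 0.2 instead of .5. It reuses the iris example from this answer; the names `cutoff`, `pred_02` and `tab_02` are illustrative and not part of the original.

```r
library(rpart)
data(iris)

iris$Y <- ifelse(iris$Species == "setosa", 1, 0)
tree <- rpart(Y ~ Sepal.Width, data = iris, method = "class")

# class probabilities: one column per class, named "0" and "1"
pdata <- as.data.frame(predict(tree, newdata = iris, type = "prob"))

# hypothetical cutoff, taken from the question
cutoff <- 0.2

# fixing the levels keeps both columns in the table even if one class is never predicted
pred_02 <- factor(ifelse(pdata$`1` > cutoff, 1, 0), levels = c(0, 1))

# rows: actual class, columns: predicted class at the chosen cutoff
tab_02 <- table(actual = iris$Y, predicted = pred_02)
print(tab_02)
```

Lowering the cutoff can only move observations from predicted 0 to predicted 1, so sensitivity cannot go down and specificity cannot go up compared with .5.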

I don't see a reason to use the confusionMatrix function, since table() creates the confusion matrix just as easily and makes it simple to change the cutoff.

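That said, if you do want to use the confusionMatrix function for your confusion matrix, first create a discrete class prediction based on your custom cutoff, like this: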

pdata$my_custom_predicted_class <- ifelse(pdata$`1` > .5, 1, 0)

Here, again, .5 is the cutoff you chose, and it can be anything you want.

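# caret expects factors with the same levels for data and reference
caret::confusionMatrix(data = factor(pdata$my_custom_predicted_class, levels = c(0, 1)),
        reference = factor(iris$Y, levels = c(0, 1)), positive = "1")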

Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 94 19
         1  6 31

               Accuracy : 0.8333
                 95% CI : (0.7639, 0.8891)
    No Information Rate : 0.6667
    P-Value [Acc > NIR] : 3.661e-06

                  Kappa : 0.5989
 Mcnemar's Test P-Value : 0.0164

            Sensitivity : 0.6200
            Specificity : 0.9400
         Pos Pred Value : 0.8378
         Neg Pred Value : 0.8319
             Prevalence : 0.3333
         Detection Rate : 0.2067
   Detection Prevalence : 0.2467
      Balanced Accuracy : 0.7800

       'Positive' Class : 1
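To see how the cutoff trades sensitivity against specificity, one option is to compute both rates at several cutoffs in base R. This is only a sketch on the same iris example; the helper `rates_at` and the cutoff values are assumptions made for illustration, not something from the original answer.

```r
library(rpart)
data(iris)

iris$Y <- ifelse(iris$Species == "setosa", 1, 0)
tree <- rpart(Y ~ Sepal.Width, data = iris, method = "class")

# predicted probability of class "1" for every row
prob_1 <- predict(tree, newdata = iris, type = "prob")[, "1"]

# hypothetical helper: sensitivity and specificity at a single cutoff
rates_at <- function(prob, actual, cutoff) {
  pred <- as.integer(prob > cutoff)
  c(cutoff      = cutoff,
    sensitivity = sum(pred == 1 & actual == 1) / sum(actual == 1),
    specificity = sum(pred == 0 & actual == 0) / sum(actual == 0))
}

# illustrative cutoffs; one row of the result per cutoff
cutoffs <- c(0.2, 0.5, 0.8)
rates <- t(sapply(cutoffs, function(k) rates_at(prob_1, iris$Y, k)))
print(rates)
```

Because a tree assigns the same probability to every observation in a leaf, the rates only change when the cutoff crosses one of those leaf probabilities.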

Source

2017-09-04 19:31:14

Got it. Thank you very much! A quick question: what is type = "p" doing?

@PumpkinC You're welcome. "p" stands for probability. It asks for the predicted probability of each class.
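A short sketch of what the comment describes, again on the iris example: predict.rpart matches the 'type' argument partially, so "p" is shorthand for "prob". The variable names `p_short`, `p_full` and `cls` are illustrative.

```r
library(rpart)
data(iris)

iris$Y <- ifelse(iris$Species == "setosa", 1, 0)
tree <- rpart(Y ~ Sepal.Width, data = iris, method = "class")

p_short <- predict(tree, newdata = iris, type = "p")      # abbreviated form
p_full  <- predict(tree, newdata = iris, type = "prob")   # spelled out
cls     <- predict(tree, newdata = iris, type = "class")  # discrete labels only

# both calls return the same matrix, one column per class
print(identical(p_short, p_full))
print(head(p_full))

# each row holds the class probabilities for one observation
print(range(rowSums(p_full)))

# type = "class" returns a factor of labels, with no probabilities attached
print(levels(cls))
```

Writing type = "prob" in full is a little easier to read and avoids relying on partial matching.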
