2016-11-29

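How to determine the epochs hyperparameter from grid search results

I ran a grid search with epochs as one of the hyperparameters. Now that I have selected the best model, how can I determine which epochs value was chosen for that particular model?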

Here is the summary of the model:

Model Details:
==============

H2OBinomialModel: deeplearning 
Model ID: dl_grid_model_19 
Status of Neuron Layers: predicting Churn, 2-class classification, bernoulli distribution, CrossEntropy loss, 4,226 weights/biases, 44.1 KB, 47,520 training samples, mini-batch size 1 
  layer units             type dropout       l1       l2 mean_rate rate_rms momentum mean_weight weight_rms
1     1    30            Input  0.00 %
2     2    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.011006   0.210611
3     3    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.035854   0.191687
4     4    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.029072   0.185352
5     5    32 RectifierDropout 20.00 % 0.000010 0.000010  0.009995 0.000000 0.501901   -0.057359   0.186863
6     6     2          Softmax         0.000010 0.000010  0.009995 0.000000 0.501901    0.122655   0.406789
  mean_bias bias_rms
1
2  0.401924 0.136989
3  0.938406 0.041128
4  0.950918 0.043826
5  0.915588 0.060796
6  0.019925 0.175195


H2OBinomialMetrics: deeplearning 
** Reported on training data. ** 
** Metrics reported on full training frame ** 

MSE: 0.1946901 
RMSE: 0.441237 
LogLoss: 0.5731371 
Mean Per-Class Error: 0.194215 
AUC: 0.8767996 
Gini: 0.7535992 

Confusion Matrix for F1-optimal threshold: 
         No  Yes    Error       Rate
No     1755  614 0.259181  =614/2369
Yes     308 2075 0.129249  =308/2383
Totals 2063 2689 0.194024  =922/4752

Maximum Metrics: Maximum metrics at their respective thresholds 
         metric threshold value idx 
1      max f1 0.216316 0.818218 266 
2      max f2 0.058723 0.889206 348 
3     max f0point5 0.306487 0.801744 216 
4     max accuracy 0.217122 0.805976 265 
5    max precision 0.730797 1.000000 0 
6     max recall 0.006754 1.000000 398 
7    max specificity 0.730797 1.000000 0 
8    max absolute_mcc 0.216316 0.616944 266 
9 max min_per_class_accuracy 0.257957 0.795636 242 
10 max mean_per_class_accuracy 0.217122 0.805792 265 

Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)` 
H2OBinomialMetrics: deeplearning 
** Reported on validation data. ** 
** Metrics reported on full validation frame ** 

MSE: 0.1418929 
RMSE: 0.3766867 
LogLoss: 0.4374728 
Mean Per-Class Error: 0.2603761 
AUC: 0.8306744 
Gini: 0.6613489 

Confusion Matrix for F1-optimal threshold: 
         No Yes    Error       Rate
No     1075 201 0.157524  =201/1276
Yes     162 284 0.363229   =162/446
Totals 1237 485 0.210801  =363/1722

Maximum Metrics: Maximum metrics at their respective thresholds 
         metric threshold value idx 
1      max f1 0.323830 0.610097 183 
2      max f2 0.087110 0.740000 319 
3     max f0point5 0.514027 0.608666 94 
4     max accuracy 0.514027 0.800232 94 
5    max precision 0.668538 0.875000 21 
6     max recall 0.011443 1.000000 389 
7    max specificity 0.717464 0.999216 0 
8    max absolute_mcc 0.323830 0.466764 183 
9 max min_per_class_accuracy 0.229876 0.746082 238 
10 max mean_per_class_accuracy 0.173814 0.753367 273 

Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)` 

Answer

To find out how many epochs a model was trained for, the best approach is to look at its score history. For example, for a model m:

h2o.scoreHistory(m) 

(Or, for a graphical version, plot it: plot(m).)

That may be more information than you need, so cut it down to show only the epochs column:

h2o.scoreHistory(m)[,c("epochs")] 

(I just noticed that h2o.scoreHistory(m)$epochs also works.)

To show the number of epochs for the final model that was returned, take the last entry:

tail(h2o.scoreHistory(m)[,c("epochs")], 1) 

By the way, if you simply print the grid object, you should see a column for epochs, since it is one of your hyperparameters.
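As a rough sketch of reading this off the grid programmatically: the helper below fetches a grid sorted by a metric, takes the first model, and reports both the epochs value the grid requested and the epochs recorded in the last row of the score history. The function name best_model_epochs, the default metric "auc", and the grid ID in the usage line are assumptions for illustration, not taken from the question.

```r
# Hypothetical helper: report the epochs setting of the top model in a grid.
# Requires the h2o package and a running H2O cluster when it is called.
best_model_epochs <- function(grid_id, metric = "auc", decreasing = TRUE) {
  # Fetch the grid with its models ordered by the chosen metric.
  grid <- h2o::h2o.getGrid(grid_id, sort_by = metric, decreasing = decreasing)
  # Under that ordering, the first model ID is the best model.
  best <- h2o::h2o.getModel(grid@model_ids[[1]])
  list(
    model_id = best@model_id,
    # The epochs value this model was asked to train for.
    requested_epochs = best@allparameters$epochs,
    # The epochs recorded at the final scoring event.
    trained_epochs = tail(h2o::h2o.scoreHistory(best)$epochs, 1)
  )
}
```

It might be called as best_model_epochs("dl_grid"), where "dl_grid" stands in for whatever grid_id was used when the grid was built.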

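To answer a question you didn't ask: take a look at early stopping. It saves you from having to guess in advance how many epochs you will need, and so it also removes one hyperparameter from your grid search.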
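A minimal sketch of what early stopping could look like for a model like the one in the question. The wrapper name train_with_early_stopping and all of the numeric values (the epochs ceiling, the stopping settings) are illustrative assumptions; the hidden layout just mirrors the four 32-unit layers shown in the model summary.

```r
# Hypothetical wrapper: train a deep learning model with early stopping.
# Requires the h2o package and a running H2O cluster when it is called.
train_with_early_stopping <- function(x, y, train, valid) {
  h2o::h2o.deeplearning(
    x = x, y = y,
    training_frame = train,
    validation_frame = valid,      # early stopping is judged on this frame
    hidden = c(32, 32, 32, 32),
    epochs = 1000,                 # generous upper bound, not a tuned value
    stopping_metric = "logloss",
    stopping_rounds = 5,           # number of scoring rounds to look back over
    stopping_tolerance = 1e-3      # minimum relative improvement to keep going
  )
}
```

With this in place, epochs acts only as a ceiling, and the score history of the returned model shows where training actually stopped.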

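You could also simply build one model with the highest epochs value you are considering, then look in its score history at each of the other epochs values you are interested in to get the scores at those points.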
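One way to do that lookup, sketched in plain R: since the score history only has rows at scoring events, the helper below picks, for each epochs value of interest, the row whose epochs count is closest. The function name scores_near_epochs is made up for this example, and it assumes the history has a numeric epochs column, as the output of h2o.scoreHistory does.

```r
# history: a data frame such as the result of h2o.scoreHistory(m),
#          with a numeric `epochs` column.
# targets: the epochs values you want scores for.
scores_near_epochs <- function(history, targets) {
  # Drop any extra classes so that ordinary data frame indexing applies.
  history <- as.data.frame(history)
  # For each target, find the scoring event with the closest epochs count.
  idx <- vapply(targets,
                function(e) which.min(abs(history$epochs - e)),
                integer(1))
  history[idx, , drop = FALSE]
}
```

For example, scores_near_epochs(h2o.scoreHistory(m), c(10, 50, 100)) would return the three scoring rows nearest to 10, 50 and 100 epochs. Check the epochs column of the result to see how close each match really is.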
