How to get training accuracy via cross-validation in SVMlight

I want to run cross-validation on my training set using SVMlight. It seems the option for this is -x 1 (although I'm not sure how many folds it implements...). The output is:

XiAlpha-estimate of the error: error<=31.76% (rho=1.00,depth=0) 
XiAlpha-estimate of the recall: recall=>68.24% (rho=1.00,depth=0) 
XiAlpha-estimate of the precision: precision=>69.02% (rho=1.00,depth=0) 
Number of kernel evaluations: 56733 
Computing leave-one-out **lots of gibberish here** 
Retrain on full problem..............done. 
Leave-one-out estimate of the error: error=12.46% 
Leave-one-out estimate of the recall: recall=86.39% 
Leave-one-out estimate of the precision: precision=88.82% 
Actual leave-one-outs computed: 412 (rho=1.00) 
Runtime for leave-one-out in cpu-seconds: 0.84

How can I get the accuracy? From the estimate of the error?

Thanks!

Source

2014-03-24 Cheshie

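These are contradictory concepts. Training error is the error on the training set, whereas cross-validation is used to approximate the validation error (the error on data that was not used for training).

Your output suggests that you are using N folds (where N is the size of the training set), which gives so-called "leave-one-out" validation (just one test point per fold!), and this overestimates the quality of your model. You should try 10 folds. Your accuracy is simply 1 - error.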
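As a quick check of the "1 - error" rule, here is a minimal Python sketch that pulls the leave-one-out error out of the output quoted in the question and converts it to accuracy. The function name `loo_accuracy` is hypothetical, and the regular expression assumes the exact line format shown in the question; other SVMlight builds may print it differently.

```python
import re


def loo_accuracy(svm_learn_output: str) -> float:
    """Return accuracy in percent, computed as 100 - leave-one-out error.

    Assumes the output contains a line of the form
    'Leave-one-out estimate of the error: error=12.46%'.
    """
    match = re.search(
        r"Leave-one-out estimate of the error: error=([0-9.]+)%",
        svm_learn_output,
    )
    if match is None:
        raise ValueError("no leave-one-out error line found")
    return 100.0 - float(match.group(1))


# Lines copied from the output in the question.
sample = """\
Leave-one-out estimate of the error: error=12.46%
Leave-one-out estimate of the recall: recall=86.39%
Leave-one-out estimate of the precision: precision=88.82%
"""

print(f"{loo_accuracy(sample):.2f}%")  # prints 87.54%
```

Note that this is a leave-one-out accuracy estimate, not the accuracy on the training set itself.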

Source

2014-03-24 20:23:45 lejlot

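Thanks @lejlot, but I'm afraid I don't understand you. 1. Cross-validation isn't used on the training data? (Is [this](http://en.wikipedia.org/wiki/Cross-validation_(statistics)#K-fold_cross-validation) incorrect?) 2. I didn't understand the second part of your answer - I don't know how to decide how many folds to use, why it overestimates the model's quality, or what 1 - error is... I'm really sorry... I'd really appreciate it if you could explain a bit more. Thanks a lot! – Cheshie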

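As the name suggests, training data is the data used for **training**. Cross-validation does split your data into training and test parts, but the test part is never seen during the training phase. You wrote "training accuracy" - that is the part that is wrong, not the sentence about using the data for CV. Training accuracy is **not** measured through CV. "1 - error" means "subtract the error value from 1 and you get the accuracy". Finally, leave-one-out "overestimates" because it returns a higher accuracy than it should. A "reasonable" number of folds is 10 (commonly used in such cases). – lejlot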

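Ah... now I understand @lejlot, thanks. But, with all due respect, I thought cross-validation is used during the _training_ phase for testing purposes, isn't it? As for using 10 folds - I agree, but the leave-one-out estimate is (I think) the only form of cross-validation that svmlight allows. – Cheshie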
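If 10-fold cross-validation is wanted anyway, the folds could be built outside SVMlight. The following is a minimal Python sketch of that idea under stated assumptions: `k_fold_indices` and `accuracy_from_mistakes` are hypothetical helper names, and the steps of writing each fold to a file, training on it, and classifying the held-out part are deliberately left out.

```python
import random


def k_fold_indices(n_examples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Every example lands in exactly one test fold, so it is never part of
    the training set of the model that is evaluated on it.
    """
    if not 2 <= k <= n_examples:
        raise ValueError("k must be between 2 and n_examples")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k nearly equal folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def accuracy_from_mistakes(mistakes_per_fold, n_examples):
    """Accuracy = 1 - error, where error pools the mistakes of all folds."""
    return 1.0 - sum(mistakes_per_fold) / n_examples


# Hypothetical usage: 25 examples split into 10 folds.
for train_idx, test_idx in k_fold_indices(25, k=10):
    # Train on train_idx and count the mistakes on test_idx here.
    assert not set(train_idx) & set(test_idx)
```

Setting `k` equal to the number of examples makes every test fold a single point, which is the leave-one-out case discussed in the answer.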
