NLP: classification giving wrong results. How can I find out that an NLP classification result is wrong?

I have started learning natural language processing and am already stumbling.

Here is my code:

/// importing package 
var natural = require('natural'); 
var classifier = new natural.BayesClassifier(); 



/// training documents
classifier.addDocument("h", "greetings"); 
classifier.addDocument("hi", "greetings"); 
classifier.addDocument("hello", "greetings"); 
classifier.addDocument("data not working", "internet_problem"); 
classifier.addDocument("browser not working", "internet_problem"); 
classifier.addDocument("google not working", "internet_problem"); 
classifier.addDocument("facebook not working", "internet_problem"); 
classifier.addDocument("internet not working", "internet_problem"); 
classifier.addDocument("websites not opening", "internet_problem"); 
classifier.addDocument("apps not working", "internet_problem"); 
classifier.addDocument("call drops", "voice_problem"); 
classifier.addDocument("voice not clear", "voice_problem"); 
classifier.addDocument("call not connecting", "voice_problem"); 
classifier.addDocument("calls not going through", "voice_problem"); 
classifier.addDocument("disturbance", "voice_problem"); 
classifier.addDocument("bye", "close"); 
classifier.addDocument("thank you", "feedback_positive"); 
classifier.addDocument("thanks", "voice_problem"); 
classifier.addDocument("shit", "feedback_negeive"); 
classifier.addDocument("shit", "feedback_negeive"); 
classifier.addDocument("useless", "feedback_negetive"); 
classifier.addDocument("siebel testing", "siebel_testing");


classifier.train(); 


/// running classification 
console.log('result for hi'); 
console.log(classifier.classify('hi')); 
console.log('result for hii'); 
console.log(classifier.classify('hii')); 
console.log('result for h'); 
console.log(classifier.classify('h'));

I am using Node.js with the NaturalNode library (the Natural Node GitHub project).

Problem

I trained the classifier on my documents and a few scenarios. This is the output of my application:
result for hi: 
greetings 


result for hii: 
internet_problem 

result for h: 
internet_problem 

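As you can see, the result for hi is correct, but if I misspell hi as hii or ih, the classifier gives a wrong result. I cannot understand how the classification works, how I should train the classifier, or whether there is a way to find out that a classification result is wrong so that I can ask the user for input again.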

Any help or explanation is highly appreciated. Thanks in advance.

Please consider me a noob and forgive any mistakes.

Asked 2016-12-28 by Vikas Bansal

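Neither hii nor ih has ever been seen by your classifier, so unless natural.BayesClassifier does some preprocessing of the input, it has no evidence to go on. It therefore classifies them using the prior probability derived from the frequency of each class label: internet_problem is the most common label among your 22 training examples.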
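To make the prior concrete, here is a minimal sketch that tallies the label frequencies of the question's training set. It does not use natural at all, and it is not the library's implementation: the helper name computePriors is made up for illustration, and the label list simply mirrors the 22 addDocument calls with their spelling exactly as posted.

```javascript
// Labels of the 22 training documents, in the order they appear in the
// question (spelling kept exactly as posted).
const trainingLabels = [
  ...Array(3).fill("greetings"),
  ...Array(7).fill("internet_problem"),
  ...Array(5).fill("voice_problem"),
  "close",
  "feedback_positive",
  "voice_problem", // the "thanks" document
  "feedback_negeive",
  "feedback_negeive",
  "feedback_negetive",
  "siebel_testing",
];

// Hypothetical helper: relative frequency of each label, i.e. the class prior.
function computePriors(labels) {
  const counts = new Map();
  for (const label of labels) {
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  const priors = new Map();
  for (const [label, count] of counts) {
    priors.set(label, count / labels.length);
  }
  return priors;
}

const priors = computePriors(trainingLabels);

// The label with the highest prior is what a Bayes classifier tends to
// fall back to when the input contains no word it has seen before.
const [fallbackLabel, fallbackPrior] = [...priors.entries()].sort(
  (a, b) => b[1] - a[1]
)[0];

console.log(fallbackLabel, fallbackPrior.toFixed(3)); // internet_problem 0.318
```

With 7 of the 22 documents labelled internet_problem, that class has the largest prior (7/22), which matches the output reported for hii and h.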

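Edit 29/12/2016: As discussed in the comments, you can handle "bad" classifications by prompting the user to re-enter their input whenever the classification confidence is below a given minimum threshold: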

const MIN_CONFIDENCE = 0.2; // Tune this 

var classLabel = null; 
do { 
    var userInput = getUserInput(); // Get user input somehow 
    var classifications = classifier.getClassifications(userInput); 
    var bestClassification = classifications[0]; 
    if (bestClassification["value"] < MIN_CONFIDENCE) { 
        // Re-prompt user in the next iteration 
    } else { 
        classLabel = bestClassification["label"]; 
    } 
} while (classLabel == null); 
// Do something with the label
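One caveat with a fixed threshold: the scores returned by getClassifications are not necessarily normalised to sum to 1 (treat this as an assumption and check it against your version of natural), which makes a cut-off such as 0.2 hard to tune. A possible workaround is to compare the best score's share of the total instead. The sketch below is only an illustration: pickLabel is a hypothetical helper, it works on any array of { label, value } objects, and the sample scores are made up rather than real classifier output.

```javascript
// Hypothetical helper: returns the best label, or null if the classifier
// is not confident enough and the user should be asked again.
// `classifications` is assumed to be an array of { label, value } objects,
// the shape returned by classifier.getClassifications(...).
function pickLabel(classifications, minConfidence) {
  const total = classifications.reduce((sum, c) => sum + c.value, 0);
  if (total <= 0) {
    return null; // no usable scores at all (also covers an empty array)
  }
  // Find the highest-scoring entry without relying on the input order.
  const best = classifications.reduce((a, b) => (b.value > a.value ? b : a));
  const confidence = best.value / total; // share of the total score, in [0, 1]
  return confidence >= minConfidence ? best.label : null;
}

// Made-up scores for illustration only.
const confident = [
  { label: "greetings", value: 0.09 },
  { label: "internet_problem", value: 0.005 },
  { label: "voice_problem", value: 0.005 },
];
const unsure = [
  { label: "internet_problem", value: 0.32 },
  { label: "voice_problem", value: 0.27 },
  { label: "greetings", value: 0.14 },
];

console.log(pickLabel(confident, 0.5)); // greetings
console.log(pickLabel(unsure, 0.5)); // null
```

A null result plays the same role as the re-prompt branch in the loop above. If an input contains no known word, the top class would hold roughly its prior share of the total score, so a threshold comfortably above the largest prior should reject it.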

Answered 2016-12-28 10:27:01 by errantlinguist

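Is there any way to determine whether the classification gave a wrong result, so that I can ask the user to re-enter the statement? Thanks a lot for your insight.

According to the Natural Node documentation, you can access the classifier's confidence with console.log(classifier.getClassifications('i long copper'));. If a prediction relies only on the prior probabilities of your classes, its confidence should be relatively low.

See the updated answer, which adds a confidence threshold. – errantlinguist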
