scikit-learn: getting the certainty of a classification / the selected category

I am doing some multiclass text classification, and it works well for my needs:

import numpy
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# my_tokenizer and stopWords are defined elsewhere
classifier = Pipeline([
    ('vect', CountVectorizer(tokenizer=my_tokenizer, stop_words=stopWords, ngram_range=(1, 2), min_df=2)),
    ('tfidf', TfidfTransformer(norm='l2', use_idf=True, smooth_idf=True, sublinear_tf=False)),
    ('clf', MultinomialNB(alpha=0.01, fit_prior=True))])

categories = [list of my possible categories]

# Learning

news = [list of news already categorized]
news_cat = [the category of the corresponding news]

# Map each category name to its index (categories must be sorted for searchsorted)
news_target_cat = numpy.searchsorted(categories, news_cat)

classifier = classifier.fit(news, news_target_cat)

# Categorizing

news = [list of news not yet categorized]

predicted = classifier.predict(news)

for i, pred_cat in enumerate(predicted):
    print(news[i])
    print(categories[pred_cat])

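Now I would like to get the "certainty" of the predictor for the predicted category (e.g. 0.0 -> "I rolled a die to choose a category", up to 1.0 -> "nothing will change my mind about the category of this news item"). How can I get that certainty value, i.e. the predictor's score for the chosen category?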

Source

2016-03-04 Cabu

If you need something like a per-category probability, you have to use the classifier's predict_proba() method.

Docs.
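
To illustrate the suggestion, here is a minimal sketch of how predict_proba() could be used with a pipeline like the one in the question. The toy texts, the two category names, and the variable names train_news, train_cat and new_news are invented for this example; the custom tokenizer and stop-word list from the question are left out so that the snippet runs on its own.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy data, invented purely for illustration.
categories = ["sport", "tech"]  # kept sorted, as numpy.searchsorted requires
train_news = [
    "the team won the football match",
    "the striker scored a late goal in the match",
    "the new phone has a faster processor",
    "the laptop ships with a faster processor and more memory",
]
train_cat = ["sport", "sport", "tech", "tech"]

classifier = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", MultinomialNB(alpha=0.01)),
])
classifier.fit(train_news, np.searchsorted(categories, train_cat))

new_news = ["a late goal decided the football match"]

# One row per document, one column per class.
# Columns follow the order of the final estimator's classes_ attribute.
proba = classifier.predict_proba(new_news)
classes = classifier.named_steps["clf"].classes_

best_col = proba.argmax(axis=1)       # column of the most probable class
predicted = classes[best_col]         # same labels predict() would return
certainty = proba.max(axis=1)         # probability of the chosen class

for text, pred_cat, score in zip(new_news, predicted, certainty):
    print(text)
    print(categories[pred_cat], round(float(score), 3))
```

Two caveats on reading the result: with K categories, a completely undecided model yields about 1/K for the chosen class rather than 0.0, and naive Bayes probabilities tend to be pushed towards 0 or 1, so the value is probably better treated as a rough ranking score than as a calibrated probability.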

Source

2016-03-04 11:50:55 solomkinmv

Thank you very much! I didn't see it in the docs :-( – Cabu
