Python 3 text labels

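I don't know where to start with this problem, because I am only now learning about neural networks. I have a large database of sentence > label pairs. For example: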

i want take a photo < photo 
i go to take a photo < photo 
i go to use my camera < photo 
i go to eat something < eat 
i like my food < eat

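If a user writes a new sentence, I want to check the accuracy score of every label:

"i go to bed after i use my camera" < photo: 0.9000, eat: 0.4000, ...

So where can I start with this problem? TensorFlow and scikit-learn both look good, but their document classification examples don't show the accuracy :\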


2017-02-28 esemve

from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn import metrics

sentences = ["i want take a photo", "i go to take a photo", "i go to use my camera", "i go to eat something", "i like my food"]

labels = ["photo", "photo", "photo", "eat", "eat"]

tfv = TfidfVectorizer()

# Fit TF-IDF on the sentences and turn them into a feature matrix
tfv.fit(sentences)
X = tfv.transform(sentences)

# Encode the string labels as integers
lbl = LabelEncoder()
y = lbl.fit_transform(labels)

# Hold out part of the data for testing, keeping the label proportions
xtrain, xtest, ytrain, ytest = train_test_split(X, y, stratify=y, random_state=42)

clf = LogisticRegression()
clf.fit(xtrain, ytrain)
predictions = clf.predict(xtest)

print("Accuracy Score = ", metrics.accuracy_score(ytest, predictions))

New data:

new_sentence = ["this is a new sentence"]
# Reuse the fitted vectorizer; do not fit it again on the new sentence
X_Test = tfv.transform(new_sentence)
print(clf.predict_proba(X_Test))
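To get one score per label for a new sentence, in the form the question asks for, the probabilities can be paired with the label names. The following is only a minimal sketch under a few assumptions: it uses a scikit-learn pipeline instead of the separate vectorizer and `LabelEncoder` above, it trains on all five sentences, and `label_scores` is a hypothetical helper name.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = ["i want take a photo", "i go to take a photo", "i go to use my camera",
             "i go to eat something", "i like my food"]
labels = ["photo", "photo", "photo", "eat", "eat"]

# Chain the vectorizer and the classifier; string labels are passed directly,
# so model.classes_ holds the label names themselves.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(sentences, labels)


def label_scores(sentence):
    """Hypothetical helper: return {label: probability} for one sentence."""
    probabilities = model.predict_proba([sentence])[0]
    return {str(label): float(p) for label, p in zip(model.classes_, probabilities)}


scores = label_scores("i go to bed after i use my camera")
# Print the labels from most to least likely
for label, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {score:.4f}")
```

Note that these scores sum to 1 across the labels, so an output like photo: 0.9000, eat: 0.4000 will not occur with this setup.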


2017-02-28 12:14:51

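OK, but how can I check a new random sentence against all the labels? – esemve

See the updated answer –

Thanks a lot, but one last question: this works, but if I test a sentence that already exists in the data, for example "i go to eat something", it answers 0.55 0.44. Why? It is one of the training sentences for the eat category :\ Isn't the first number photo and the second the eat category? Or, if not, how can I find out which number belongs to which category? – esemve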
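On the column order: `predict_proba` returns its columns in the order of `clf.classes_`, and `LabelEncoder` numbers the labels in sorted order, so here the first number belongs to eat and the second to photo. A minimal sketch of how to read the columns follows; it assumes the classifier is trained on all five sentences rather than on a train/test split.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

sentences = ["i want take a photo", "i go to take a photo", "i go to use my camera",
             "i go to eat something", "i like my food"]
labels = ["photo", "photo", "photo", "eat", "eat"]

tfv = TfidfVectorizer()
X = tfv.fit_transform(sentences)

lbl = LabelEncoder()
y = lbl.fit_transform(labels)  # labels are numbered in sorted order: eat=0, photo=1

clf = LogisticRegression()
clf.fit(X, y)  # assumption: fit on all sentences, not on a split

# clf.classes_ lists the encoded labels in the column order of predict_proba;
# inverse_transform maps them back to the original label names.
column_labels = [str(name) for name in lbl.inverse_transform(clf.classes_)]
print(column_labels)  # ['eat', 'photo']

proba = clf.predict_proba(tfv.transform(["i go to eat something"]))[0]
for label, p in zip(column_labels, proba):
    print(label, round(float(p), 4))
```

As for why the score is only around 0.55: with so few training sentences and the default regularization of `LogisticRegression`, the probabilities tend to stay close to 0.5, even for a sentence the model has already seen. That is probably what is happening here, and more training data should give more confident scores.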
