Python - How to add a numpy array to a pandas DataFrame

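I have trained a logistic regression classifier to predict whether a review is positive or negative. Now I want to append the predicted probabilities returned by the predict_proba function to the pandas DataFrame that contains the reviews. I tried something like: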

test_data['prediction'] = sentiment_model.predict_proba(test_matrix)

Obviously this doesn't work, because predict_proba returns a 2D numpy array. So what is the most efficient way to do this? I created test_matrix with scikit-learn's CountVectorizer:

vectorizer = CountVectorizer(token_pattern=r'\b\w+\b') 
train_matrix = vectorizer.fit_transform(train_data['review_clean'].values.astype('U')) 
test_matrix = vectorizer.transform(test_data['review_clean'].values.astype('U'))
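
To see why the direct assignment cannot fit into a single column, it may help to check the shape of what predict_proba returns. The following is a minimal sketch only: the toy reviews, the `sentiment` label column, and the default `LogisticRegression()` settings are assumptions for illustration, not the data or model from the question.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data standing in for the real reviews.
train_data = pd.DataFrame({
    "review_clean": ["great toy loved it", "terrible broke fast",
                     "loved it great", "terrible waste"],
    "sentiment": [1, 0, 1, 0],  # assumed label column: 1 = positive, 0 = negative
})
test_data = pd.DataFrame({"review_clean": ["great toy", "terrible toy"]})

# Same vectorizer setup as in the question.
vectorizer = CountVectorizer(token_pattern=r"\b\w+\b")
train_matrix = vectorizer.fit_transform(train_data["review_clean"].values.astype("U"))
test_matrix = vectorizer.transform(test_data["review_clean"].values.astype("U"))

sentiment_model = LogisticRegression()
sentiment_model.fit(train_matrix, train_data["sentiment"])

# predict_proba returns one row per review and one column per class,
# so the result is 2D and cannot be stored as a single DataFrame column.
proba = sentiment_model.predict_proba(test_matrix)
print(proba.shape)
print(sentiment_model.classes_)  # column order of proba follows this array
```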

The sample data looks like this:

| Review | Prediction |
| --- | --- |
| "Toy was great! Our six-year old loved it!" | 0.986 |

Source

2017-02-18 DBE7

Can you provide a sample dataset (5-7 rows)? – MaxU

Related question: http://stackoverflow.com/questions/41904197/data-frame-of-tfidf-with-python – MaxU

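Assign the predictions to a variable, then extract the columns from that variable and assign them to columns of the pandas DataFrame. If `x` is the 2D numpy array of predictions, `x = sentiment_model.predict_proba(test_matrix)`, then you can do `test_data['prediction0'] = x[:,0]` and `test_data['prediction1'] = x[:,1]` –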

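Assign the predictions to a variable, then extract the columns from that variable and assign them to columns of the pandas DataFrame. If x is the 2D numpy array of predictions,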

x = sentiment_model.predict_proba(test_matrix)

then you can do:

test_data['prediction0'] = x[:,0] 
test_data['prediction1'] = x[:,1]
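
The column-by-column assignment above can be sketched end to end as follows. The array `x` here is a hard-coded stand-in for the output of predict_proba, and the `proba0`/`proba1` column names and the `combined` variable are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sentiment_model.predict_proba(test_matrix):
# one row per review, one column per class.
x = np.array([[0.014, 0.986],
              [0.730, 0.270]])

test_data = pd.DataFrame({"review": ["Toy was great!", "Broke in a day."]})

# One column at a time, as in the answer.
test_data["prediction0"] = x[:, 0]
test_data["prediction1"] = x[:, 1]

# Alternatively, wrap all columns in a DataFrame that shares the same index
# and join it on, which avoids writing one line per class.
proba_df = pd.DataFrame(x, columns=["proba0", "proba1"], index=test_data.index)
combined = test_data.join(proba_df)
print(combined)
```

The columns of the predict_proba output follow the order of `sentiment_model.classes_`, so which column holds the probability of a positive review depends on how the labels were encoded.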

Source

2017-02-18 12:50:41

Very helpful – suku
