2014-12-08 462 views
2

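I am trying to build a model from tweets and their polarity, but partway through I get this strange error: Error in `row.names<-.data.frame`(`*tmp*`, value = c(NA_real_, NA_real_. It happens on this line: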

analytics <- create_analytics(container, MAXENT_CLASSIFY) 

I get this:

Error in `row.names<-.data.frame`(`*tmp*`, value = c(NA_real_, NA_real_, : 
    duplicate 'row.names' are not allowed 
In addition: Warning messages: 
1: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs, : 
    NAs introduced by coercion 
2: In create_documentSummary(container, score_summary) : 
    NAs introduced by coercion 
3: In cbind(MANUAL_CODE = testing_codes, CONSENSUS_CODE = scores$BEST_LABEL, : 
    NAs introduced by coercion 
4: In create_topicSummary(container, score_summary) : 
    NAs introduced by coercion 
5: In cbind(TOPIC_CODE = as.numeric(as.vector(topic_codes)), NUM_MANUALLY_CODED = manually_coded, : 
    NAs introduced by coercion 
6: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs, : 
    NAs introduced by coercion 
7: non-unique values when setting 'row.names': 

My CSV file looks like this:

text, polarity 
Hello I forget the password of my credit card need to know how I can make my statement, neutral 
can provide the swift code thanks, neutral 
thanks just one more doubt has this card commissions with these characteristics, neutral 
Thanks, neutral 
are arriving mail scam, negative 
can you help me I need to pay an online purchase and ask me for a terminal my debit which is, neutral 
if I do not win anything this time I change banks, negative 
you can be the next winner of the million that circumvents account award date January, neutral 
account and see my accounts so I can have the, negative 
thanks i just send the greetings consultation, neutral 
may someday enable office not sick people, negative 
hello is running payments through the online banking no, negative 
thanks hope they do, neutral 
should pay attention to many happened to us that your system flushed insufficient balance or had no money in the accounts, negative 
yesterday someone had the dignity to answer the telephone banking and verify that the system is crap, negative 
and tried but apparently the problem is just to pay movistar services, neutral 
good morning was trying to pay for services through the website but get error retry in minutes, negative 
if no system agent is non clients or customers also, positive 

The code I am using is:

library(RTextTools) 

pg <- read.csv("cleened_tweets.csv", header=TRUE, row.names=NULL) 

head(pg) 

pgT <- as.factor(pg$text) 

pgP <- as.factor(pg$polarity) 

doc_matrix <- create_matrix(pgT, language="spanish", removeNumbers=TRUE, stemWords=TRUE, removeSparseTerms=.998) 

dim(doc_matrix) 

container <- create_container(doc_matrix, pgP, trainSize=1:275, testSize=276:375, virgin=FALSE) 

MAXENT <- train_model(container,"MAXENT") 

MAXENT_CLASSIFY <- classify_model(container, MAXENT) 

analytics <- create_analytics(container, MAXENT_CLASSIFY) 

summary(analytics) 
+0

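Please state clearly which line of your code causes the error. "Partway through" is not specific enough. Also, show which lines produce which warnings. – Roland 2014-12-08 08:03:16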

+0

@Roland I edited my question to show where the error occurs. – user3827298 2014-12-08 16:42:43

Answers

0

Convert your pgP variable from a factor to numeric by wrapping as.factor in as.numeric. This should resolve the problem:

pgP <- as.numeric(as.factor(pg$polarity)) 
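
To illustrate why this helps, here is a minimal base-R sketch. It assumes, based on the warnings in the question, that the labels are passed through as.numeric somewhere inside create_analytics. The `polarity` vector and the `df` data frame are made-up stand-ins, not data from the question.

```r
# Made-up stand-in for pg$polarity (assumption for illustration only)
polarity <- c("neutral", "negative", "positive", "neutral")

# Coercing text labels directly gives NA for every element
# (this is what "NAs introduced by coercion" refers to)
direct <- suppressWarnings(as.numeric(polarity))

# Going through a factor gives the integer level codes instead
codes <- as.numeric(as.factor(polarity))

# A data frame rejects duplicated row names, including repeated NA values
df <- data.frame(x = 1:2)
result <- tryCatch(
  suppressWarnings({
    row.names(df) <- c(NA_real_, NA_real_)
    "ok"
  }),
  error = function(e) conditionMessage(e)
)

print(direct)
print(codes)
print(result)
```

If the labels all turn into NA, any later step that uses them as row names would hit the same "duplicate 'row.names'" failure, which matches the error reported in the question.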
1

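I have run into this error with RTextTools as well. The create_analytics function cannot handle factor variables or strings, only numeric labels. I usually just merge the text labels back in at the end, after running this code.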
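
A rough sketch of merging the text labels back in, using base R only. The `polarity` and `predicted_codes` vectors are hypothetical stand-ins; in practice the codes would come from whichever column of your classification output holds the predicted label, converted to numeric if necessary.

```r
# Hypothetical stand-in for pg$polarity
polarity <- c("neutral", "negative", "neutral", "positive")

# Build the factor once so the level order is fixed and can be reused
polarity_factor <- as.factor(polarity)
label_levels <- levels(polarity_factor)

# Numeric codes to use as the labels when building the container
pgP <- as.numeric(polarity_factor)

# Hypothetical numeric predictions standing in for classifier output
predicted_codes <- c(2, 1, 2, 2)

# Map each numeric code back to its original text label by position
predicted_labels <- label_levels[predicted_codes]

print(data.frame(code = predicted_codes, label = predicted_labels))
```

Keeping `label_levels` around matters because the numeric codes only mean something relative to the level order of the factor they were derived from.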
