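Pig: Twitter sentiment analysis
I am trying to implement Twitter sentiment analysis. I need to get all the positive tweets and all the negative tweets and store each set in its own text file.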
sample.json
{"id": 252479809098223616, "created_at": "Wed Apr 12 08:23:20 +0000 2016", "text": "google is a good company", "user_id": 450990391}{"id": 252479809098223616, "created_at": "Wed Apr 12 08:23:20 +0000 2016", "text": "facebook is a bad company","user_id": 450990391}
dictionary.text holds the list of all positive and negative words:
weaksubj 1 bad adj n negative
strongsubj 1 good adj n positive
Pig script:
tweets = load 'new.json' using JsonLoader('id:chararray,text:chararray,user_id:chararray,created_at:chararray');
dictionary = load 'dictionary.text' AS (type:chararray,length:chararray,word:chararray,pos:chararray,stemmed:chararray,polarity:chararray);
words = foreach tweets generate FLATTEN(TOKENIZE(text)) AS word,id,text,user_id,created_at;
sentiment = join words by word left outer, dictionary by word;
senti2 = foreach sentiment generate words::id as id,words::created_at as created_at,words::text as text,words::user_id as user_id,dictionary::polarity as polarity;
res = FILTER senti2 BY polarity MATCHES '.*possitive.*';
describe res:
res: {id: chararray,created_at: chararray,text: chararray,user_id: chararray,polarity: chararray}
However, when I dump res I see no output, even though the script runs fine without any errors.
What am I doing wrong here?
Please advise.
Mohan.V
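Thanks for your reply, @Sandesh. – Bunny
I tried what you suggested, but it still runs successfully and produces no output. – Bunny
I have edited the dictionary file to remove the spaces. – Bunny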
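
Several things in the script could each lead to an empty result, so what follows is a minimal sketch of a possible correction, not a definitive fix. It rests on these assumptions, none of which the post confirms:

- The filter was meant to match `positive`. The script spells it `possitive`, which cannot match the dictionary value.
- The input is `sample.json` (the script loads `new.json`), with one JSON object per line.
- Pig's built-in JsonLoader assigns values by position, not by field name, so the schema should list the fields in the order they appear in the JSON (id, created_at, text, user_id). I believe this is how it behaves, but please check it against your Pig version.
- The dictionary columns are separated by a single space, hence `PigStorage(' ')`. The default delimiter is a tab.
- The output directory names `positive_tweets` and `negative_tweets` are made up for illustration.

```pig
-- Schema order follows the field order inside each JSON record
-- (assumption: JsonLoader maps fields by position, not by name).
tweets = LOAD 'sample.json'
         USING JsonLoader('id:chararray,created_at:chararray,text:chararray,user_id:chararray');

-- Assumption: dictionary.text is space-delimited.
-- Use PigStorage('\t') instead if the columns are tab-separated.
dictionary = LOAD 'dictionary.text' USING PigStorage(' ')
             AS (type:chararray, length:chararray, word:chararray,
                 pos:chararray, stemmed:chararray, polarity:chararray);

-- One row per token, carrying the tweet fields along.
words = FOREACH tweets GENERATE FLATTEN(TOKENIZE(text)) AS word,
                                id, text, user_id, created_at;

-- Tokens without a dictionary entry get a null polarity.
sentiment = JOIN words BY word LEFT OUTER, dictionary BY word;

senti2 = FOREACH sentiment GENERATE
             words::id            AS id,
             words::created_at    AS created_at,
             words::text          AS text,
             words::user_id       AS user_id,
             dictionary::polarity AS polarity;

-- Exact comparison; rows with a null polarity are dropped by both filters.
positive = FILTER senti2 BY polarity == 'positive';
negative = FILTER senti2 BY polarity == 'negative';

-- Hypothetical output locations; each STORE writes a directory of part files.
STORE positive INTO 'positive_tweets';
STORE negative INTO 'negative_tweets';
```

If the result is still empty, running `DUMP words;` and `DUMP dictionary;` before the join should show whether the text and word columns actually contain what you expect. Note that a tweet containing several dictionary words will appear once per matching word, so you may want a `DISTINCT` before storing.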