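tidytext words with both positive and negative sentiment

I have been working with the sentiment lexicons and noticed that the bing and nrc datasets contain several words that are labeled as both positive and negative.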
**bing: three words with both positive and negative sentiment**

    library(dplyr)
    library(tidytext)

    # Look up the three "envious" variants in the bing lexicon
    env_test_bing_raw <- get_sentiments("bing") %>%
      filter(word %in% c("envious", "enviously", "enviousness"))
    env_test_bing_raw

    # A tibble: 6 x 2
      word        sentiment
      <chr>       <chr>
    1 envious     positive
    2 envious     negative
    3 enviously   positive
    4 enviously   negative
    5 enviousness positive
    6 enviousness negative

**NRC: 81 words with both positive and negative sentiment**

    # Words that carry more than one positive/negative label in nrc
    test_nrc <- as.data.frame(
      get_sentiments("nrc") %>%
        filter(sentiment %in% c("positive", "negative")) %>%
        group_by(word) %>%
        summarize(count = n()) %>%
        filter(count > 1))

    # Pull the positive/negative rows for just those words
    env_test_nrc <- get_sentiments("nrc") %>%
      filter(sentiment %in% c("positive", "negative")) %>%
      filter(word %in% test_nrc$word)
    env_test_nrc

    # A tibble: 162 x 2
       word       sentiment
       <chr>      <chr>
     1 abundance  negative
     2 abundance  positive
     3 armed      negative
     4 armed      positive
     5 balm       negative
     6 balm       positive
     7 boast      negative
     8 boast      positive
     9 boisterous negative
    10 boisterous positive
    # ... with 152 more rows

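I am curious whether I have done something wrong, or how a word can carry both a negative and a positive sentiment within a single source dataset. What is the standard practice for handling these cases?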
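For illustration, one workaround I can imagine is simply dropping the ambiguous words before joining the lexicon to my text. The sketch below does that for bing; I do not know whether this is the accepted practice, and the names `ambiguous` and `bing_clean` are just ones I made up for this example.

```r
library(dplyr)
library(tidytext)

bing <- get_sentiments("bing")

# Words that appear with more than one distinct sentiment label
ambiguous <- bing %>%
  group_by(word) %>%
  filter(n_distinct(sentiment) > 1) %>%
  ungroup() %>%
  distinct(word)

# Assumption: treat such words as uninformative and remove them entirely
bing_clean <- bing %>%
  anti_join(ambiguous, by = "word")
```

The same pattern should carry over to nrc once it is restricted to the positive and negative rows, but removing the words outright discards information, so keeping one label per word might be preferable in some analyses.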
Thanks!