tidytext example: error when using filter with the pipe

I ran into a problem while trying to reproduce the example from http://tidytextmining.com/twitter.html.

Basically, I want to adapt this part of the code, which removes the stop words:

library(dplyr)     # provides %>%, mutate() and filter()
library(tidytext)
library(stringr)

reg <- "([^A-Za-z_\\d#@']|'(?![A-Za-z_\\d#@]))" 

tidy_tweets <- tweets %>% 
    mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|http://[A-Za-z\\d]+|&amp;|&lt;|&gt;|RT", "")) %>% 
    unnest_tokens(word, text, token = "regex", pattern = reg) %>% 
    filter(!word %in% stop_words$word, 
     str_detect(word, "[a-z]"))

so that I also keep a version of the tweets data frame that still includes the stop words.

So I tried this:

tidy_tweets <- tweets %>% 
    mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|http://[A-Za-z\\d]+|&amp;|&lt;|&gt;|RT", "")) %>% 
    unnest_tokens(word, text, token = "regex", pattern = reg) 

tidy_tweets_sw <- filter(!word %in% stop_words$word, str_detect(tidy_tweets, "[a-z]"))

But it did not work; I got the following error message:

Error in match(x, table, nomatch = 0L) : 
'match' requires vector arguments

I tried passing vector versions of both inputs to match, but to no avail. Does anyone have a better idea?


2016-11-16 Oki

tidytext usually uses `anti_join(stop_words)` in its vignettes. – alistaire
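The `anti_join()` approach mentioned in this comment can be sketched as follows. This is a minimal illustration rather than the book's code: the `tweets` data frame here is a hypothetical two-row stand-in, and the default word tokenizer is used instead of the custom regex from the question.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy stand-in for the tweets data frame (not the book's data).
tweets <- tibble(
  id   = 1:2,
  text = c("The cat sat on the mat", "A dog and a bird")
)

# One lowercase token per row, stop words still included.
tidy_tweets <- tweets %>%
  unnest_tokens(word, text)

# anti_join() keeps only the rows whose `word` has no match in stop_words.
tidy_tweets_sw <- tidy_tweets %>%
  anti_join(stop_words, by = "word")
```

Because the unfiltered `tidy_tweets` is stored before the join, both versions of the data frame remain available afterwards.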

I'm not sure, but I think your problem is here:

tidy_tweets_sw <- filter(!word %in% stop_words$word, str_detect(tidy_tweets, "[a-z]"))

filter has no clue what it is supposed to filter. This should work:

tidy_tweets_sw <- tidy_tweets %>% filter(!word %in% stop_words$word, str_detect(word, "[a-z]"))
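A likely explanation for the exact error message, assuming no other object named `word` exists in the session: without a data frame as the first argument, `word` is looked up in the global search path, where it resolves to the function `stringr::word`. `%in%` then hands that function to `match()`, which only accepts vectors. The sketch below reproduces this outside of `filter()`; the stop-word vector is a hypothetical stand-in.

```r
library(stringr)

# With no data frame in scope, `word` is the function exported by stringr.
is.function(word)

# `%in%` calls match() internally, and match() rejects a function as input.
msg <- tryCatch(
  word %in% c("a", "the"),                  # hypothetical stop-word vector
  error = function(e) conditionMessage(e)   # capture the message instead of stopping
)
print(msg)
```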


2016-11-16 15:52:42 Tensibai

Perfect! Thanks a lot ("I should have known")! – Oki

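Well, that's the catch with pipes, I think: it's easy to forget that whatever is on the left becomes the first argument of the function on the right :) – Tensibai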

'tweets' should be changed to 'tidy_tweets' to reflect Oki's intermediate step –

You need to pass the data as the first argument of the filter statement.

tidy_tweets <- tweets %>% 
    mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|http://[A-Za-z\\d]+|&amp;|&lt;|&gt;|RT", "")) %>% 
    unnest_tokens(word, text, token = "regex", pattern = reg) 

tidy_tweets_sw <- filter(tidy_tweets, !(word %in% stop_words$word), str_detect(word, "[a-z]"))
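The two fixes suggested in this thread, piping the data frame into `filter()` and passing it explicitly as the first argument, should be equivalent. A small self-contained sketch, in which `tokens` and `my_stop_words` are hypothetical stand-ins for `tidy_tweets` and `stop_words$word`, and base `grepl()` replaces `str_detect()` to keep the dependencies minimal:

```r
library(dplyr)

# Hypothetical toy data standing in for tidy_tweets and stop_words$word.
tokens <- tibble(word = c("the", "cat", "42", "on", "mat"))
my_stop_words <- c("the", "on")

# Piped form: `tokens` is inserted as the first argument of filter().
piped <- tokens %>%
  filter(!word %in% my_stop_words, grepl("[a-z]", word))

# Explicit form: the data frame is passed by hand as the first argument.
explicit <- filter(tokens, !word %in% my_stop_words, grepl("[a-z]", word))

# Both forms keep the same rows.
identical(piped$word, explicit$word)
```

Inside `filter()`, `word` refers to the column of the data frame, which is why the condition no longer trips over a function of the same name.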


2016-11-16 15:46:57
