2016-12-01 89 views

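Unable to extract exact phrases from sentences in R

I am trying to extract exact phrases from sentences in R, but my approach also matches sentences that contain the phrase only as part of a longer word. For example: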

    phrase <- c("r is not working", "roster is not working")
    sentence <- c("ABC is not working and roster is not working",
                  "CDE is working but printer is not working")

    extract <- sapply(phrase, grepl, x = sentence)
    extract

This produces the output:

         r is not working roster is not working
    [1,]             TRUE                  TRUE
    [2,]             TRUE                 FALSE

My desired output is:

         r is not working roster is not working
    [1,]            FALSE                  TRUE
    [2,]            FALSE                 FALSE

The phrase "r is not working" should not match either sentence. Is there a way to fix this? Any ideas? Thanks!

You could add word boundaries, e.g. `sapply(paste0("\\b", phrase, "\\b"), grepl, x = sentence)`

"r is not working" matches both strings, but adding a space before the r, as in " r is not working", will prevent the match. – Dave2e
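
The word-boundary suggestion from the first comment can be sketched as follows. This is a minimal illustration using the question's data; the variable name `pattern` is introduced here for clarity, and the sketch assumes the phrases contain no regex metacharacters (otherwise they would need escaping first).

```r
phrase <- c("r is not working", "roster is not working")
sentence <- c("ABC is not working and roster is not working",
              "CDE is working but printer is not working")

# Wrap each phrase in \\b so it can only match at word boundaries;
# the "r" at the end of "roster" or "printer" no longer qualifies.
pattern <- paste0("\\b", phrase, "\\b")

extract <- sapply(pattern, grepl, x = sentence)
colnames(extract) <- phrase  # label the columns with the original phrases
extract
```

With the question's data this should give FALSE for "r is not working" in both rows and TRUE for "roster is not working" only in the first row, which is the desired output.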

Answer

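grepl evaluates its pattern as a regular expression, so "r is not working" matches wherever that character sequence occurs, including at the end of "roster is not working" and "printer is not working".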

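If you want to stick with grepl, anchor your search pattern to the start and end of the string, so that it matches only when the whole string equals the phrase: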

phrase <- c("^r is not working$", "^roster is not working$") 
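
To illustrate what the anchors do, here is a small sketch. The third element of `sentence` is a hypothetical addition that is not in the question, and the name `anchored` is introduced only for this example. Because `^` and `$` require the phrase to span the entire string, the two sentences from the question are not expected to match at all; only a string that consists of nothing but the phrase matches.

```r
phrase <- c("^r is not working$", "^roster is not working$")
sentence <- c("ABC is not working and roster is not working",
              "CDE is working but printer is not working",
              "roster is not working")  # hypothetical extra element, not in the question

# Each column is one anchored pattern, each row one sentence
anchored <- sapply(phrase, grepl, x = sentence)
anchored
```

If the goal is to find a phrase inside a longer sentence, as in the question, anchoring the whole string is probably too strict.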

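If you instead want to check for exact matches, that is, whether a whole sentence is equal to one of the original, unanchored phrases, simply use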

extract <- sapply(sentence, `%in%`, phrase)
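
A short sketch of this equality check follows. It assumes `phrase` holds the plain strings without `^` or `$`, since `%in%` compares literal strings rather than regular expressions. The second element of `sentence` is hypothetical, and the names `exact` and `per_sentence` are introduced only for this example.

```r
phrase <- c("r is not working", "roster is not working")  # plain strings, no ^ or $
sentence <- c("ABC is not working and roster is not working",
              "roster is not working")  # second element is hypothetical

# %in% is vectorised, so it can be applied to the whole vector directly
exact <- sentence %in% phrase

# The sapply form gives the same values, named by sentence
per_sentence <- sapply(sentence, `%in%`, phrase)

exact
per_sentence
```

Both forms return one logical value per sentence rather than a sentence-by-phrase matrix.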