2014-08-30 89 views
Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love you!","Winner") 

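How can I conditionally append a "." to the end of a string in R? The function should append a "." whenever a sentence does not already end with one of the symbols [.?!].

I want to build a function in R with the help of regular expressions, but I am having trouble looking only at the end of the string.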

Answers

The following gsub call appends a dot only to sentences that do not already end with ., ? or !.

> Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love you!","Winner") 
> gsub("^(?!.*[.?!]$)(.*)$", "\\1.", Data, perl=TRUE) 
[1] "My name is Ernst."       "I love chicken."
[3] "Hello, my name is Stan!" "Who?"
[5] "I Love you!"             "Winner."

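In regular expressions, lookaheads are used for conditional checks. The negative lookahead (?!.*[.?!]$) checks whether the line ends with ., ? or !. If it does, the pattern fails to match, so that sentence is skipped and no replacement happens on that line. The replacement is made only when the last character is not one of ., ? or !.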
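Since the question asks for a function, here is a minimal sketch that wraps the lookahead-based replacement. The name add_period is a hypothetical choice for illustration and is not part of the original answer; sub is used instead of gsub because the anchored pattern can match at most once per string.

```r
# Hypothetical helper: append "." unless the string already ends in ., ? or !
add_period <- function(x) {
  # (?!.*[.?!]$) fails at the start of any string whose last character is
  # ., ? or !, so those strings are returned unchanged
  sub("^(?!.*[.?!]$)(.*)$", "\\1.", x, perl = TRUE)
}

Data <- c("My name is Ernst.", "I love chicken", "Hello, my name is Stan!",
          "Who?", "I Love you!", "Winner")
result <- add_period(Data)
print(result)
```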

Or, using a negative lookbehind together with a positive lookahead:

> Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love you!","Winner") 
> sub("(?<![!?.])(?=$)", ".", Data, perl=TRUE) 
[1] "My name is Ernst."       "I love chicken."
[3] "Hello, my name is Stan!" "Who?"
[5] "I Love you!"             "Winner."
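To see why this works, it may help to inspect where the pattern matches. The sketch below uses base R's regexpr, which is not in the original answer: the pattern matches a zero-width position at the end of the string, and only when the preceding character is not !, ? or ., so sub inserts the dot there without consuming any text.

```r
Data <- c("My name is Ernst.", "I love chicken", "Hello, my name is Stan!",
          "Who?", "I Love you!", "Winner")

# regexpr() returns the start of the first match, or -1 when there is none
m <- regexpr("(?<![!?.])(?=$)", Data, perl = TRUE)

# TRUE for the strings that would receive a trailing dot
needs_dot <- as.integer(m) > 0

# Length of each match that was found; lookarounds consume no characters
zero_width <- attr(m, "match.length")[needs_dot]

print(needs_dot)
print(zero_width)
```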

Using stringi:

library(stringi) 
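# Negative lookbehind: match the end of the string only when the character
# before it is not !, ? or .
stri_replace_all_regex(Data, "(?<![!?.])$", ".")
#[1] "My name is Ernst."       "I love chicken."
#[3] "Hello, my name is Stan!" "Who?"
#[5] "I Love you!"             "Winner."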

Here are some possible approaches:

1) If the last character is not a dot, ? or !, replace it with that character followed by a dot:

sub("([^.!?])$", "\\1.", Data) 

For the data in the question we get:

[1] "My name is Ernst."       "I love chicken."
[3] "Hello, my name is Stan!" "Who?"
[5] "I Love you!"             "Winner."

2) A gsubfn solution is even simpler. If the last character is not a dot, ! or ?, it replaces the empty () with a dot.

library(gsubfn) 
gsubfn("[^.!?]()$", ".", Data) 

3) This one uses grepl. If a dot, ! or ? is the last character, it appends the empty string; otherwise it appends a dot.

paste0(Data, ifelse(grepl("[.!?]$", Data), "", ".")) 

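4) This one does not use regular expressions. It picks off the last character; if that is a dot, ! or ?, it appends the empty string, otherwise it appends a dot: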

paste0(Data, ifelse(substring(Data, nchar(Data)) %in% c(".", "!", "?"), "", ".")) 
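As a quick sanity check (a sketch, not part of the original answer), the base R approaches 1, 3 and 4 can be compared on the question's data. The variable names below are chosen for illustration. One difference worth knowing: approach 1 needs a final character to match, so it leaves an empty string unchanged, while the paste0 approaches append a dot to it.

```r
Data <- c("My name is Ernst.", "I love chicken", "Hello, my name is Stan!",
          "Who?", "I Love you!", "Winner")

# Approaches 1, 3 and 4 from the list above
a1 <- sub("([^.!?])$", "\\1.", Data)
a3 <- paste0(Data, ifelse(grepl("[.!?]$", Data), "", "."))
a4 <- paste0(Data, ifelse(substring(Data, nchar(Data)) %in% c(".", "!", "?"), "", "."))

# All three agree on the question's data
same_on_data <- identical(a1, a3) && identical(a3, a4)
print(same_on_data)

# They differ on an empty string: sub() finds no character to replace
empty_1 <- sub("([^.!?])$", "\\1.", "")
empty_3 <- paste0("", ifelse(grepl("[.!?]$", ""), "", "."))
print(c(empty_1, empty_3))
```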

Here is another solution.

x <- c('My name is Ernst.', 'I love chicken', 
     'Hello, my name is Stan!', 'Who?', 'I Love you!', 'Winner') 
r <- sub('[^?!.]\\K$', '.', x, perl=TRUE)
r
## [1] "My name is Ernst."       "I love chicken."
## [3] "Hello, my name is Stan!" "Who?"
## [5] "I Love you!"             "Winner."