2009-08-01 207 views
7

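Dynamic regular expressions in Ruby

I'm looking for a way to dynamically create a Regexp object from a string (pulled from a database) and then use it to filter another string. This example extracts data from a git commit message, but in theory any valid regular expression could be stored as a string in the database.

What happens: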

>> string = "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009- 07-28 21:21:47\n\n Fixed typo\n" 
>> r = Regexp.new("[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+") 
>> string[r] 
=> nil 

What I want to happen:

>> string = "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009- 07-28 21:21:47\n\n Fixed typo\n" 
>> string[/[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+/] 
=> "Project: Revision ...123456 committed by Me " 

Answers

11

You're only missing one thing:

>> Regexp.new "\w" 
=> /w/ 
>> Regexp.new "\\w" 
=> /\w/ 

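The backslash is the escape character in double-quoted strings. If you want a literal backslash to reach the regex, you have to double it.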

>> string = "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009- 07-28 21:21:47\n\n Fixed typo\n" 
=> "[ALERT] Project: Revision ...123456 committed by Me <[email protected]>\n on 2009- 07-28 21:21:47\n\n Fixed typo\n" 
>> r = Regexp.new("[A-Za-z]+: Revision ...[\\w]+ committed by [A-Za-z\\s]+") 
=> /[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+/ 
>> string[r] 
=> "Project: Revision ...123456 committed by Me " 

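In general, if you had pasted the output of the "broken" lines rather than just the input, you would probably have spotted that the \w and \s weren't being escaped properly.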
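To make that concrete, here is a small sketch that prints what Regexp.new actually receives in each case. The variable names are made up for illustration, and the pattern is shortened from the one in the question.

```ruby
# A double-quoted literal processes escapes before Regexp.new sees the string:
# "\w" collapses to a plain "w", and "\s" becomes a space character.
broken = "[\w]+ by [A-Za-z\s]+"
fixed  = "[\\w]+ by [A-Za-z\\s]+"

broken_source = Regexp.new(broken).source
fixed_source  = Regexp.new(fixed).source

puts broken_source  # prints: [w]+ by [A-Za-z ]+
puts fixed_source   # prints: [\w]+ by [A-Za-z\s]+
```

Checking Regexp#source (or just the irb echo of the Regexp) is a quick way to see whether the backslashes survived.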

Perfect, thanks - I knew I must be doing something subtly wrong. – davidsmalley 2009-08-01 08:13:57

0

Option 1:

# Escape the backslashes: 
r = Regexp.new("[A-Za-z]+: Revision ...[\\w]+ committed by [A-Za-z\\s]+") 

Drawback: you have to escape every backslash sequence by hand
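As a possible way around the doubling, a single-quoted literal leaves \w and \s untouched. This is only a sketch: the message string below is a shortened, made-up stand-in for the commit message in the question.

```ruby
# Single quotes keep \w and \s as backslash sequences, so nothing needs doubling.
pattern_source = '[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+'
r = Regexp.new(pattern_source)

# Made-up sample input, shaped like the commit message in the question.
message = "[ALERT] Project: Revision ...123456 committed by Me <placeholder>\n"

match = message[r]
p match  # prints: "Project: Revision ...123456 committed by Me "
```

Note that a single-quoted string still treats \\ and \' specially, so a pattern containing those would need extra care.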

Option 2:

# Pass a regex literal to the constructor 
r = Regexp.new(/[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+/) 

Drawback: none

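Regarding option 2 - the constructor argument will always be a string, because the regular expression is being pulled from the database, so that doesn't work in this case. – davidsmalley 2009-08-01 08:16:58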
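For the database case, the doubling only matters for string literals written in Ruby source; a string that arrives as data already contains its backslashes. A rough sketch of compiling such a value follows, where compile_pattern is a hypothetical helper and the single-quoted literal merely stands in for whatever the database returns.

```ruby
# Hypothetical helper: compile pattern text that arrived as data.
# Returns nil when the text is not a valid regular expression.
def compile_pattern(source)
  Regexp.new(source)
rescue RegexpError
  nil
end

# Stand-in for a value read from the database at run time.
stored = '[A-Za-z]+: Revision ...[\w]+ committed by [A-Za-z\s]+'

good = compile_pattern(stored)
bad  = compile_pattern('(unclosed')

p good.source == stored  # prints: true
p bad                    # prints: nil
```

Rescuing RegexpError is one option for handling invalid stored patterns; whether to return nil, raise, or log is a design choice left to the caller.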