Why is this negative lookbehind wrong?

def get_hashtags(post) 
    tags = [] 
    post.scan(/(?<![0-9a-zA-Z])(#+)([a-zA-Z]+)/){|x,y| tags << y} 
    tags 
end

Test.assert_equals(get_hashtags("two hashs##in middle of word#"), []) 
#Expected: [], instead got: ["in"]

Shouldn't the lookbehind check whether the match is preceded by a letter or a digit and reject it if so? Why does it still accept 'in' as a valid match?

Source: 2015-10-19, Chris

Because the pattern succeeds at the second # (which is not preceded by a character in '[0-9a-zA-Z]').
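To make the comment above concrete, here is a small sketch that reports where the original pattern matches and which character the lookbehind actually inspected. The variable names are made up for illustration, and the `fixed` pattern (adding `#` to the lookbehind class) is an assumption about one possible repair, not something proposed in the thread.

```ruby
post = "two hashs##in middle of word#"
pattern = /(?<![0-9a-zA-Z])(#+)([a-zA-Z]+)/

m = post.match(pattern)
match_start = m.begin(0)             # index where the overall match starts
char_before = post[match_start - 1]  # the character the lookbehind inspected

# The lookbehind rejects the first '#' (it is preceded by 's'), so the engine
# retries one position later, at the second '#', which is preceded by '#'.
puts "match #{m[0].inspect} starts at index #{match_start}, preceded by #{char_before.inspect}"

# Possible repair (an assumption, not from the thread): also forbid '#' before
# the match, so a run of pound signs can only be entered at its first character.
fixed = /(?<![0-9a-zA-Z#])(#+)([a-zA-Z]+)/
fixed_tags = post.scan(fixed).map { |_hashes, word| word }
puts fixed_tags.inspect
```

With the extra `#` in the class, the lookbehind fails at both pound signs of `##in`, so the test string from the question yields no tags.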

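You should use \K rather than a negative lookbehind. That lets you simplify your regex considerably: there is no need for a predefined array, capture groups, or a block.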

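\K means "discard everything matched so far". The key point is that a variable-length match can precede \K, whereas (in Ruby and most other languages) variable-length matches are not permitted inside lookbehinds, negative or positive. Here is the regex, written in extended mode: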

r = /
    [^0-9a-zA-Z#] # match one character that is not a letter, digit or pound sign
    \#+           # match one or more pound signs
    \K            # discard everything matched so far
    [a-zA-Z]+     # match one or more letters
    /x            # extended mode

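Note that the # in \#+ has to be escaped only because the regex is written in extended mode, where an unescaped # outside a character class begins a comment. In a regex not written in extended mode it would not need to be escaped.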

"two hashs##in middle of word#".scan r 
    #=> [] 

"two hashs&#in middle of word#".scan r 
    #=> ["in"] 

"two hashs#in middle of word&#abc of another word.###def ".scan r 
    #=> ["abc", "def"]
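As a rough sketch of how this could be folded back into the question's `get_hashtags` method: the version below uses the same \K idea in non-extended form. The `(?:\A|...)` alternative is an added assumption, not part of the answer above; the answer's regex requires some character before the pound sign, so a hashtag at the very start of the post would otherwise be missed.

```ruby
def get_hashtags(post)
  # Everything before \K must match but is dropped from the result, so scan
  # returns only the letters. With no capture groups, scan yields plain strings.
  # (?:\A|[^0-9a-zA-Z#]) means: start of string, or one character that is
  # not a letter, digit or '#'. The \A branch is an assumption added here.
  post.scan(/(?:\A|[^0-9a-zA-Z#])#+\K[a-zA-Z]+/)
end

p get_hashtags("two hashs##in middle of word#")
p get_hashtags("#ruby is &#fun")
```

Because the group is non-capturing and the prefix is discarded by \K, no array, block, or capture group is needed, which is the simplification the answer describes.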

Source: 2015-10-19 05:49:22

I have been looking for this solution for a long time now. Thanks, man. – UsamaMan
