Java regex pattern matching

I need to check a pattern against some text (I have to check whether my pattern occurs in a lot of texts).

Here is my example:

String pattern = "^[a-zA-Z ]*toto win(\\W)*[a-zA-Z ]*$";
if ("toto win because of".matches(pattern)) {
    System.out.println("we have a winner");
} else {
    System.out.println("we DON'T have a winner");
}

In my tests the pattern has to match, but with my regex it does not. These must match:

" toto win bla bla" 

"toto win because of" 
"toto win. bla bla" 


"here. toto win. bla bla" 
"here? toto win. bla bla" 

"here %dfddfd . toto win. bla bla"

These must not match:

" -toto win bla bla" 
" pretoto win bla bla"

I tried to do this with my regex, but it does not work.

Can you point out what I am doing wrong?

Source

2012-06-12 CC.

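Will the quotation marks appear in the input strings? – Cylian

It can be anything. It is ordinary text. –

Please [don't add signatures and taglines to your posts](http://stackoverflow.com/faq#signatures). You also frequently misspell "a lot". There is a space between "a" and "lot". – meagar

This will work: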

(?im)^[?.\s%a-z]*?\btoto win\b.+$

Explanation

"(?im)" +   // Match the remainder of the regex with the options: case insensitive (i);^and $ match at line breaks (m) 
"^" +    // Assert position at the beginning of a line (at beginning of the string or after a line break character) 
"[?.\\s%a-z]" + // Match a single character present in the list below 
        // One of the characters “?.” 
        // A whitespace character (spaces, tabs, and line breaks) 
        // The character “%” 
        // A character in the range between “a” and “z” 
    "*?" +   // Between zero and unlimited times, as few times as possible, expanding as needed (lazy) 
"\\b" +   // Assert position at a word boundary 
"toto\\ win" +  // Match the characters “toto win” literally 
"\\b" +   // Assert position at a word boundary 
"." +    // Match any single character that is not a line break character 
    "+" +    // Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
"$"    // Assert position at the end of a line (at the end of the string or before a line break character)

UPDATE 1

(?im)^[?~`'[email protected]#$%^&*+.\s%a-z]*? toto win\b.*$

UPDATE 2

(?im)^[^-]*?\btoto win\b.*$

UPDATE 3

(?im)^.*?(?<!-)toto win\b.*$

Explanation

"(?im)" +  // Match the remainder of the regex with the options: case insensitive (i);^and $ match at line breaks (m) 
"^" +   // Assert position at the beginning of a line (at beginning of the string or after a line break character) 
"." +   // Match any single character that is not a line break character 
    "*?" +   // Between zero and unlimited times, as few times as possible, expanding as needed (lazy) 
"(?<!" +  // Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) 
    "-" +   // Match the character “-” literally 
")" + 
"toto\\ win" + // Match the characters “toto win” literally 
"\\b" +   // Assert position at a word boundary 
"." +   // Match any single character that is not a line break character 
    "*" +   // Between zero and unlimited times, as many times as possible, giving back as needed (greedy) 
"$"    // Assert position at the end of a line (at the end of the string or before a line break character)

The regex needs to be escaped for use inside code.
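
As a rough illustration of the escaping point, the sketch below writes the patterns from this answer as Java string literals (each backslash doubled) and runs them against the sample strings from the question. The class name `AnswerPatterns`, the helper `found`, and the `VARIANT` pattern are assumptions added for illustration, not part of the original answer; UPDATE 1 is left out. Tracing it by hand, UPDATE 3 still finds a match in " pretoto win bla bla", because `(?<!-)` only rules out a hyphen directly before "toto". The variant extends the lookbehind to `(?<![-\w])` so that a preceding word character is rejected as well.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class AnswerPatterns {
    // Patterns from the answer, escaped for Java string literals.
    static final String FIRST    = "(?im)^[?.\\s%a-z]*?\\btoto win\\b.+$";
    static final String UPDATE_2 = "(?im)^[^-]*?\\btoto win\\b.*$";
    static final String UPDATE_3 = "(?im)^.*?(?<!-)toto win\\b.*$";
    // Assumed variant (not from the answer): reject "-" or a word character before "toto".
    static final String VARIANT  = "(?im)^.*?(?<![-\\w])toto win\\b.*$";

    // Hypothetical helper: true if the regex matches somewhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        Map<String, String> patterns = new LinkedHashMap<>();
        patterns.put("first", FIRST);
        patterns.put("update2", UPDATE_2);
        patterns.put("update3", UPDATE_3);
        patterns.put("variant", VARIANT);

        String[] samples = {
            " toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            "here. toto win. bla bla",
            "here? toto win. bla bla",
            "here %dfddfd . toto win. bla bla",
            "here! toto win dfddfd",
            " -toto win bla bla",
            " pretoto win bla bla"
        };

        // Print one row per sample with the result of every pattern.
        for (String s : samples) {
            StringBuilder row = new StringBuilder(String.format("%-36s", "\"" + s + "\""));
            for (Map.Entry<String, String> e : patterns.entrySet()) {
                row.append(' ').append(e.getKey()).append('=').append(found(e.getValue(), s));
            }
            System.out.println(row);
        }
    }
}
```

The patterns use `find()` here; since each one is anchored with `^` and `$`, `matches()` should give the same result for single-line input.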

Source

2012-06-12 09:44:48 Cylian

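This string does not match: "here! toto win dfddfd" –

Actually, there can be any characters. Imagine the text on a website: it can contain anything. The only cases I want to exclude are ones like "blatoto win" or "-toto win". –

Great. It does what I want. Thanks a lot. –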

Your pattern is missing the space between "win" and the next word.

Try this: \\stoto\\swin\\s\\w

At http://gskinner.com/RegExr/ you can try out your regex.

Source

2012-06-12 08:58:11 dantuch

You mean I must have String pattern = "(\\s)*toto win(\\s)*(\\W)*"; ? –

@CC. See my edit – dantuch

@CC, sorry, it should work now. – dantuch

The following regex

^[a-zA-Z. ]*toto win[a-zA-Z. ]*$

will match

toto win bla bla 
toto win because of 
toto win. bla bla

and will not match

-toto win bla bla
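
A small sketch for trying this character-class pattern with `String.matches`, which requires the whole string to match. The class name `CharClassDemo` is hypothetical. The last sample comes from the question's must-not-match list and is included because, traced by hand, the leading class `[a-zA-Z. ]*` can absorb the letters "pre", so the pattern accepts it.

```java
class CharClassDemo {
    // The pattern from the answer above (no backslashes, so no extra escaping needed).
    static final String PATTERN = "^[a-zA-Z. ]*toto win[a-zA-Z. ]*$";

    public static void main(String[] args) {
        String[] samples = {
            "toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            "-toto win bla bla",
            " pretoto win bla bla"
        };
        // String.matches succeeds only if the entire string fits the pattern.
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + s.matches(PATTERN));
        }
    }
}
```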

Source

2012-06-12 09:00:01 buckley

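This seems great, but a string like "toto win. bla bla" does not work. Any ideas? –

Updated my answer. In your question you mention "special" characters. I added the dot. Decide which characters you consider special and add them to the character class. Do you see? Add more as needed. – buckley

I see. I just updated my question. It still does not quite work. I don't know how to require that no character comes directly before my pattern. –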

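Just change your code to String pattern = "\\s*toto win[\\w\\s]*";

\W means a non-word character; \w means a word character (a-zA-Z_0-9).

[\\w\\s]* will match any number of word characters and whitespace after "toto win".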
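
To make the difference between `\w` and `\W` concrete, here is a minimal sketch using `String.matches`. The class name `WordClassDemo` is made up for illustration. It also shows that the suggested pattern rejects a sample containing a dot, since "." is in neither `\w` nor `\s`.

```java
class WordClassDemo {
    // Pattern suggested above; String.matches requires the whole input to match.
    static final String PATTERN = "\\s*toto win[\\w\\s]*";

    public static void main(String[] args) {
        System.out.println("a".matches("\\w")); // true: letter is a word character
        System.out.println("_".matches("\\w")); // true: underscore counts as a word character
        System.out.println(".".matches("\\w")); // false
        System.out.println(".".matches("\\W")); // true: "." is a non-word character

        System.out.println(" toto win bla bla".matches(PATTERN));  // true
        System.out.println("toto win. bla bla".matches(PATTERN));  // false: "." is not in [\w\s]
        System.out.println(" -toto win bla bla".matches(PATTERN)); // false: "-" is not whitespace
    }
}
```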

UPDATE

To reflect the new requirements, this expression will work:

"((.*\\s)+|^)toto win[\\w\\s\\p{Punct}]*"

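((.*\\s)+|^) matches either anything followed by at least one whitespace character, or the start of the line.

[\\w\\s\\p{Punct}]* matches any combination of word characters, digits, whitespace and punctuation.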
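
A short sketch that runs the updated expression against the samples from the question, again with `String.matches` so the whole string has to fit. The class name `UpdatedPatternDemo` and the helper `accepts` are hypothetical.

```java
class UpdatedPatternDemo {
    // The updated expression from the answer above.
    static final String PATTERN = "((.*\\s)+|^)toto win[\\w\\s\\p{Punct}]*";

    // Hypothetical helper: true if the entire text fits the pattern.
    static boolean accepts(String text) {
        return text.matches(PATTERN);
    }

    public static void main(String[] args) {
        String[] samples = {
            " toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            "here. toto win. bla bla",
            "here? toto win. bla bla",
            "here %dfddfd . toto win. bla bla",
            " -toto win bla bla",
            " pretoto win bla bla"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

Because "toto win" must be preceded by whitespace or sit at the very start, both " -toto win bla bla" and " pretoto win bla bla" are rejected. The nested quantifier in `(.*\\s)+` can backtrack heavily on long inputs that do not match, so this is probably best kept to short strings.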

Source

2012-06-12 09:07:44 Keppil

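It would be easier if you included the actual requirements rather than a list of things to match. I strongly suspect that "toto winabc" should not match, but I am not sure, because you have not included such an example or explained the requirements. Anyway, this works for all of your current examples: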

import java.util.regex.Pattern;

public class TotoWinTest {

    static String[] matchThese = new String[] {
        " toto win bla bla",
        "toto win because of",
        "toto win. bla bla",
        "here. toto win. bla bla",
        "here? toto win. bla bla",
        "here %dfddfd . toto win. bla bla"
    };

    static String[] dontMatchThese = new String[] {
        " -toto win bla bla",
        " pretoto win bla bla"
    };

    public static void main(String[] args) {
        // Either the beginning of the input or a whitespace character, followed by "toto win".
        Pattern p = Pattern.compile("(^|\\s)toto win");

        // find() searches anywhere in the string, so the surrounding text can be anything.
        System.out.println("Should match:");
        for (String s : matchThese) {
            System.out.println(p.matcher(s).find());
        }

        System.out.println("Shouldn't match:");
        for (String s : dontMatchThese) {
            System.out.println(p.matcher(s).find());
        }
    }
}
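
The answer relies on `Matcher.find()`, which searches for the pattern anywhere in the input, whereas `String.matches` and `Matcher.matches()` require the entire input to match. Here is a minimal sketch of the difference. The class name `FindVersusMatches` and the `BOUNDED` variant with a trailing `\b` are assumptions for illustration; the variant would reject "toto winabc", a requirement the answer only suspects.

```java
import java.util.regex.Pattern;

class FindVersusMatches {
    // Pattern from the answer above.
    static final Pattern LOOSE = Pattern.compile("(^|\\s)toto win");
    // Assumed variant: a word boundary after "win" rejects "toto winabc".
    static final Pattern BOUNDED = Pattern.compile("(^|\\s)toto win\\b");

    public static void main(String[] args) {
        String text = "here. toto win. bla bla";

        // find() succeeds as soon as the pattern occurs somewhere in the text.
        System.out.println(LOOSE.matcher(text).find());    // true
        // matches() fails because the text contains more than just the pattern.
        System.out.println(LOOSE.matcher(text).matches()); // false

        System.out.println(LOOSE.matcher("toto winabc").find());   // true
        System.out.println(BOUNDED.matcher("toto winabc").find()); // false
        System.out.println(BOUNDED.matcher(text).find());          // true
    }
}
```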

Source

2012-06-12 10:56:07

I gave examples of the kind of text that should match. The text can be anything, so I cannot use your approach. Thanks anyway. –
