Use regex to find all matches - greedy and non-greedy!

Take the following string: "Marketing and Cricket on the Internet".

I want to use a regex to find every possible match of "Ma"-any text-"et". So..

Market
Marketing and Cricket
Marketing and Cricket on the Internet

The regex Ma.*et returns "Marketing and Cricket on the Internet". The regex Ma.*?et returns Market. But I'd like a regex that returns all 3. Is that possible?

Thanks.

2010-11-03 Rastaboy

Hm, do you really need a regular expression for this? – Gumbo 2010-11-03 21:03:52

LEPL is a parsing library for Python whose regular expressions can "yield" all possible matches. – delnan 2010-11-03 21:07:13

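As far as I know: no. But you could match non-greedily first, then generate a new regex with a raised minimum quantifier to get the second match. Market has two characters between Ma and et, so requiring at least three skips it and yields the next longer match. Like this: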

Ma.*?et 
Ma.{3,}?et

...and so on...
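The idea of repeatedly raising the minimum quantifier could be automated along these lines. This is only a sketch in Python, not the answerer's code: the function name `all_matches_by_growing_quantifier` and its `prefix`/`suffix` parameters are made up for illustration, and it assumes the pattern has the simple shape prefix, any text, suffix. It reports matches with strictly increasing lengths as found by a leftmost search, so it does not enumerate matches for every possible starting position.

```python
import re

def all_matches_by_growing_quantifier(prefix, suffix, text):
    """Collect prefix...suffix matches whose middle part keeps getting longer."""
    results = []
    min_len = 0
    while True:
        # Lazy quantifier with a rising lower bound, e.g. Ma.{3,}?et
        pattern = re.escape(prefix) + r".{%d,}?" % min_len + re.escape(suffix)
        m = re.search(pattern, text)
        if m is None:
            break
        results.append(m.group(0))
        # Require a strictly longer middle part on the next round.
        middle_len = len(m.group(0)) - len(prefix) - len(suffix)
        min_len = middle_len + 1
    return results

text = "Marketing and Cricket on the Internet"
for match in all_matches_by_growing_quantifier("Ma", "et", text):
    print(match)
```

Computing the next lower bound from the previous match avoids trying every count one by one, at the cost of compiling a new pattern per round.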

2010-11-03 21:08:09 thejh

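Unfortunately, this is not possible with a standard POSIX regex, which returns a single match (the best candidate according to the regex rules). Assuming you are using this in a program, you will need to rely on an extended feature, if one exists in the programming language you are using the regex from, to accomplish this.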
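As one illustration of leaning on the host language rather than on the regex itself, a brute-force sketch in Python could test every substring against the pattern. The helper name `all_substring_matches` is hypothetical, and the approach makes a quadratic number of regex calls, so it is only reasonable for short inputs. It relies on `Pattern.fullmatch(string, pos, endpos)` from Python's standard `re` module.

```python
import re

def all_substring_matches(pattern, text):
    """Return (start, end, value) for every non-empty substring that fully matches."""
    regex = re.compile(pattern)
    found = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            # fullmatch succeeds only if text[start:end] matches in its entirety.
            if regex.fullmatch(text, start, end):
                found.append((start, end, text[start:end]))
    return found

text = "Marketing and Cricket on the Internet"
for start, end, value in all_substring_matches(r"Ma.*et", text):
    print(start, end, value)
```

Unlike a single greedy or lazy search, this also reports matches that begin at later positions in the text.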

2010-11-03 21:04:53

Thanks guys, that really helped. Here's what I came up with for PHP:

function preg_match_ubergreedy($regex, $text) {

    $matched = array();

    // Replace "*" with an exact repeat count {i} and try every possible length.
    for ($i = 0; $i < strlen($text); $i++) {
        $exp = str_replace("*", "{" . $i . "}", $regex);
        // preg_match returns 1 only when this exact length produces a match.
        if (preg_match($exp, $text, $matches)) {
            $matched[] = $matches[0];
        }
    }

    return $matched;

}
$text = "Marketing and Cricket on the Internet";
$matches = preg_match_ubergreedy("@Ma.*et@", $text);

2010-11-03 21:32:52 Rastaboy

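For a more general regex, another option would be to recursively match the greedy regex against the previous match, discarding the first and last characters in turn to make sure you only match a proper substring of the previous match. After matching Marketing and Cricket on the Internet, we test both arketing and Cricket on the Internet and Marketing and Cricket on the Interne for submatches.

It goes something like this in C#...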

// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
public static IEnumerable<Match> SubMatches(Regex r, string input)
{
    var result = new List<Match>();

    var matches = r.Matches(input);
    foreach (Match m in matches)
    {
        result.Add(m);

        if (m.Value.Length > 1)
        {
            // Drop the last character, then search inside what is left.
            string prefix = m.Value.Substring(0, m.Value.Length - 1);
            result.AddRange(SubMatches(r, prefix));

            // Drop the first character and do the same.
            string suffix = m.Value.Substring(1);
            result.AddRange(SubMatches(r, suffix));
        }
    }

    return result;
}

This version can, however, end up returning the same submatch several times. For example, it would find Marmoset twice in Marketing and Marmosets on the Internet: first as a submatch of Marketing and Marmoset, then as a submatch of Marmosets on the Internet.
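One way to avoid the duplicates might be to remember each match by its position in the original text. The following is a Python sketch of the same recursive idea rather than a port of the C# above; the names `sub_matches` and `explore` are invented for illustration. It assumes the regex behaves the same on a fragment as on the full text, which may not hold for patterns using anchors or lookarounds.

```python
import re

def sub_matches(regex, text):
    """Return every distinct greedy match and submatch, ordered by position."""
    seen = set()  # (start, end) offsets in the original text

    def explore(fragment, offset):
        for m in regex.finditer(fragment):
            span = (offset + m.start(), offset + m.end())
            if span in seen:
                continue  # already recorded, and its submatches already explored
            seen.add(span)
            value = m.group(0)
            if len(value) > 1:
                explore(value[:-1], span[0])      # drop the last character
                explore(value[1:], span[0] + 1)   # drop the first character

    explore(text, 0)
    return [text[start:end] for start, end in sorted(seen)]

text = "Marketing and Marmosets on the Internet"
for value in sub_matches(re.compile(r"Ma.*et"), text):
    print(value)
```

Keying on offsets rather than on the matched string means two identical strings at different positions are still reported separately.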

2010-11-03 22:24:29 stevemegson
