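How to make a regex capture only named groups

According to the Regex documentation, using RegexOptions.ExplicitCapture makes the Regex capture only named groups such as (?<groupName>...); in practice, though, it does something slightly different.

Consider these few lines of code: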

static void Main(string[] args) {
    Regex r = new Regex(
        @"(?<code>^(?<l1>[\d]{2})/(?<l2>[\d]{3})/(?<l3>[\d]{2})$|^(?<l1>[\d]{2})/(?<l2>[\d]{3})$|(?<l1>^[\d]{2}$))",
        RegexOptions.ExplicitCapture
    );
    var x = r.Match("32/123/03");
    r.GetGroupNames().ToList().ForEach(gn => {
        Console.WriteLine("GroupName:{0,5} --> Value: {1}", gn, x.Groups[gn].Success ? x.Groups[gn].Value : "");
    });
}

When you run this code, the result contains a group named 0, even though I never named a group 0 in my regex!

GroupName: 0 --> Value: 32/123/03 
GroupName: code --> Value: 32/123/03 
GroupName: l1 --> Value: 32 
GroupName: l2 --> Value: 123 
GroupName: l3 --> Value: 03 
Press any key to continue . . .

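Can someone please explain this behavior to me?

Source

2015-04-01 Achilles

Group *zero* matches the entire regular expression – 2015-04-01 18:56:43

@AlexK. Do you mean I have to ignore the first group? – Achilles 2015-04-01 19:07:38

You always have group 0: it is the entire match. Numbered groups are numbered starting from 1, based on the ordinal position of the opening parenthesis that defines each group. Your regex (formatted for clarity):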

(?<code> 
^
    (?<l1> [\d]{2}) 
/
    (?<l2> [\d]{3}) 
/
    (?<l3> [\d]{2}) 
    $ 
| 
^
    (?<l1>[\d]{2}) 
/
    (?<l2>[\d]{3}) 
    $ 
| 
    (?<l1> ^[\d]{2} $) 
)
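
Since group 0 is always present, one way to list only the named groups is to skip every name that is just a number. The following is a minimal sketch, not the only way to do it: `NamedGroupsDemo` and `NamedGroupNames` are hypothetical names introduced here for illustration, and the pattern is a shortened stand-in for the one in the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class NamedGroupsDemo
{
    // Hypothetical helper: keeps only the group names that are not plain
    // integers, which drops group "0" and any other numbered groups.
    public static string[] NamedGroupNames(Regex regex) =>
        regex.GetGroupNames()
             .Where(name => !int.TryParse(name, out _))
             .ToArray();

    static void Main()
    {
        // Shortened stand-in for the pattern in the question (an assumption).
        var r = new Regex(
            @"^(?<code>(?<l1>\d{2})(/(?<l2>\d{3})(/(?<l3>\d{2}))?)?)$",
            RegexOptions.ExplicitCapture);

        Match m = r.Match("32/123/03");

        // Prints code, l1, l2 and l3, but not group 0.
        foreach (string name in NamedGroupNames(r))
            Console.WriteLine("GroupName:{0,5} --> Value: {1}", name, m.Groups[name].Value);
    }
}
```

The filter relies on the fact that `GetGroupNames()` reports numbered groups by their number as a string, so anything that parses as an integer is not a name you chose.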

Your expression will backtrack, so you might consider simplifying your regex. This is probably clearer and more efficient:

static Regex rxCode = new Regex(@"
^                       # match start-of-line, followed by
(?<code>                # a mandatory group ('code'), consisting of
  (?<g1> \d\d )         # - 2 decimal digits ('g1'), followed by
  (                     # - an optional group, consisting of
    /                   #   - a literal '/', followed by
    (?<g2> \d\d\d )     #   - 3 decimal digits ('g2'), followed by
    (                   #   - an optional group, consisting of
      /                 #     - a literal '/', followed by
      (?<g3> \d\d )     #     - 2 decimal digits ('g3')
    )?                  #   - END: optional group
  )?                    # - END: optional group
)                       # - END: named group ('code'), followed by
$                       # - end-of-line
" , RegexOptions.IgnorePatternWhitespace|RegexOptions.ExplicitCapture);

Once you have that, something like this:

string[] texts = { "12" , "12/345" , "12/345/67" , } ; 

foreach (string text in texts) 
{ 
    Match m = rxCode.Match(text) ; 
    Console.WriteLine("{0}: match was {1}" , text , m.Success ? "successful" : "NOT successful") ; 
    if (m.Success) 
    { 
        Console.WriteLine(" code: {0}" , m.Groups["code"].Value) ;
        Console.WriteLine(" g1: {0}" , m.Groups["g1"].Value) ;
        Console.WriteLine(" g2: {0}" , m.Groups["g2"].Value) ;
        Console.WriteLine(" g3: {0}" , m.Groups["g3"].Value) ;
    } 
}

produces the expected output:

12: match was successful 
    code: 12 
    g1: 12 
    g2: 
    g3: 
12/345: match was successful 
    code: 12/345 
    g1: 12 
    g2: 345 
    g3: 
12/345/67: match was successful 
    code: 12/345/67 
    g1: 12 
    g2: 345 
    g3: 67

Source

2015-04-01 19:19:25

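+1, and thanks for the cleaner version of my regex. I knew it could be expressed more clearly, but since it worked and I am lazy, I just left it as it was! I will gladly use your version of the regex, and I will ignore group **0**. – Achilles 2015-04-01 19:26:47

Named groups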

^(?<l1>[\d]{2})/(?<l2>[\d]{3})/(?<l3>[\d]{2})$|^(?<l1>[\d]{2})/(?<l2>[\d]{3})$|(?<l1>^[\d]{2}$)


Try this (I removed the first group from your regex) - see demo

Source

2015-04-01 18:55:11 GRUNGER

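It is still the same. Group **0** is still there; and as a side note, I need the 'code' group to be captured. GroupName: 0 --> Value: 32/123/03 GroupName: l1 --> Value: 32 GroupName: l2 --> Value: 123 GroupName: l3 --> Value: 03 Press any key to continue . . . – Achilles 2015-04-01 19:00:59

The pattern "\d+" applied to the text "123" returns an array with 1 group = 123. The pattern "(\d+)" applied to the text "123" returns an array with 2 groups = 123 and 123. The same pattern written as a named group, applied to the text "123", also returns an array with 2 groups, 123 and 123. I think that is how it should be. – GRUNGER 2015-04-01 19:08:56

'\d+' is not named. Wrapping it in a named group would make it named, and the result is the same. I think there is a problem with how the 'GetGroupNames()' method and 'RegexOptions.ExplicitCapture' are being interpreted. – Achilles 2015-04-01 19:12:58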
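
The group counts discussed in these comments can be checked directly. This is a small sketch under stated assumptions: `ExplicitCaptureDemo` and `GroupCount` are hypothetical names, and the group name `num` is made up for illustration. It shows that RegexOptions.ExplicitCapture only stops unnamed parentheses from capturing; group 0 is reported either way.

```csharp
using System;
using System.Text.RegularExpressions;

static class ExplicitCaptureDemo
{
    // Hypothetical helper: how many groups a pattern defines, counting group 0.
    public static int GroupCount(string pattern, RegexOptions options = RegexOptions.None) =>
        new Regex(pattern, options).GetGroupNumbers().Length;

    static void Main()
    {
        // No parentheses: only group 0 (the whole match).
        Console.WriteLine(GroupCount(@"\d+"));                                       // 1

        // Unnamed parentheses capture by default: group 0 and group 1.
        Console.WriteLine(GroupCount(@"(\d+)"));                                     // 2

        // With ExplicitCapture the unnamed parentheses no longer capture.
        Console.WriteLine(GroupCount(@"(\d+)", RegexOptions.ExplicitCapture));       // 1

        // A named group still captures: group 0 plus 'num' (name is an assumption).
        Console.WriteLine(GroupCount(@"(?<num>\d+)", RegexOptions.ExplicitCapture)); // 2
    }
}
```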
