2017-09-14 123 views

C# Regex - Get values from a repeatable group

I have this regex pattern, and I am trying to find out whether a sentence (string) matches it.

My pattern:

@"^A\s(?<TERM1>[A-Z][a-z]{1,})\sconsists\sof\s((?<MINIMUM1>(\d+))\sto\s(?<MAXIMUM1>(\d+|many){1})|(?<MINMAX1>(\d+|many{1}){1}){1})\s(?<TERM2>[A-Z][a-z]{1,})(\sand\s((?#********RepeatablePart********)(?<MINIMUM2>(\d+))\sto\s(?<MAXIMUM2>(\d+|many){1})|(?<MINMAX2>(\d+|many{1}){1}){1})\s(?<TERM3>([A-Z][a-z]{1,})))+\.$"

How to read my pattern:

A (TERM1) consists of (MINIMUM1 to (MAXIMUM1|many)|(MINMAX1|many)) (TERM2) ((?#********RepeatablePart********)and (MINIMUM2 to (MAXIMUM2|many)|(MINMAX2|many)) (TERM3))+.

MINMAX1/MINMAX2 can be a number or just the word 'many'. MINIMUM1/MINIMUM2 is always a number, and MAXIMUM1/MAXIMUM2 can be a number or the word 'many'.

Example sentences:

  1. A Car consists of 2 to 5 Seats and 1 Breakpedal and 1 Gaspedal and 4 to 6 Windows.
  2. A Tree consists of many Apples and 2 to many Colors and 0 to 1 Squirrel and many Leaves.
  3. A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.

    1. Will contain: TERM1 = Car, MINIMUM1 = 2, MAXIMUM1 = 5, MINMAX1 = null, TERM2 = Seats, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Breakpedal, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Gaspedal, MINIMUM2 = 4, MAXIMUM2 = 6, MINMAX2 = null, TERM3 = Windows
    2. Will contain: TERM1 = Tree, MINIMUM1 = null, MAXIMUM1 = null, MINMAX1 = many, TERM2 = Apples, MINIMUM2 = 2, MAXIMUM2 = many, MINMAX2 = null, TERM3 = Colors, MINIMUM2 = 0, MAXIMUM2 = 1, MINMAX2 = null, TERM3 = Squirrel, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = many, TERM3 = Leaves
    3. Will contain: TERM1 = Book, MINIMUM1 = 1, MAXIMUM1 = many, MINMAX1 = null, TERM2 = Authors, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Title, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 3, TERM3 = Bookmarks

I created a class that I want to fill with the values from the repeatable part of my string (MINIMUM2, MAXIMUM2, MINMAX2 and TERM3):

// MyObject holds the values of one expression from the repeatable part.
public class MyObject 
{ 
    public string term { get; set; } 
    public string min { get; set; } 
    public string max { get; set; } 
    public string minmax { get; set; } 
} 

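Because my pattern has a repeatable part (+), I want to create a List<MyObject> and add one new MyObject per repetition, filled with the values of the groups captured in that repetition.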

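My problem is that I don't know how to fill my objects with the values of the repeatable part. The way I tried to write the code is wrong, because my lists do not have the same number of values: in a sentence (e.g. 'A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.') a repeatable part never has a MINIMUM2, a MAXIMUM2 and a MINMAX2 all at once. Each repetition captures either MINIMUM2 and MAXIMUM2, or MINMAX2 alone, so those lists can be shorter than the TERM3 list and their indexes no longer line up.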
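The mismatch can be made visible by counting the captures per group. The following is only a minimal sketch: it assumes the group name is spelled MINIMUM2, it uses just the repeatable tail of the pattern in a simplified form, and the class and method names (`CaptureCountDemo`, `Counts`) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureCountDemo
{
    // Simplified repeatable tail of the pattern (assumption: group spelled MINIMUM2).
    public const string Pattern =
        @"(\sand\s((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+))+\.$";

    // Returns the capture counts for MINIMUM2, MAXIMUM2, MINMAX2 and TERM3, in that order.
    public static int[] Counts(string sentence)
    {
        Match m = Regex.Match(sentence, Pattern);
        return new[]
        {
            m.Groups["MINIMUM2"].Captures.Count,
            m.Groups["MAXIMUM2"].Captures.Count,
            m.Groups["MINMAX2"].Captures.Count,
            m.Groups["TERM3"].Captures.Count
        };
    }

    public static void Main()
    {
        int[] c = Counts("A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.");
        Console.WriteLine(string.Join(", ", c)); // prints "0, 0, 2, 2"
    }
}
```

For the Book sentence there are two TERM3 captures but no MINIMUM2 capture at all, so indexing every list with the same `i` cannot work.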

Is there a simpler way to fill my objects, or how can I get the values from the quantified part?

My code (in C#):

using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var match = Regex.Match(exampleText, pattern);
if (match.Success)
{
    string term1 = match.Groups["TERM1"].Value;
    string minimum1 = match.Groups["MINIMUM1"].Value;
    string maximum1 = match.Groups["MAXIMUM1"].Value;
    string minmax1 = match.Groups["MINMAX1"].Value;
    string term2 = match.Groups["TERM2"].Value;

    // --> Groups[...].Captures...ToList() might be wrong. Maybe there is a better way to get the values of the repeatable part.
    List<string> minimums2 = match.Groups["MINIMUM2"].Captures.Cast<Capture>().Select(x => x.Value).ToList();
    List<string> maximums2 = match.Groups["MAXIMUM2"].Captures.Cast<Capture>().Select(x => x.Value).ToList();
    List<string> minmaxs2 = match.Groups["MINMAX2"].Captures.Cast<Capture>().Select(x => x.Value).ToList();
    List<string> terms3 = match.Groups["TERM3"].Captures.Cast<Capture>().Select(x => x.Value).ToList();

    List<MyObject> myList = new List<MyObject>();

    for (int i = 0; i < terms3.Count; i++)
    {
        myList.Add(new MyObject()
        {
            term = terms3[i],
            min = minimums2[i],   // --> ERROR MIGHT HAPPEN when minimums2 does not have as many values as terms3
            max = maximums2[i],   // --> ERROR...
            minmax = minmaxs2[i]  // --> ERROR...
        });
    }
}

Answer


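I was able to solve my problem myself by splitting my exampleText at the word 'and'. This gives me a string array 'splittedText' that contains one phrase for each repeatable part of my pattern.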

string[] splittedText = Regex.Split(exampleText, @"\sand\s"); 
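To see what the split produces, here is a small sketch that runs it on the first example sentence. The wrapper names (`SplitDemo`, `SplitPhrases`) are hypothetical and only there to make the snippet self-contained.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Splits a sentence at every " and " (whitespace, the word "and", whitespace).
    public static string[] SplitPhrases(string sentence)
    {
        return Regex.Split(sentence, @"\sand\s");
    }

    public static void Main()
    {
        string sentence = "A Car consists of 2 to 5 Seats and 1 Breakpedal and 1 Gaspedal and 4 to 6 Windows.";
        foreach (string phrase in SplitPhrases(sentence))
        {
            Console.WriteLine("[" + phrase + "]");
        }
        // prints:
        // [A Car consists of 2 to 5 Seats]
        // [1 Breakpedal]
        // [1 Gaspedal]
        // [4 to 6 Windows.]
    }
}
```

Note that only the last phrase still carries the closing period, so a pattern applied to each phrase should not require a period on every one of them.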

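After splitting my exampleText, I go through the individual phrases in a for loop. For each phrase I run another Regex.Match to get the values I need and put them into a new MyObject.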

// The period is optional (\.?) because only the last phrase of the sentence ends with '.'.
string pattern2 = @"^((?#********RepeatablePart********)(?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+)\.?$";
List<MyObject> myList = new List<MyObject>();

// i = 1 -> since splittedText[0] contains the beginning of the sentence (e.g. 'A Car consists of 2 to 5 Seats')
for (int i = 1; i < splittedText.Length; i++)
{
    var match2 = Regex.Match(splittedText[i], pattern2);
    if (match2.Success)
    {
        // Groups that did not take part in the match return an empty string from .Value.
        myList.Add(new MyObject()
        {
            term = match2.Groups["TERM3"].Value,
            min = match2.Groups["MINIMUM2"].Value,
            max = match2.Groups["MAXIMUM2"].Value,
            minmax = match2.Groups["MINMAX2"].Value
        });
    }
}
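A single-pass alternative that avoids the split may also be worth considering. The sketch below is not from the original post. It assumes an extra named group, here called ITEM (a made-up name), wrapped around one whole repetition, and it assigns each MINIMUM2/MAXIMUM2/MINMAX2/TERM3 capture to the ITEM capture whose character range contains it, using `Capture.Index` and `Capture.Length`. The pattern is a simplified form of the one in the question, and `Part` and `RepeatedGroupReader` are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Part
{
    public string Term { get; set; }
    public string Min { get; set; }
    public string Max { get; set; }
    public string MinMax { get; set; }
}

public static class RepeatedGroupReader
{
    // ITEM (hypothetical group name) wraps one full repetition, e.g. "4 to 6 Windows".
    private const string Pattern =
        @"^A\s(?<TERM1>[A-Z][a-z]+)\sconsists\sof\s((?<MINIMUM1>\d+)\sto\s(?<MAXIMUM1>\d+|many)|(?<MINMAX1>\d+|many))\s(?<TERM2>[A-Z][a-z]+)" +
        @"(\sand\s(?<ITEM>((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+)))+\.$";

    // Returns the value of the group's capture that lies inside 'item', or null if there is none.
    private static string Inside(Match match, string groupName, Capture item)
    {
        return match.Groups[groupName].Captures
            .Cast<Capture>()
            .Where(c => c.Index >= item.Index && c.Index + c.Length <= item.Index + item.Length)
            .Select(c => c.Value)
            .FirstOrDefault();
    }

    // Builds one Part per repetition of the repeatable section.
    public static List<Part> Read(string sentence)
    {
        var parts = new List<Part>();
        Match match = Regex.Match(sentence, Pattern);
        if (!match.Success)
        {
            return parts;
        }

        foreach (Capture item in match.Groups["ITEM"].Captures)
        {
            parts.Add(new Part
            {
                Term = Inside(match, "TERM3", item),
                Min = Inside(match, "MINIMUM2", item),
                Max = Inside(match, "MAXIMUM2", item),
                MinMax = Inside(match, "MINMAX2", item)
            });
        }
        return parts;
    }

    public static void Main()
    {
        foreach (Part p in Read("A Car consists of 2 to 5 Seats and 1 Breakpedal and 1 Gaspedal and 4 to 6 Windows."))
        {
            Console.WriteLine($"{p.Term}: min={p.Min}, max={p.Max}, minmax={p.MinMax}");
        }
        // prints:
        // Breakpedal: min=, max=, minmax=1
        // Gaspedal: min=, max=, minmax=1
        // Windows: min=4, max=6, minmax=
    }
}
```

Compared with splitting at 'and', this keeps the whole sentence validated by one pattern, at the cost of a position lookup per group. Groups that did not take part in a repetition come back as null here rather than as an empty string.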