2016-04-24 99 views

Regex: match everything up to an optional capture group

I have the following regular expression:

(.*)(?:([\+\-\*\/])(-?\d+(?:\.\d+)?)) 

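The goal is to capture mathematical expressions of the form (left expression)(operator)(right operand). For example, 1+2+3 should be captured as (1+2)(+)(3). It also handles single operands on each side, e.g. 1+2 is captured as (1)(+)(2).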

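The problem is that this regex does not match a single operand with no operator. For example, 5 should be matched by the first capture group, with nothing in the second and third: (5)()(). If I make the last part optional: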

(.*)(?:([\+\-\*\/])(-?\d+(?:\.\d+)?))? 

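then the first group always captures the entire expression. Is there any way to make the second part optional, but have it take precedence over the greedy match done by the first group?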
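The behaviour described above can be reproduced with a minimal sketch. The class and method names (`GreedyDemo`, `groups`) are hypothetical and exist only for illustration; the pattern is the second one from the question, with backslashes doubled for a Java string literal.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // The question's second pattern: the operator/operand tail is optional.
    static final Pattern OPTIONAL_TAIL =
            Pattern.compile("(.*)(?:([\\+\\-\\*\\/])(-?\\d+(?:\\.\\d+)?))?");

    // Returns groups 1..3 for a full-string match, or null if there is no match.
    // Groups that did not take part in the match are returned as null by Matcher.
    static String[] groups(String input) {
        Matcher m = OPTIONAL_TAIL.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        for (String input : new String[] { "1+2+3", "5" }) {
            System.out.println(input + " -> " + Arrays.toString(groups(input)));
        }
    }
}
```

For both inputs, group 1 holds the whole string and groups 2 and 3 are null. The greedy `(.*)` consumes everything first, and because the tail is optional the engine has no reason to backtrack and give characters back.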

Why not use a parser? – markspace

Have you tried *lazy* matching? [(.+?)(?:([-+*\/])(-?\d+(?:\.\d+)?))?](https://regex101.com/r/rS0yX6/2) –

Making the first group lazy doesn't work. When I do that I can't fully explain the results I get, but they are not what I want. – DaveJohnston

Answers

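Description

This regex will:

  • capture the mathematical expression up to the last operation
  • capture the last operation
  • capture the last number in the mathematical expression
  • assume each number may have a plus or minus sign, indicating that the number is positive or negative
  • assume each number may be a non-integer
  • assume the mathematical expression can contain any number of operations, such as 1+2, 1+2+3, 1+2+3+4, or 1+2+3+4...
  • validate that the string is a mathematical expression. Some edge cases are not accounted for here, such as the use of parentheses or other complex mathematical symbols.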

Raw Regular Expression

Note: in Java you need to escape the backslashes in this regex. To escape them, simply replace every \ with \\.

^(?=(?:[-+*/^]?[-+]?\d+(?:[.]\d+)?)*$)([-+]?[0-9.]+$|[-+]?[0-9.]+(?:[-+*/^][-+]?[0-9.]+)*(?=[-+*/^]))(?:([-+*/^])([-+]?[0-9.]+))?$
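To illustrate the escaping rule from the note above, here is a minimal sketch that uses only the decimal-number fragment of the pattern. The names `EscapeDemo` and `isNumber` are hypothetical and chosen for illustration.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Raw regex:  \d+(?:[.]\d+)?
    // In a Java string literal every backslash is written twice.
    static final Pattern NUMBER = Pattern.compile("\\d+(?:[.]\\d+)?");

    // True when the whole input is an unsigned integer or decimal number.
    static boolean isNumber(String s) {
        return NUMBER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumber("3.14"));
        System.out.println(isNumber("3."));
        // pattern() shows the regex as the engine sees it, with single backslashes.
        System.out.println(NUMBER.pattern());
    }
}
```

Characters such as `[.]` need no extra escaping because they contain no backslash.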

Explanation

Regular expression visualization

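Overview

In this expression I first validate that the string consists only of the operators -+/*^, optional signs -+, and integer or non-integer numbers. Because that has already been validated, the rest of the expression can simply refer to numbers as [0-9.]+, which improves readability.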
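The validation step described above can be tried on its own. The following sketch, with the hypothetical names `ValidateDemo` and `isValid`, compiles only the body of the look-ahead and checks a few inputs. It illustrates the idea and is not part of the original answer.

```java
import java.util.regex.Pattern;

class ValidateDemo {
    // Body of the look-ahead from the answer's pattern, used as a standalone check:
    // zero or more repetitions of [operator][sign]digits[.digits], up to the end.
    static final Pattern VALID =
            Pattern.compile("^(?:[-+*/^]?[-+]?\\d+(?:[.]\\d+)?)*$");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = { "1+2+-3", "-3.1", "1+a", "(1+2)", "1.5.5" };
        for (String input : inputs) {
            System.out.println(input + " -> " + isValid(input));
        }
    }
}
```

The first two inputs pass and the last three are rejected. Note that this fragment alone also accepts an empty string, since the group may repeat zero times; the full pattern still rejects it because capture group 1 requires at least one character.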

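Capture Groups

  • Group 0 gets the entire string
  • Group 1 gets the entire string up to but not including the last operation; if there is no operation, group 1 holds the entire string
  • Group 2 gets the last operation, if it exists
  • Group 3 gets the number and sign after the last operation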
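As a sketch of how these groups might be consumed, the helper below applies the full pattern and returns groups 1 to 3 as an array. The names `SplitDemo` and `split` are hypothetical. One assumption to note: Java's `Matcher.group` returns null for a group that did not participate in the match, so the helper maps null to an empty string.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitDemo {
    // The answer's pattern, split across lines only for readability.
    static final Pattern EXPR = Pattern.compile(
            "^(?=(?:[-+*/^]?[-+]?\\d+(?:[.]\\d+)?)*$)"
            + "([-+]?[0-9.]+$|[-+]?[0-9.]+(?:[-+*/^][-+]?[0-9.]+)*(?=[-+*/^]))"
            + "(?:([-+*/^])([-+]?[0-9.]+))?$");

    // Returns {left, operator, right}, or null when the input is not a valid expression.
    // Groups 2 and 3 are null for a single operand, so they are mapped to "".
    static String[] split(String input) {
        Matcher m = EXPR.matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] parts = new String[3];
        for (int i = 0; i < parts.length; i++) {
            String g = m.group(i + 1);
            parts[i] = (g == null) ? "" : g;
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String input : new String[] { "1+2+-3", "-3", "1+a" }) {
            System.out.println(input + " -> " + Arrays.toString(split(input)));
        }
    }
}
```

With this helper, 1+2+-3 yields the parts 1+2, + and -3, a lone -3 yields -3 followed by two empty strings, and 1+a yields null because the validation look-ahead fails.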

NODE      EXPLANATION 
---------------------------------------------------------------------- 
^      the beginning of the string 
---------------------------------------------------------------------- 
    (?=      look ahead to see if there is: 
---------------------------------------------------------------------- 
    (?:      group, but do not capture (0 or more 
          times (matching the most amount 
          possible)): 
---------------------------------------------------------------------- 
     [-+*/^]?     any character of: '-', '+', '*', '/', 
           '^' (optional (matching the most 
           amount possible)) 
---------------------------------------------------------------------- 
     [-+]?     any character of: '-', '+' (optional 
           (matching the most amount possible)) 
---------------------------------------------------------------------- 
     \d+      digits (0-9) (1 or more times 
           (matching the most amount possible)) 
---------------------------------------------------------------------- 
     (?:      group, but do not capture (optional 
           (matching the most amount possible)): 
---------------------------------------------------------------------- 
     [.]      any character of: '.' 
---------------------------------------------------------------------- 
     \d+      digits (0-9) (1 or more times 
           (matching the most amount possible)) 
---------------------------------------------------------------------- 
    )?      end of grouping 
---------------------------------------------------------------------- 
    )*      end of grouping 
---------------------------------------------------------------------- 
    $      before an optional \n, and the end of 
          the string 
---------------------------------------------------------------------- 
)      end of look-ahead 
---------------------------------------------------------------------- 
    (      group and capture to \1: 
---------------------------------------------------------------------- 
    [-+]?     any character of: '-', '+' (optional 
          (matching the most amount possible)) 
---------------------------------------------------------------------- 
    [0-9.]+     any character of: '0' to '9', '.' (1 or 
          more times (matching the most amount 
          possible)) 
---------------------------------------------------------------------- 
    $      before an optional \n, and the end of 
          the string 
---------------------------------------------------------------------- 
    |      OR 
---------------------------------------------------------------------- 
    [-+]?     any character of: '-', '+' (optional 
          (matching the most amount possible)) 
---------------------------------------------------------------------- 
    [0-9.]+     any character of: '0' to '9', '.' (1 or 
          more times (matching the most amount 
          possible)) 
---------------------------------------------------------------------- 
    (?:      group, but do not capture (0 or more 
          times (matching the most amount 
          possible)): 
---------------------------------------------------------------------- 
     [-+*/^]     any character of: '-', '+', '*', '/', 
           '^' 
---------------------------------------------------------------------- 
     [-+]?     any character of: '-', '+' (optional 
           (matching the most amount possible)) 
---------------------------------------------------------------------- 
     [0-9.]+     any character of: '0' to '9', '.' (1 
           or more times (matching the most 
           amount possible)) 
---------------------------------------------------------------------- 
    )*      end of grouping 
---------------------------------------------------------------------- 
    (?=      look ahead to see if there is: 
---------------------------------------------------------------------- 
     [-+*/^]     any character of: '-', '+', '*', '/', 
           '^' 
---------------------------------------------------------------------- 
    )      end of look-ahead 
---------------------------------------------------------------------- 
)      end of \1 
---------------------------------------------------------------------- 
    (?:      group, but do not capture (optional 
          (matching the most amount possible)): 
---------------------------------------------------------------------- 
    (      group and capture to \2: 
---------------------------------------------------------------------- 
     [-+*/^]     any character of: '-', '+', '*', '/', 
           '^' 
---------------------------------------------------------------------- 
    )      end of \2 
---------------------------------------------------------------------- 
    (      group and capture to \3: 
---------------------------------------------------------------------- 
     [-+]?     any character of: '-', '+' (optional 
           (matching the most amount possible)) 
---------------------------------------------------------------------- 
     [0-9.]+     any character of: '0' to '9', '.' (1 
           or more times (matching the most 
           amount possible)) 
---------------------------------------------------------------------- 
    )      end of \3 
---------------------------------------------------------------------- 
)?      end of grouping 
---------------------------------------------------------------------- 
    $      before an optional \n, and the end of the 
          string 
---------------------------------------------------------------------- 

Examples

Sample text

1+2+-3 

Sample capture groups

[0] = 1+2+-3 
[1] = 1+2 
[2] = + 
[3] = -3 

Online demo: http://fiddle.re/b2w5wa

Sample text

-3 

Sample capture groups

[0] = -3 
[1] = -3 
[2] = 
[3] = 

Online demo: http://fiddle.re/07kqra

Java code example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Module1 {
    public static void main(String[] args) {
        // Sample input taken from the first example above
        String sourceString = "1+2+-3";
        // Backslashes are doubled because the regex is a Java string literal
        Pattern re = Pattern.compile("^(?=(?:[-+*/^]?[-+]?\\d+(?:[.]\\d+)?)*$)([-+]?[0-9.]+$|[-+]?[0-9.]+(?:[-+*/^][-+]?[0-9.]+)*(?=[-+*/^]))(?:([-+*/^])([-+]?[0-9.]+))?$");
        Matcher m = re.matcher(sourceString);
        int mIdx = 0;
        while (m.find()) {
            // Group 0 is the whole match; groups 2 and 3 are null when there is no operator
            for (int groupIdx = 0; groupIdx <= m.groupCount(); groupIdx++) {
                System.out.println("[" + mIdx + "][" + groupIdx + "] = " + m.group(groupIdx));
            }
            mIdx++;
        }
    }
}

Perfect, thank you. – DaveJohnston

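Actually, this doesn't match when the input is just a single operand (e.g. a single integer or decimal, positive or negative, such as -3.1). In that case I would like it to be matched in capture group 1, with groups 2 and 3 empty. Any idea whether that is possible? I can easily do it in code, with an extra step that checks for a single operand before extracting the rest, but doing it in one regex would be nice and clean. – DaveJohnston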

@DaveJohnston, sure. I've updated my answer to handle your case, where the expression may contain only one item. –