Multi-line regex group matching

I want to parse a template format using regular expressions.

Here is an example:

Type of Change:     Modify 
Metavance:      None 
AutoSys :      None 
Informatica Migration:   None 
FTP Details:     None 
Device/Server:     DWEIHPRD 
DB Objects:      Delete 
           ARC_MEDICAL_CLAIM_DETAIL_FK1 
DB Name:      DWEIHPRD 
Schema-Table(s):   UTIL 
Interface(s):      IF0515 
Reports (RAPS):    None 
Ancillary Systems:   None

Basically everything is

Field: Data (possibly multi-line, as in the DB Objects example above)

^(.+?):(.*)

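comes pretty close to doing what I want, except that it only grabs the first line of DB Objects. If I turn on dotall, the match becomes greedy across lines and all of the content ends up in the result for the first field.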
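The two behaviours described above can be reproduced with a short sketch. It is written in Python rather than VBScript, purely for illustration, and it assumes the sample text uses plain `\n` line endings; the names `SAMPLE`, `per_line` and `dotall` are made up for this example.

```python
import re

# A few lines taken from the sample in the question ('\n' line endings assumed).
SAMPLE = (
    "Device/Server:     DWEIHPRD \n"
    "DB Objects:      Delete \n"
    "           ARC_MEDICAL_CLAIM_DETAIL_FK1 \n"
    "DB Name:      DWEIHPRD \n"
)

# Line-by-line matching: '.' stops at each newline, so the indented
# continuation line (which has no colon) is never captured.
per_line = re.findall(r"^(.+?):(.*)", SAMPLE, re.MULTILINE)

# With DOTALL, '.' also matches newlines, so the greedy (.*) runs to the
# end of the text and everything lands in the first field's value.
dotall = re.findall(r"^(.+?):(.*)", SAMPLE, re.MULTILINE | re.DOTALL)

print(len(per_line), len(dotall))
```

In this sketch the first call yields one pair per field line and misses `ARC_MEDICAL_CLAIM_DETAIL_FK1`, while the second call yields a single pair whose value contains the rest of the text.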

Ideally, the extra whitespace in both the field and the data would be trimmed, but if that does not happen as part of the regex, it is not a big deal.

As an added complication, I have to make this work in Access 97 VBScript, so some of the nicer modern regex features may not be available :(

2012-04-25 Jason Coyne

Do you have to use a regex? – ant 2012-04-25 21:01:20

No. If there is some other easy way to do it in VBScript, that is fine. – 2012-04-25 21:21:20

Note: this is an ugly solution, but maybe it will help you. As @anubhava suggested, there may be a non-regex solution; I just don't know VBA well enough to say what it might be.
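For what a non-regex approach might look like, here is a minimal line-by-line sketch. It is in Python only to show the logic, not in VBA or VBScript, and `parse_fields` is a hypothetical helper that does not come from this thread. It assumes that continuation lines start with whitespace and that every field line contains a colon.

```python
def parse_fields(text):
    """Return a list of (field, value) pairs (hypothetical helper for illustration)."""
    fields = []
    for raw in text.splitlines():
        if not raw.strip():
            continue  # ignore blank lines
        if raw[0] in " \t" and fields:
            # Indented line: assumed to continue the previous field's value.
            name, value = fields[-1]
            fields[-1] = (name, (value + " " + raw.strip()).strip())
        elif ":" in raw:
            # Field line: split at the first colon and trim both sides.
            name, _, value = raw.partition(":")
            fields.append((name.strip(), value.strip()))
    return fields
```

Because each line is trimmed as it is read, the extra whitespace around field names and values is removed without any regex. The same loop structure is assumed to carry over to VBScript, but that translation is not tested here.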

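According to this article, VBScript for Microsoft Office supports lookaheads, lookbehinds and non-capturing groups (the article is dated 2009), but I would be very surprised if that support went back as far as Access 97, though I could certainly be wrong.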

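Normally I would use a lookahead and a non-capturing group for this, but I have avoided them because they are unlikely to be supported in Office 97. So note that you will simply have to ignore capture group 3 (it is only there to test for the optional end-of-line characters on a multi-line match). Note also that this will only find matches that span two lines.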

^(.+):\s+(.+)(\r\n\s+(.+))* 
Note: this has four capture groups, but you will ignore \3. Use \1, \2, and \4 (\4 will be empty for single-line matches).

Explanation:

^   # beginning of line 
(.+):  # capture one or more characters up to a colon 
\s+(.+) # skip past whitespace, then capture characters up to end of line 
(  # open a capturing group (to be thrown away. See explanation above) 
    \r\n\s+ # match EOL characters followed by whitespace (this consumes them; it is not a lookahead)
    (.+) # if we got this far, capture whatever characters come after the whitespace 
)*  # and allow this group zero or more times (you will ignore it anyway)
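To see how groups 1, 2 and 4 fit together, the pattern above can be tried in a short Python sketch. Two things here are assumptions rather than part of the answer: `\r\n` is relaxed to `\r?\n` (Python's `.` also matches `\r`, and the sample text below uses `\n` line endings), and `extract` is a hypothetical helper name. Whether the Access 97 regex engine behaves the same way is not verified here.

```python
import re

# The answer's pattern with \r\n relaxed to \r?\n (an adaptation for Python).
PATTERN = re.compile(r"^(.+):\s+(.+)(\r?\n\s+(.+))*", re.MULTILINE)

SAMPLE = (
    "Device/Server:     DWEIHPRD \n"
    "DB Objects:      Delete \n"
    "           ARC_MEDICAL_CLAIM_DETAIL_FK1 \n"
    "DB Name:      DWEIHPRD \n"
)

def extract(text):
    """Return (field, value) pairs, joining group 2 with group 4 when present."""
    pairs = []
    for m in PATTERN.finditer(text):
        parts = [m.group(2).strip()]
        if m.group(4):  # None in Python for single-line fields
            parts.append(m.group(4).strip())
        pairs.append((m.group(1).strip(), " ".join(parts)))  # group 3 is ignored
    return pairs

for field, value in extract(SAMPLE):
    print(field, "=", value)
```

In Python, a repeated capturing group keeps only its last iteration. A value spread over three or more lines would therefore come back with the first line and the final continuation line only, which is consistent with the two-line limitation mentioned above.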

2012-04-25 21:37:29 alan
