Regular expression matching too many parentheses

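I am using regular expressions in Python to extract data elements from a text file, and I have run into a problem where the pattern matches too many parentheses.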

The text is stored in a string named temp and has the form:

temp='Somethingorother School District (additional text)|other stuff here'

I am currently using

match = re.search(r'(.* School District) (\(.*\))\|?',temp)

which works great and matches

match.group(1) = Somethingorother School District 
match.group(2) = (additional text)

However, sometimes the 'other stuff' part also contains parentheses, like this:

'Somethingorother School District (additional text)|$59900000 (4.7 mills)'

so I get

match.group(2) = (additional text)|$59900000 (4.7 mills)

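I understand this happens because the * operator is greedy, but the (additional text) part is fairly idiosyncratic and I want to capture whatever is inside those parentheses. In other words, I want the match to be greedy inside those parentheses but to stop looking as soon as it matches a ). Is there a way to do this?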

2014-11-21 corbinlm

The best way is to use '[^)]*' instead of '.*'. It matches anything except a closing ')', so it stops matching when it meets the first ')'. – Tensibai 2014-11-21 15:09:11

Use a negated character class.

>>> match = re.search(r'(.* School District) (\([^()]*\))\|?',temp) 
>>> match.group(1) 
'Somethingorother School District' 
>>> match.group(2) 
'(additional text)'

[^()]* matches any character except ( or ), zero or more times.
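To check the negated character class against both inputs from the question, here is a minimal sketch. The helper name extract and the samples list are illustrative assumptions and do not appear in the original post; the pattern itself is the one from the answer above.

```python
import re

# Pattern from the answer: the second group may not contain ( or ),
# so it has to end at the first closing parenthesis.
PATTERN = re.compile(r'(.* School District) (\([^()]*\))\|?')

def extract(temp):
    """Return (district, parenthesized text), or None if there is no match."""
    match = PATTERN.search(temp)
    if match is None:
        return None
    return match.group(1), match.group(2)

# The two input strings quoted in the question.
samples = [
    'Somethingorother School District (additional text)|other stuff here',
    'Somethingorother School District (additional text)|$59900000 (4.7 mills)',
]

for temp in samples:
    print(extract(temp))
```

Both strings should give the same pair, because the second group can no longer run past the first ).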


2014-11-21 15:09:33

Demo here: http://regex101.com/r/pO2sU3/1 (just in case it helps). – Tensibai 2014-11-21 15:11:43

Thanks for the demo link. – 2014-11-21 15:13:38

Make the last parenthesized group non-greedy.
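This answer gives no code, so the following is only a sketch of what a non-greedy version might look like: it assumes the intent is to replace .* inside the escaped parentheses with the lazy .*?. The variable names lazy, nested, lazy_nested and negated_nested are illustrative.

```python
import re

temp = 'Somethingorother School District (additional text)|$59900000 (4.7 mills)'

# Lazy quantifier: .*? takes as few characters as possible,
# so the group closes at the first ) it can reach.
lazy = re.search(r'(.* School District) (\(.*?\))\|?', temp)
print(lazy.group(2))

# The two approaches differ when the parenthesized text is nested.
nested = 'Somethingorother School District (a (b) c)|rest'
lazy_nested = re.search(r'(.* School District) (\(.*?\))\|?', nested)
negated_nested = re.search(r'(.* School District) (\([^()]*\))\|?', nested)
print(lazy_nested.group(2))   # cut off at the inner closing parenthesis
print(negated_nested)         # no match at all
```

For the strings in the question both approaches return the same text. With nested parentheses, neither captures the full outer group: the lazy version stops at the inner ), while the negated class refuses to match.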

2014-11-21 15:09:33 KeAn
