Strange behavior of Python re.findall

>>> import re
>>> text =\
... """xyxyxy testmatch0 
... xyxyxy testmatch1 
... xyxyxy 
... whyisthismatched1 
... xyxyxy testmatch2 
... xyxyxy testmatch3 
... xyxyxy 
... whyisthismatched2 
... """ 
>>> re.findall("^\s*xyxyxy\s+([a-z0-9]+).*$", text, re.MULTILINE) 
[u'testmatch0', u'testmatch1', u'whyisthismatched1', u'testmatch2', u'testmatch3', u'whyisthismatched2']

My expectation was that the lines containing "whyisthismatched" would not be matched.

Python's re documentation states the following:

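(Dot.) In the default mode, this matches any character except a newline. If the DOTALL flag has been specified, this matches any character including a newline.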
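The quoted behavior of the dot can be checked in isolation. The following is only a minimal sketch, assuming Python 3 (so strings print without the u prefix); the `sample` string is made up for illustration and is not part of the original data.

```python
import re

# Hypothetical two-line input, used only for this illustration.
sample = "first line\nsecond line"

# Default mode: the dot does not match "\n", so .* stops at the end of line one.
default_match = re.match(r".*", sample).group(0)

# With re.DOTALL the dot also matches "\n", so .* runs to the end of the string.
dotall_match = re.match(r".*", sample, re.DOTALL).group(0)

print(repr(default_match))
print(repr(dotall_match))
```

Without DOTALL, then, a trailing `.*$` stays on a single line.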

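My question is whether this is really the expected behavior or a bug. If it is expected, could someone please explain why these lines are matched, and how I should modify my pattern to get the behavior I expect: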

[u'testmatch0', u'testmatch1', u'testmatch2', u'testmatch3']


2013-04-09 ZergRush

Newlines can be included in \s with re.MULTILINE ... I think, at least – 2013-04-09 16:37:35

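Newlines are whitespace too, as far as the \s character class is concerned. If you want to match spaces only, you need to match [ ] instead: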

>>> re.findall("^\s*xyxyxy[ ]+([a-z0-9]+).*$", text, re.MULTILINE) 
[u'testmatch0', u'testmatch1', u'testmatch2', u'testmatch3']
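To see why the original pattern matched, it may help to confirm directly that `\s` accepts a newline and to compare a few separators that do not. This is a rough sketch rather than a definitive recipe, assuming Python 3; `text` is the sample from the question (without trailing spaces), while `SEPARATORS` and `matches` are names invented here for illustration.

```python
import re

text = """xyxyxy testmatch0
xyxyxy testmatch1
xyxyxy
whyisthismatched1
xyxyxy testmatch2
xyxyxy testmatch3
xyxyxy
whyisthismatched2
"""

# \s matches a newline, so \s+ can run from the end of one line onto the next.
assert re.match(r"\s", "\n") is not None

# Hypothetical table of candidate separators to place after "xyxyxy".
SEPARATORS = {
    "any whitespace": r"\s+",             # original pattern: crosses line breaks
    "space only": r"[ ]+",                # the fix suggested in the answer
    "space or tab": r"[ \t]+",            # also tolerates tabs
    "whitespace but not \\n": r"[^\S\n]+",  # any whitespace except a newline
}

def matches(separator):
    """Run the question's pattern with the given separator after 'xyxyxy'."""
    pattern = r"^\s*xyxyxy" + separator + r"([a-z0-9]+).*$"
    return re.findall(pattern, text, re.MULTILINE)

for label, separator in SEPARATORS.items():
    print(label, matches(separator))
```

Which separator to prefer depends on the input: `[ ]+` is the narrowest, while `[^\S\n]+` keeps the breadth of `\s` but stays on one line.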


2013-04-09 16:37:12

Bah, you're fast :P As always (and thanks regarding my answer :)) – 2013-04-09 16:39:08

I just realized that. Thanks for your quick help. – ZergRush 2013-04-09 16:39:57
