Python: extracting pattern matches

Python 2.7.1. I am trying to use Python regular expressions to extract a word from inside a pattern.

I have a string that looks like this:

someline abc 
someother line 
name my_user_name is valid 
some more lines

I want to extract the word "my_user_name". I do something like:

import re 
s = #that big string 
p = re.compile("name .* is valid", re.flags) 
p.match(s) #this gives me <_sre.SRE_Match object at 0x026B6838>

Now how do I extract my_user_name?

2013-03-11 Kannan Ekanath

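You need a capture group in the regex. Use search() to look for the pattern and, if it is found, retrieve the string with group(index). Assuming the match has been checked for validity: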

>>> p = re.compile("name (.*) is valid") 
>>> p.search(s) # The result of this is referenced by variable name '_' 
<_sre.SRE_Match object at 0x10555e738> 
>>> _.group(1)  # group(1) will return the 1st capture. 
'my_user_name'

2013-03-11 14:09:16 SuperSaiyan

Are you sure it shouldn't be 'group(0)' for the first match? – sharshofski 2015-04-16 14:04:34

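A bit late, but yes and no. 'group(0)' returns the whole matched text, not the first capture group. The code comment is correct; you seem to be confusing capture groups with matches. 'group(1)' returns the first capture group. – andrewgu 2015-08-07 01:31:48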
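To make the distinction in the comment above concrete, here is a minimal sketch. It assumes the sample text from the question (without trailing spaces) and uses only `re.search`, `group()` and `groups()` from the standard library; the variable names are illustrative.

```python
import re

# Sample text modelled on the string in the question.
s = """someline abc
someother line
name my_user_name is valid
some more lines"""

m = re.search(r"name (.*) is valid", s)

whole_match = m.group(0)    # the entire matched text; same as m.group()
first_capture = m.group(1)  # only what the first pair of parentheses matched

print(whole_match)
print(first_capture)
```

Here `group(0)` is the full text matched by the pattern, while `group(1)` is just the part inside the parentheses.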

You can use a matching group:

p = re.compile('name (.*) is valid')

For example:

>>> import re 
>>> p = re.compile('name (.*) is valid') 
>>> s = """ 
... someline abc 
... someother line 
... name my_user_name is valid 
... some more lines""" 
>>> p.findall(s) 
['my_user_name']

Here I use re.findall rather than re.search to get all instances of my_user_name. With re.search, you need to get the data from the group on the match object:

>>> p.search(s) #gives a match object or None if no match is found 
<_sre.SRE_Match object at 0xf5c60> 
>>> p.search(s).group() #entire string that matched 
'name my_user_name is valid' 
>>> p.search(s).group(1) #first group that match in the string that matched 
'my_user_name'

As mentioned in the comments, you might want to make your regex non-greedy:

p = re.compile('name (.*?) is valid')

so that it only picks up the text between 'name ' and the next ' is valid' (rather than allowing the regex to pick up a later ' is valid' inside your group).
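A small sketch of the difference, assuming a single-line input that contains ' is valid' more than once (the test string is borrowed from the comments on this answer):

```python
import re

text = "name jon clements is valid is valid is valid"

# Greedy: (.*) grabs as much as it can, so the group runs up to the LAST ' is valid'.
greedy = re.findall(r"name (.*) is valid", text)

# Non-greedy: (.*?) grabs as little as it can, so the group stops at the FIRST ' is valid'.
lazy = re.findall(r"name (.*?) is valid", text)

print(greedy)  # ['jon clements is valid is valid']
print(lazy)    # ['jon clements']
```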

2013-03-11 14:08:05 mgilson

A non-greedy match may be required here... (unless a user name can be multiple words...) – 2013-03-11 14:10:19

@JonClements - You mean '(.*?)'? Yes, that is possible, although it isn't necessary unless 're.DOTALL' is in use – mgilson 2013-03-11 14:11:51

Yeah - 're.findall('name (.*) is valid', 'name jon clements is valid is valid is valid')' probably won't yield the desired results... – 2013-03-11 14:13:22
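The point about 're.DOTALL' can be illustrated with a short sketch. The two-line input below is made up for illustration. Without `re.DOTALL`, `.` does not match a newline, so even a greedy group stays on one line; with it, a greedy group can run across lines.

```python
import re

# Hypothetical input with two matching lines.
s = "name alice is valid\nname bob is valid"

# Default flags: '.' stops at the newline, so each line yields its own match.
per_line = re.findall(r"name (.*) is valid", s)

# re.DOTALL: '.' also matches '\n', so the greedy group swallows everything
# up to the last ' is valid' in the whole string.
greedy_dotall = re.findall(r"name (.*) is valid", s, re.DOTALL)

# re.DOTALL plus a non-greedy group restores one match per line.
lazy_dotall = re.findall(r"name (.*?) is valid", s, re.DOTALL)

print(per_line)       # ['alice', 'bob']
print(greedy_dotall)  # ['alice is valid\nname bob']
print(lazy_dotall)    # ['alice', 'bob']
```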

You want a capture group.

p = re.compile("name (.*) is valid")  # parentheses create a capture group
print(p.search(s).groups())  # search() scans the whole string; groups() returns a tuple of the captures

2013-03-11 14:10:40

You could use something like this:

import re
# the sample text from the question stands in for "that big string"
s = "someline abc\nsomeother line\nname my_user_name is valid\nsome more lines"
# the parentheses create a group holding what was matched,
# and '\w' matches only alphanumeric characters and underscores
p = re.compile(r"name +(\w+) +is valid")
# use search(), so the match doesn't have to happen
# at the beginning of the big string
m = p.search(s)
# search() returns a Match object with information about what was matched
if m:
    name = m.group(1)
else:
    raise Exception('name not found')
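The remark about search() versus match() can be checked with a minimal sketch, assuming the sample text from the question, where the "name ..." line is not the first line:

```python
import re

s = "someline abc\nsomeother line\nname my_user_name is valid\nsome more lines"
p = re.compile(r"name +(\w+) +is valid")

# match() only tries the pattern at the very start of the string,
# so it finds nothing here because the string starts with "someline".
at_start = p.match(s)

# search() scans forward and finds the pattern wherever it occurs.
anywhere = p.search(s)

print(at_start)           # None
print(anywhere.group(1))  # my_user_name
```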

2013-03-11 14:11:48 Apalala

Maybe this is a bit shorter and easier to understand:

>>> import re
>>> text = '... someline abc... someother line... name my_user_name is valid.. some more lines'
>>> re.search('name (.*) is valid', text).group(1)
'my_user_name'
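One caveat with the one-liner: `re.search` returns `None` when nothing matches, so chaining `.group(1)` directly raises `AttributeError` in that case. A possible guard is sketched below. The helper name `extract_name` and its `default` parameter are hypothetical, not part of the `re` module, and the non-greedy group is an assumption carried over from the earlier answer.

```python
import re

def extract_name(text, default=None):
    """Return the text between 'name ' and ' is valid', or `default` if absent."""
    m = re.search(r"name (.*?) is valid", text)
    return m.group(1) if m else default

found = extract_name("... name my_user_name is valid.. some more lines")
missing = extract_name("no user line here")

print(found)    # my_user_name
print(missing)  # None
```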

2017-04-19 14:59:56 John
