2014-04-02 28 views
s = "LEV606 (P), LEV230 (P)" 
#Expected result: ['LEV606', 'LEV230'] 

# First attempt 
In [3]: re.findall(r"[A-Z]{3}[0-9]{3}[ \(P\)]?", s) 
Out[3]: ['LEV606 ', 'LEV230 '] 

# Second attempt. The 'P' is not mandatory; it can be any other letter.
# Why doesn't this work?
In [4]: re.findall(r"[A-Z]{3}[0-9]{3}[ \([A-Z]{1}\)]?", s) 
Out[4]: [] 

# Third attempt 
# The whitespace is still there. Why? I want to remove it from the result.
In [5]: re.findall(r"[A-Z]{3}[0-9]{3}[\s\(\w\)]?", s) 
Out[5]: ['LEV606 ', 'LEV230 '] 

Answer


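You are using the [...] syntax incorrectly; that is a character class, a set of characters of which exactly one has to match. Any single character listed in the class satisfies it, so a space, a ( character, a P, or a ) will each do; in your input the space matches just fine, and that is why it ends up in the result.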
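To make the character-class behaviour concrete, here is a small sketch (the names `char_class` and `matches` are only for illustration). It checks each character of a sample string against the class from the first attempt, and shows that the class never matches the four-character sequence " (P)" as a whole:

```python
import re

# The bracket expression from the first attempt, on its own.
char_class = re.compile(r"[ \(P\)]")

# Each listed character matches individually...
matches = [ch for ch in " (P)x" if char_class.fullmatch(ch)]
print(matches)  # [' ', '(', 'P', ')']

# ...but the class consumes only ONE character, so it cannot
# match the whole sequence " (P)".
print(char_class.fullmatch(" (P)"))  # None
```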

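Use a non-capturing group instead of a character class to make the extra text optional, and put a capturing group around the part you actually want to keep: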

re.findall(r"([A-Z]{3}[0-9]{3})(?: \(P\))?", s) 

Demo:

>>> import re 
>>> s = "LEV606 (P), LEV230 (P)" 
>>> re.findall(r"([A-Z]{3}[0-9]{3})(?: \(P\))?", s) 
['LEV606', 'LEV230']
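As for the second attempt: as far as I can tell, the first `]` closes the character class early, so the pattern is read as the class `[ \([A-Z]`, followed by a literal `)`, followed by an optional literal `]`. It therefore needs a `)` right after the space, which never occurs in the input, hence the empty list. If the letter in parentheses can vary, the same non-capturing group works with `[A-Z]` in place of `P`. A minimal sketch, assuming the suffix is always a single uppercase letter (the second sample string is made up for illustration):

```python
import re

s = "LEV606 (P), LEV230 (P)"

# Capture the code; optionally consume " (X)", where X is any
# single uppercase letter, without including it in the result.
pattern = re.compile(r"([A-Z]{3}[0-9]{3})(?: \([A-Z]\))?")

codes = pattern.findall(s)
print(codes)  # ['LEV606', 'LEV230']

# Hypothetical input: a different letter, and a code with no suffix.
other = pattern.findall("ABC123 (Q), XYZ789")
print(other)  # ['ABC123', 'XYZ789']
```

Because the pattern contains exactly one capturing group, `re.findall` returns only that group's text, so the optional suffix never shows up in the output.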