2013-02-25 183 views
0

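Python regular expression matching on a file

I want to extract certain text from a file using regular expressions in Python, but I can't get it to work. My input file looks like this: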

List/VB 
[ the/DT flights/NNS ] 
from/IN 

and I want the output to be:

List VB 
the DT 
flights NNS 
from IN 

I wrote the following code:

import re 

with open("in.txt",'r') as infile, open("out.txt",'w') as outfile: 
    for line in infile:
        if (re.match(r'(?:[\s)?(\w\\\w)',line)):
            outfile.write(line)

Answers

2

With the sample data you provided:

>>> import re
>>> data = """List/VB
... [ the/DT flights/NNS ]
... from/IN"""

>>> # Group 0 is the whole token, group 1 the word, group 2 the tag.
>>> expr = re.compile(r"((\w+)/(\w+))", re.M)
>>> for el in expr.findall(data):
...     print(el[1], el[2])
...
List VB
the DT
flights NNS
from IN
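
To tie this back to the files in the question, here is a minimal sketch that reads in.txt line by line and writes one "word TAG" pair per line to out.txt. The names PAIR, tagged_lines and convert are hypothetical helpers introduced only for illustration, and the pattern assumes that both the word and the tag consist solely of word characters (letters, digits, underscore); tags containing punctuation would need a wider pattern.

```python
import re

# Hypothetical pattern: a word, a slash, then a tag (word characters only).
PAIR = re.compile(r"(\w+)/(\w+)")


def tagged_lines(lines):
    """Yield 'word TAG' strings for every word/TAG token in the given lines."""
    for line in lines:
        # With two groups, findall returns a list of (word, tag) tuples.
        for word, tag in PAIR.findall(line):
            yield f"{word} {tag}"


def convert(in_path, out_path):
    """Read in_path and write one 'word TAG' pair per line to out_path."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for pair in tagged_lines(infile):
            outfile.write(pair + "\n")
```

Calling convert("in.txt", "out.txt") would then produce the output file the question asks for. Brackets and spaces are skipped because they never match the pattern.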
+0

My output is printed as an array. How do I turn it into a string? – 2013-02-26 00:47:34

+0

Do you mean that you want to combine el[1] and el[2] into a single string? In that case you can do s = "%s%s" % el[1:3] – 2013-02-26 01:11:28

0
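import re

# Group 0 is the whole token, group 1 the word, group 2 the tag.
expr = re.compile(r"((\w+)/(\w+))", re.M)

# Read the whole file; the with block closes it automatically.
with open("file_list.txt", 'r') as fp:
    lines = fp.read()

# Each match is a tuple (token, word, tag); join word and tag with a space.
for el in expr.findall(lines):
    print(' '.join(el[1:]))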

Output:

List VB 
the DT 
flights NNS 
from IN 
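
On the question in the comments about getting a string instead of an "array": when a pattern has more than one group, re.findall returns a list of tuples, which is why each match has to be joined or formatted. The sketch below shows one alternative using re.finditer with named groups. The names TOKEN and format_tokens and the group names word and tag are assumptions chosen for illustration, and the pattern again assumes tags made of word characters only.

```python
import re

# Named groups are an illustrative choice, not part of the original answers.
TOKEN = re.compile(r"(?P<word>\w+)/(?P<tag>\w+)")


def format_tokens(text):
    """Return a single string with one 'word TAG' pair per line."""
    return "\n".join(
        f"{m.group('word')} {m.group('tag')}" for m in TOKEN.finditer(text)
    )


sample = "List/VB\n[ the/DT flights/NNS ]\nfrom/IN"
print(format_tokens(sample))
```

The result is one string, so it can be printed or written to a file in a single call.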
+0

You should elaborate on your answer. – Beppe 2013-09-21 21:29:00