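Regex to capture '/etc/services'

I want to capture some information from the /etc/services file on my UNIX machine, but I am capturing the wrong values and making it overly complicated at the same time.

This is what I have now: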

import re

with open('/etc/services') as ports_file:
    lines = ports_file.readlines()
    for line in lines:
        # The outermost group is what adds the unwanted first tuple element.
        print(re.findall(r'((\w*\-*\w+)+\W+(\d+)\/(tcp|udp))', line))

But it produces incorrect values like these:

[('dircproxy\t57000/tcp', 'dircproxy', '57000', 'tcp')] 
[('tfido\t\t60177/tcp', 'tfido', '60177', 'tcp')] 
[('fido\t\t60179/tcp', 'fido', '60179', 'tcp')]

I want it to look like this:

[('dircproxy', '57000', 'tcp')] 
[('tfido', '60177', 'tcp')] 
[('fido', '60179', 'tcp')]

I think my regex needs (\w*\-*\w+)+ because some names are defined like this-should-capture.

Source

2017-10-12 Ludisposed

Remove the outer parentheses. –

@WiktorStribiżew Sorry, I suck at regex. Thanks a lot – Ludisposed

Is there any particular reason to use a regex here? This looks more like a job for 'split()'. –
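The suggestion to remove the outer parentheses can be sketched as below. re.findall returns one tuple element per capturing group, so the outermost group is what adds the full match as an extra first element. Note that a repeated group such as (\w*\-*\w+)+ keeps only its last repetition, so a hyphenated name would likely come back truncated (for example '-capture'). The second pattern, PATTERN_SIMPLE, is an assumption that does not appear in the thread: it uses a single character class for the name. The 1234/udp line is made-up sample data.

```python
import re

# The original pattern with only the outer parentheses removed.
PATTERN_NO_OUTER = re.compile(r'(\w*\-*\w+)+\W+(\d+)\/(tcp|udp)')

# Hypothetical simplification (not from the thread): one character class
# covering word characters and hyphens for the service name.
PATTERN_SIMPLE = re.compile(r'([\w-]+)\s+(\d+)/(tcp|udp)')

sample = 'dircproxy\t57000/tcp'
hyphen_sample = 'this-should-capture\t1234/udp'  # made-up entry

no_outer = PATTERN_NO_OUTER.findall(sample)
simple = PATTERN_SIMPLE.findall(sample)
no_outer_hyphen = PATTERN_NO_OUTER.findall(hyphen_sample)
simple_hyphen = PATTERN_SIMPLE.findall(hyphen_sample)

print(no_outer)
print(simple)
print(no_outer_hyphen)
print(simple_hyphen)
```

Real /etc/services files can contain other protocols and name characters, so both patterns should be treated as starting points, not as a complete parser.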

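I would suggest looking at this from a different angle: instead of matching the field values, match the separators between them.

print(re.split(r'[\s/]+', line.split('#', 1)[0])[:3])

The leading line.split('#', 1)[0] strips comments (anything after the first # on the line).

Source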
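As a minimal sketch of how that one-liner could be wrapped for a whole file, the helper below strips the comment, splits on the separators, and skips lines that do not yield three fields. The name parse_line, the extra strip() call, and the None return for blank or comment-only lines are assumptions added for illustration, not part of the answer.

```python
import re

def parse_line(line):
    """Return (name, port, protocol) for one line, or None if it has no entry."""
    # Drop everything after the first '#', then trim surrounding whitespace
    # so the split does not produce empty leading or trailing fields.
    content = line.split('#', 1)[0].strip()
    # Split on runs of whitespace and '/' (the separators between fields).
    fields = re.split(r'[\s/]+', content)
    if len(fields) < 3:
        return None
    return tuple(fields[:3])

entry = parse_line('dircproxy\t57000/tcp\t\t# sample comment\n')
comment_only = parse_line('# just a comment\n')
blank = parse_line('\n')
print(entry, comment_only, blank)
```

Splitting on separators accepts any protocol name, so unlike the regex in the question it does not filter for tcp or udp.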

2017-10-12 19:57:59

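Personally, I would not use a regex here. Have a look at the solution below and see whether it fits your needs (also note that you can iterate over the file object directly):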

services = []
with open('/etc/services') as serv:
    for line in serv:
        l = line.split()
        if len(l) < 2:
            continue
        if '/tcp' in l[1] or '/udp' in l[1]:
            port, protocol = l[1].split('/')
            services.append((l[0], port, protocol))

Source
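The same idea can be sketched as a generator that accepts any iterable of lines, which makes it easy to try on in-memory data before pointing it at the real file. The name iter_services, the comment stripping, and the use of str.partition are assumptions for illustration. The comment text in the sample lines is made up.

```python
def iter_services(lines):
    """Yield (name, port, protocol) tuples from an iterable of lines."""
    for line in lines:
        # Ignore comments, then split the rest on whitespace.
        fields = line.split('#', 1)[0].split()
        if len(fields) < 2 or '/' not in fields[1]:
            continue
        port, _, protocol = fields[1].partition('/')
        if protocol in ('tcp', 'udp'):
            yield fields[0], port, protocol

# Made-up sample lines in the style of /etc/services.
sample_lines = [
    '# sample header comment\n',
    '\n',
    'dircproxy\t57000/tcp\t\t# sample comment\n',
    'tfido\t\t60177/tcp\n',
    'fido\t\t60179/tcp\n',
]
services = list(iter_services(sample_lines))
print(services)
```

Because an open file object is itself an iterable of lines, the same function could be called as iter_services(serv) inside the with block.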

2017-10-12 20:06:05 iknownothing
