2017-02-03 60 views
0

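Splitting lines that end with '\n' and '\n\n'

Please forgive me if this isn't the appropriate place to ask a question like this, but I'm having the hardest time coming up with a workable way to split some text.

Here is a sample of the text I'm trying to split: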

[Thu Feb 2 12:45:38 2017][428423.3] (file_name:0xcb61) Invalid variable type 
call stack: 
----------- 
[0cb61:+33] larray, r#26, fp(3), 
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1) 
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers() 
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process() 
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0) 
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2) 
[1e24a:main+9664] eop, -, -, 

[Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive 
[Thu Feb 2 18:55:27 2017][449547.25] Warning: writing 0 byte file (/the_directory/) to tar archive 
[Fri Feb 3 12:21:33 2017][451135.3] (file_name:0xcb61) Invalid variable type 
call stack: 
----------- 
[0cb61:+33] larray, r#26, fp(3), 
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1) 
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers() 
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process() 
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0) 
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2) 
[1e24a:main+9664] eop, -, -, 

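As you can see above, the text doesn't really fit any kind of pattern: some errors are followed by a blank line and some aren't. Ideally, I'd like to end up with something like this...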

[[Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive], [Thu Feb 2 12:45:38 2017][428423.3] (file_name:0xcb61) Invalid variable type \ncall stack:\n-----------\n[0cb61:+33] larray, r#26, fp(3),\n[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)\n[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()\n[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()\n[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)\n[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2)\n[1e24a:main+9664] eop, -, -,] 

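Then I could loop over each error. Right now I'm using some regexes to pull out the useful data and then throwing the call stack away, but I'd like to be able to store the entire call stack if possible.

Here is my current code: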

import re

# Compile the patterns once, outside the loop.
filename_pattern = re.compile(r'\((\w*\.\w*):\w*\)\s(.*$)')
# [a-zA-Z] replaces the original [a-zA-z]; \s+ tolerates one or two spaces before the day.
date_pattern = re.compile(r"^\[([a-zA-Z]{3,})\s([a-zA-Z]{3,})\s+(\d{1,2})\s(\d{1,2}:\d{1,2}:\d{1,2})\s(\d{4})\]\[\d*\.\d*\]\s(.*$)")

with open(local_dump, 'r') as ifile:
    for line in ifile:
        data = date_pattern.search(line)
        if data:
            # Group 6 is the message text after the two bracketed fields.
            file_match = filename_pattern.search(data.group(6))
            if file_match:
                print("{0}: {1}".format(file_match.group(1), file_match.group(2)))
        elif "call stack" in line:
            print(line.strip())

I was able to get this nearly working with this code block:

with open(local_dump, 'r') as ifile:
    lines = ifile.read()
    for line in lines.split('\n\n'):
        print("LINE: " + line)

The code above does print each call stack as its own entry, but I run into problems when lines end with a single '\n':

LINE: [Thu Feb 2 12:45:38 2017][428423.3] (file_name:0xcb61) Invalid variable type 
call stack: 
----------- 
[0cb61:+33] larray, r#26, fp(3), 
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1) 
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers() 
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process() 
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0) 
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2) 
[1e24a:main+9664] eop, -, -, 
LINE: [Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive 
[Thu Feb 2 18:55:27 2017][449547.25] Warning: writing 0 byte file (/the_directory/) to tar archive 
[Fri Feb 3 12:21:33 2017][451135.3] (file_name:0xcb61) Invalid variable type 
call stack: 
----------- 
[0cb61:+33] larray, r#26, fp(3), 
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1) 
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers() 
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process() 
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0) 
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2) 
[1e24a:main+9664] eop, -, -, 

Here is how the text looks in a more raw format:

'[Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive \n[Thu Feb 2 18:55:27 2017][449547.25] Warning: writing 0 byte file (/the_directory/) to tar archive \n[Fri Feb 3 12:21:33 2017][451135.3] (file_name:0xcb61) Invalid variable type \ncall stack:\n-----------\n[0cb61:+33] larray, r#26, fp(3), \n[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)\n[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()\n[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()\n[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)\n[14a5b:+4] refcall, fp(-2), sting#3103, # from: fp(-2)\n[1e24a:main+9664] eop, -, -, ' 

Thanks for any tips, tricks, and help you're able to provide.


Have you tried 'lines.split("\n")'? – Peter


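@PeterKuebler - I have, sorry, I rushed writing this up. The problem with that is that the call stack is always interspersed with '\n', which makes a bit of a mess. –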


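Ah, I didn't realize you wanted to keep those... Instead of the 'for' loop, try escaping the \ by calling 'lines.replace('\\','\\\\')' and then 'print(lines)' without the for loop. – Peter

Answers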

2

You can split on \n and then remove the empty lines.

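text = "your input"  # the full log contents as a single string
lines = text.split("\n")
lines = [line for line in lines if line]  # drop empty lines

If you just want all of the error messages from the log, you can try:

import re

# Both bracketed groups are matched lazily so a ']' inside the message is not swallowed.
matches = re.finditer(r"\[.*?\]\[.*?\]\s*(.*)$", text, re.MULTILINE)
for match in matches:
    print("Error: " + match.group(1))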

This assumes every error is preceded by two [...] groups.
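To keep each call stack attached to the error that produced it, one option is to start a new record only when a line looks like a timestamp header and to append every other non-blank line to the current record. The following is a minimal sketch, not a tested solution for the real log: the name `group_records` and the `HEADER` pattern are made up for illustration, and the pattern assumes every record begins with a header shaped like the ones in the question's sample.

```python
import re

# Assumption: each record starts with "[Day Mon D HH:MM:SS YYYY][n.n]",
# as in the sample log. Call-stack frames such as "[0cb61:+33]" do not match.
HEADER = re.compile(
    r"^\[\w{3} \w{3}\s+\d{1,2} \d{1,2}:\d{2}:\d{2} \d{4}\]\[\d+\.\d+\]"
)


def group_records(text):
    """Split log text into records, keeping call-stack lines with their header."""
    records = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # blank separator lines carry no information
        if HEADER.match(line) or not records:
            records.append(line)  # a header line opens a new record
        else:
            records[-1] += "\n" + line  # anything else continues the current one
    return records
```

Because blank lines are simply skipped, it no longer matters whether a record is followed by '\n' or '\n\n'. Something like 'for record in group_records(ifile.read()):' would then visit one error at a time.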


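Although this didn't give me the exact "call stack" errors, it's a great response and helped me improve things substantially. Marking this as the answer, thanks! –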
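If each error is available as one multi-line string (a header line followed by its call-stack lines, as in the desired output in the question), the pieces can be pulled apart roughly as sketched below. This is only an illustration under assumptions: `parse_record` and the dictionary keys are hypothetical names, the question does not say what the second bracketed field means (so `id` is a placeholder), and the '%a %b' parsing assumes English day and month names.

```python
import re
from datetime import datetime

# Assumed header layout: "[Thu Feb 2 12:45:38 2017][428423.3] message"
HEADER_FIELDS = re.compile(r"^\[(.*?)\]\[(.*?)\]\s*(.*)$")


def parse_record(record):
    """Turn one multi-line record into a dict with its time, message and stack."""
    header, *rest = record.split("\n")
    match = HEADER_FIELDS.match(header.rstrip())
    if match is None:
        raise ValueError("not a log header: " + header)
    stamp, ident, message = match.groups()
    return {
        # Collapse runs of spaces so "Feb  2" and "Feb 2" parse the same way.
        "time": datetime.strptime(" ".join(stamp.split()), "%a %b %d %H:%M:%S %Y"),
        "id": ident,  # placeholder name; the field's meaning is not given
        "message": message,
        # Keep only the frame lines, which start with "[" in the sample.
        "stack": [line.strip() for line in rest if line.strip().startswith("[")],
    }
```

Warnings without a call stack simply come back with an empty 'stack' list, so both kinds of record can be handled in the same loop.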