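Forgive me if this isn't the appropriate place to ask a question like this, but I'm having the hardest time coming up with a workable way to split some text: some entries end with '\n' and others end with '\n\n'.
Here is a sample of the text I'm trying to split: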
[Thu Feb 2 12:45:38 2017][428423.3] (file_name:0xcb61) Invalid variable type
call stack:
-----------
[0cb61:+33] larray, r#26, fp(3),
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2)
[1e24a:main+9664] eop, -, -,
[Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive
[Thu Feb 2 18:55:27 2017][449547.25] Warning: writing 0 byte file (/the_directory/) to tar archive
[Fri Feb 3 12:21:33 2017][451135.3] (file_name:0xcb61) Invalid variable type
call stack:
-----------
[0cb61:+33] larray, r#26, fp(3),
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2)
[1e24a:main+9664] eop, -, -,
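As you can see above, the text doesn't really fit any single pattern: some errors are followed by a blank line and some are not. Ideally, I'd like to end up with something like this...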
[[Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive], [Thu Feb 2 12:45:38 2017][428423.3] (file_name:0xcb61) Invalid variable type \ncall stack:\n-----------\n[0cb61:+33] larray, r#26, fp(3),\n[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)\n[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()\n[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()\n[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)\n[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2)\n[1e24a:main+9664] eop, -, -,]
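Then I could loop over the errors one at a time. Right now I'm using a few regexes to pull out the data I need and throwing the call stack away, but I'd like to be able to store the whole call stack if possible.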
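To make the desired result concrete, here is a minimal sketch of that grouping. It assumes every new entry starts with a bracketed timestamp such as `[Thu Feb 2 12:45:38 2017]` and that anything else (call stack lines, separators) belongs to the entry before it. The names `ENTRY_START` and `group_entries` are made up for illustration.

```python
import re

# Assumption: a new entry always begins with "[Day Mon D HH:MM:SS YYYY]".
# Call stack lines such as "[0cb61:+33]" do not match this pattern.
ENTRY_START = re.compile(
    r"^\[[A-Za-z]{3} [A-Za-z]{3}\s+\d{1,2} \d{1,2}:\d{2}:\d{2} \d{4}\]"
)

def group_entries(lines):
    """Group an iterable of lines into one string per log entry."""
    entries = []
    current = []
    for raw in lines:
        line = raw.rstrip()
        if not line:
            continue  # blank separator lines carry no information
        if ENTRY_START.match(line) and current:
            # A new timestamp closes the entry collected so far.
            entries.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        entries.append("\n".join(current))
    return entries
```

Since a file object is an iterable of lines, something like `group_entries(ifile)` inside the `with open(...)` block should work without reading the whole file into one string first.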
Here is my current code:
import re

# Compile the patterns once, outside the loop.
# [a-zA-Z] replaces the original [a-zA-z], which also matched "[", "\", "]", "^", "_" and "`".
# \s+ before the day accepts one or two spaces (the original required exactly two).
filename_pattern = re.compile(r'\((\w*\.\w*):\w*\)\s(.*$)')
date_pattern = re.compile(r"^\[([a-zA-Z]{3,})\s([a-zA-Z]{3,})\s+(\d{1,2})\s(\d{1,2}:\d{1,2}:\d{1,2})\s(\d{4})\]\[\d*\.\d*\]\s(.*$)")

with open(local_dump, 'r') as ifile:
    for line in ifile:
        data = date_pattern.search(line)
        if data:
            # Group 6 is the message text that follows the timestamp and id.
            file_match = filename_pattern.search(data.group(6))
            if file_match:
                print("{0}: {1}".format(file_match.group(1), file_match.group(2)))
        elif "call stack" in line:
            print(line.strip())
I was able to get something close to working with this code block:
with open(local_dump, 'r') as ifile:
    lines = ifile.read()
    for line in lines.split('\n\n'):
        print("LINE: " + line)
The code above does print a call stack as its own entry, but I run into problems when an entry ends with a single '\n':
LINE: [Thu Feb 2 12:45:38 2017][428423.3] (file_name:0xcb61) Invalid variable type
call stack:
-----------
[0cb61:+33] larray, r#26, fp(3),
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2)
[1e24a:main+9664] eop, -, -,
LINE: [Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive
[Thu Feb 2 18:55:27 2017][449547.25] Warning: writing 0 byte file (/the_directory/) to tar archive
[Fri Feb 3 12:21:33 2017][451135.3] (file_name:0xcb61) Invalid variable type
call stack:
-----------
[0cb61:+33] larray, r#26, fp(3),
[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)
[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()
[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()
[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)
[14a5b:+4] refcall, fp(-2), string#3103, # from: fp(-2)
[1e24a:main+9664] eop, -, -,
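The second "LINE:" above comes out merged because `split('\n\n')` only cuts at blank lines, and the two warnings are separated by a single '\n'. One possible alternative, sketched here under the assumption that a bracketed timestamp always marks the start of a new entry, is to split the whole text on any run of newlines that is immediately followed by such a timestamp. `ENTRY_BOUNDARY` and `split_entries` are hypothetical names.

```python
import re

# One or more newlines, but only when the next thing is a timestamp header.
# The lookahead (?=...) checks for the header without consuming it, so the
# header stays at the start of the following chunk.
ENTRY_BOUNDARY = re.compile(
    r"\n+(?=\[[A-Za-z]{3} [A-Za-z]{3}\s+\d{1,2} \d{1,2}:\d{2}:\d{2} \d{4}\])"
)

def split_entries(text):
    """Split the full log text into entries, keeping call stacks attached."""
    chunks = ENTRY_BOUNDARY.split(text)
    # Strip surrounding whitespace and drop chunks that are empty.
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

With the text read via `ifile.read()`, a loop such as `for entry in split_entries(lines): print("LINE: " + entry)` would then treat '\n' and '\n\n' separators the same way, while the newlines inside a call stack are left alone.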
Here is how the text looks in a more raw format:
'[Thu Feb 2 14:09:07 2017][428423.8] Warning: writing 0 byte file (/the_directory/) to tar archive \n[Thu Feb 2 18:55:27 2017][449547.25] Warning: writing 0 byte file (/the_directory/) to tar archive \n[Fri Feb 3 12:21:33 2017][451135.3] (file_name:0xcb61) Invalid variable type \ncall stack:\n-----------\n[0cb61:+33] larray, r#26, fp(3), \n[031ff:Mug::Request.preHandlers+17] refcall, fp(1), string#245, # from: fp(1)\n[0339d:Mug::Request.process+77] call, addr(0x80001d), -, # Mug::Request.preHandlers()\n[02ffd:Mug::Request.recv+93] call, addr(0x800026), -, # Mug::Request.process()\n[02d03:Mug::Connection.on_client+101] refcall, fp(0), string#734, # from: fp(0)\n[14a5b:+4] refcall, fp(-2), sting#3103, # from: fp(-2)\n[1e24a:main+9664] eop, -, -, '
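Thank you for any tips, tricks, and help you're able to provide.
Have you tried 'lines.split("\n")'? – Peter
@PeterKuebler - I have, sorry, I was in a rush writing this up. The problem with that is that the call stack is always interspersed with '\n', which makes a bit of a mess. –
Ah, I didn't realize you wanted to keep those... Instead of the 'for' loop, try escaping the \ by calling 'lines.replace('\\', '\\\\')' and then 'print(lines)' without the for loop. – Peter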
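A small sketch related to the suggestion above, offered as an aside and not as the commenter's method: when the text comes from a file, each '\n' is a real newline character and not a backslash followed by "n", so replacing backslashes leaves the text unchanged. To see the raw form with visible escapes, the built-in `repr` is one option. `show_raw` is a hypothetical helper name.

```python
def show_raw(text):
    """Return the text with newlines and other escapes made visible."""
    return repr(text)

sample = "Invalid variable type \ncall stack:\n"

# Shows the string in quoted form, with each newline written as \n.
print(show_raw(sample))

# The sample contains no backslash characters, so this replace is a no-op.
print(sample.replace("\\", "\\\\") == sample)
```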