2015-02-12 58 views
1

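How do I test whether a match exists in a string when the match can be either a single string or a tuple of strings?

I'm trying to do straight string matching against lines from a text file. Sometimes a match means that a line contains several target strings at once. Currently, my code looks like this: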

interesting_matches = [ 
    "sys/bootdisk.py", 
    " engine stalled for ", 
    " changed to stalled)", 
    "DSR failure", 
    "Detected IDI failure", 
    "idi_shallow_verify_failure", 
    "Malformed block history", 
    "Out of order sequence message on", 
    "Port reset timeout of", 
    "gmp_info", 
    "test_thread", 
    " panic @ time ", 
    ": *** FAILED ASSERTION", 
    "filesystem full",] 

for match in interesting_matches:
    # Iterate through simple matches.
    if match in line:
        processed_line_data = self._process_line(
            match,
            line,
            line_datetime,
            line_num,
            current_version)

# Compound matches: every substring must appear in the line.
if "kern_sig" in line and "pid" in line:
    processed_line_data = self._process_line(
        ("kern_sig", "pid"),
        line,
        line_datetime,
        line_num,
        current_version)

if "vfs_export" in line and "ignoring" in line:
    processed_line_data = self._process_line(
        ("vfs_export", "ignoring"),
        line,
        line_datetime,
        line_num,
        current_version)

# A condition spanning several lines must be wrapped in parentheses.
if ("job_d" in line and
        "State transition from state " in line and
        " took longer than " in line):
    processed_line_data = self._process_line(
        (
            "job_d",
            "State transition from state ",
            " took longer than "),
        line,
        line_datetime,
        line_num,
        current_version)

if processed_line_data is not None:
    return_list.append(processed_line_data)

What I'd really like to do is something like this:

interesting_matches = [ 
     "sys/bootdisk.py", 
     " engine stalled for ", 
     " changed to stalled)", 
     "DSR failure", 
     "Detected IDI failure", 
     "idi_shallow_verify_failure", 
     "Malformed block history", 
     "Out of order sequence message on", 
     "Port reset timeout of", 
     "gmp_info", 
     "test_thread", 
     " panic @ time ", 
     ": *** FAILED ASSERTION", 
     "filesystem full", 
     ("kern_sig", "pid"), 
     ("vfs_export", "ignoring"), 
     ("job_d", "State transition from state", " took longer than "),] 

for matches in interesting_matches:
    if any(match in line for match in matches):
        processed_line_data = self._process_line(
            matches,
            line,
            line_datetime,
            line_num,
            current_version)

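But the mix of tuples and strings results in a value error stating that you can't compare strings and tuples.

How do I write a single comparison if I want to check for both single and multiple strings?

Edit:

Here's the working code, based on Sean's answer: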

interesting_matches = [ 
    ("sys/bootdisk.py",), 
    (" engine stalled for ",), 
    (" changed to stalled)",), 
    ("DSR failure",), 
    ("Detected IDI failure",), 
    ("idi_shallow_verify_failure",), 
    ("Malformed block history",), 
    ("Out of order sequence message on",), 
    ("Port reset timeout of",), 
    ("gmp_info",), 
    ("test_thread",), 
    (" panic @ time ",), 
    (": *** FAILED ASSERTION",), 
    ("filesystem full",), 
    ("kern_sig", "pid"), 
    ("vfs_export", "ignoring"), 
    ("job_d", "State transition from state", " took longer than "),] 

for matches in interesting_matches:
    if all(match in "test_thread" for match in matches):
        print(matches)
+0

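The term "signature" means something specific in Python, so using it may confuse people. See https://www.python.org/dev/peps/pep-0362/ – cdarke 2015-02-12 07:31:16

+0

Any suggestions for a new term? – AlexLordThorsen 2015-02-12 07:31:53

+2

"Substring" would be better – 2015-02-12 07:40:03

Answers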

1

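Very doable. Try this: wrap the single-substring signatures in a tuple anyway. Now your list is homogeneous, and your problem becomes simpler. For each signature tuple, check that all of the substrings in the signature are also in the line.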
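A minimal sketch of that idea, assuming Python 3. The helper name `find_signatures`, the shortened signature list, and the sample log lines are made up for illustration and do not come from the question's code base.

```python
def find_signatures(line, signatures):
    """Return every signature tuple whose substrings all occur in `line`."""
    return [sig for sig in signatures if all(sub in line for sub in sig)]


# Hypothetical, shortened signature list. Every entry is a tuple,
# including the single-substring ones (note the trailing comma).
signatures = [
    ("filesystem full",),
    ("kern_sig", "pid"),
    ("vfs_export", "ignoring"),
]

# Both substrings are present, so the compound signature matches.
print(find_signatures("kern_sig: pid 42 exited", signatures))

# Only one of the two substrings is present, so nothing matches.
print(find_signatures("kern_sig without the other word", signatures))
```

Using `all` rather than `any` is what makes a multi-string signature require every one of its substrings. With `any`, a line containing only "pid" would count as a hit for `("kern_sig", "pid")`.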

+0

`(("sys/bootdisk.py"))` is a `str` – AlexLordThorsen 2015-02-12 07:47:46

+2

@Rawrgulmuffins A 1-element tuple needs a trailing comma. – FMc 2015-02-12 07:48:36

+0

@FMc That's a fun little trick. =P – AlexLordThorsen 2015-02-12 07:49:42
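To illustrate the trailing-comma point from the comments: parentheses on their own only group an expression, and it is the comma that creates the tuple. The sketch below assumes Python 3. The `normalize` helper is a hypothetical convenience that nobody in the thread proposed, for the case where you would rather keep bare strings in the list and wrap them at lookup time.

```python
# Parentheses alone do not make a tuple; the comma does.
not_a_tuple = ("sys/bootdisk.py")
one_tuple = ("sys/bootdisk.py",)
print(type(not_a_tuple).__name__)  # str
print(type(one_tuple).__name__)    # tuple


def normalize(signature):
    """Wrap a bare string in a 1-tuple; pass tuples through (hypothetical helper)."""
    return (signature,) if isinstance(signature, str) else tuple(signature)


# A mixed list like the one in the question becomes homogeneous.
mixed = ["filesystem full", ("kern_sig", "pid")]
print([normalize(sig) for sig in mixed])
```

The `isinstance` check matters because a string is itself iterable: calling `tuple("abc")` would split it into single characters instead of wrapping it.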