2016-03-02 105 views
1

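Bug caused by reading a file

I am facing a rather elusive bug that seems to be caused by reading a file. I have simplified my program to demonstrate the problem.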

Consider this program, which works correctly:

import re

sourceString = "Yesterday I had a pizza for lunch it was tasty\n"
sourceString += "today I am thinking about grabbing a burger and tomorrow it\n"
sourceString += "will probably be some fish if I am lucky\n\n\n"
sourceString += "see you later!"

jj = ["pizza", "steak", "fish"]

for keyword in jj:
    regexPattern = keyword + ".*"
    patternObject = re.compile(regexPattern, re.MULTILINE)
    match = patternObject.search(sourceString)
    if match:
        print("Match found for " + keyword)
        print(match.group() + "\n")
    else:
        print("warning: no match found for :" + keyword + "\n")

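The regular expression here is very simple; the point is that I build each pattern from my list jj.

The script works as expected (it finds matches for the "pizza" and "fish" patterns, but not for "steak").

In my actual program I want to read these keywords from a file (I don't want to hard-code them in the source).

So far I have this: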

import re

sourceString = "Yesterday I had a pizza for lunch it was tasty\n"
sourceString += "today I am thinking about grabbing a burger and tomorrow it\n"
sourceString += "will probably be some fish if I am lucky\n\n\n"
sourceString += "see you later!"

with open('keyWords.txt', 'r') as f:
    for keyword in f:
        regexPattern = keyword + ".*"
        patternObject = re.compile(regexPattern, re.MULTILINE)
        match = patternObject.search(sourceString)
        if match:
            print("Match found for " + keyword)
            print(match.group())
        else:
            print("warning: no match found for :" + keyword)

where keyWords.txt contains the following:

pizza 
steak 
fish 

But this breaks the code: somehow only the LAST keyword in the file is matched successfully (if a match exists).

What gives?

+0

Don't just assume it's a bug. It's simply that every line has a newline character at the end, which you haven't accounted for. – zondo

+1

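...which means that, because I didn't account for the newline, the program has a bug, right? I didn't say the language specification was flawed. – ForeverStudent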

+0

I'm sorry; I misunderstood. – zondo
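
The trailing newline mentioned in the comments is easy to see with repr(). The snippet below is only a sketch: io.StringIO stands in for keyWords.txt, and it assumes the last line of the file has no trailing newline, which would explain why only the last keyword matched.

```python
import io
import re

# Stand-in for keyWords.txt (assumption: no newline after the last keyword).
keyword_file = io.StringIO("pizza\nsteak\nfish")

source = "Yesterday I had a pizza for lunch it was tasty\n"

# Iterating over a file object yields each line with its line ending attached.
lines = list(keyword_file)
print([repr(line) for line in lines])

# "pizza\n" + ".*" can only match where a newline directly follows "pizza".
unstripped = re.search(lines[0] + ".*", source)
# Without the newline the pattern behaves like the hard-coded version.
stripped = re.search(lines[0].strip() + ".*", source)

print(unstripped)
print(stripped.group())
```

Here the unstripped pattern finds nothing, because in the text "pizza" is followed by " for lunch" rather than by a line break.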

Answers

3
with open('keyWords.txt', 'r') as f:
    for keyword in f:
        # strip() removes the trailing newline before the pattern is built
        regexPattern = keyword.strip() + ".*"

Call strip() on keyword to remove any newline characters. If you know for certain that there will be no leading whitespace, rstrip() is enough.
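
For completeness, here is roughly how the whole loop could look with the fix applied. This is a sketch rather than the asker's actual program: find_keyword_matches is a hypothetical helper name, skipping blank lines is an extra assumption, and io.StringIO again stands in for the real file.

```python
import io
import re


def find_keyword_matches(lines, text):
    """Hypothetical helper: map each non-blank keyword to its match (or None)."""
    results = {}
    for line in lines:
        keyword = line.strip()  # drop the trailing newline and any padding
        if not keyword:
            continue  # assumption: blank lines in the keyword file are ignored
        match = re.search(keyword + ".*", text, re.MULTILINE)
        results[keyword] = match.group() if match else None
    return results


source = (
    "Yesterday I had a pizza for lunch it was tasty\n"
    "today I am thinking about grabbing a burger and tomorrow it\n"
    "will probably be some fish if I am lucky\n\n\n"
    "see you later!"
)

# io.StringIO stands in for open('keyWords.txt', 'r'); a real file object
# can be passed to the helper in the same way.
keyword_file = io.StringIO("pizza\nsteak\nfish\n")
results = find_keyword_matches(keyword_file, source)

for keyword, matched in results.items():
    if matched:
        print("Match found for " + keyword + ": " + matched)
    else:
        print("warning: no match found for: " + keyword)
```

If the keywords might contain regex metacharacters such as "." or "+", wrapping them in re.escape() before building the pattern would be worth considering, though the question does not say whether that applies.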

+0

That solved the problem. Much obliged. – ForeverStudent