2017-10-18 85 views

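re.findall() does not find lines from one file in another file

I have two text files: one contains the text of an article, the other contains a list of phrasal verbs. I am trying to find every instance of each phrasal verb in the article. I know the article contains the phrasal verb "log on", and so does the phrasal verb list. When I loop over the phrasal verbs and search for each one with re.findall(), it finds nothing. When I manually start the loop at line 1199 of the phrasal verb list, which happens to be "Log on", it finds it. When I start just one line earlier, at line 1198, it does not find it. Here is my code: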

import re 
PV_HI = [] 
file = open('article.txt') 
for line in open('phrasalVerbs.txt'): 
    pv = line.strip() 
    pvFound = re.findall(pv, file.read(), flags=re.I) 
    PV_HI.extend(pvFound) 
print(PV_HI) 

Here is a sample of the phrasal verb list file:

Lock onto 
Lock out 
Lock up 
Lock away 
Log in 
Log into 
Log off 
Log on 
Log out 
Look after 
Look back 
Look down on 
Look for 
Look forward to 
Look in 
Look in on 
Look into 

And a sample of the article file:

<p> If you have a business account, a higher Pay Anyone limit up to $500,000 and also have a Security Device to authorise third party payments and/or can add Operators, you are an ANZ Internet Banking for Business customer. 
<p> How do I manage my accounts once I am registered for ANZ Internet Banking? 
<p> If you have registered for ANZ Internet Banking, use your CRN and password to log on to ANZ Internet Banking. 
<p> If you need help while logged on to ANZ Internet Banking, click the " Help " icon in the top right hand corner of all pages. 

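Ultimately, what I am trying to do is get counts of all the phrasal verbs across a set of 1,600 files. If there is a better way to do this, I am definitely open to suggestions.

Thanks!

Matt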

Answers

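I saved the samples of the phrasal verb list and the article file (with "Log on" appended at the end as the phrase to find), then ran some tests with your Python code. At first I could not get any results either. But it worked once I changed the code as follows: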

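import re

PV_HI = []
# Read the article once and keep its text; a second f.read() would return ''.
with open('article.txt', 'r') as f:
    article_content = f.read()

# Search the stored text for each phrasal verb, ignoring case.
with open('phrasalVerbs.txt') as verbs:
    for line in verbs:
        pv = line.strip()
        pvFound = re.findall(pv, article_content, flags=re.I)
        PV_HI.extend(pvFound)
print(PV_HI)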

It works and successfully finds "log on". Hope that helps.
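
For the larger goal mentioned in the question, counting phrasal verbs across many files, one possible approach is to compile all the verbs into a single pattern and tally matches with collections.Counter. This is only a sketch: the function names build_pattern, count_phrasal_verbs and count_in_files are made up for illustration, the UTF-8 encoding is an assumption, and the use of re.escape() and \b word boundaries is an addition that the code above does not have. It matches the listed forms only, so an inflected form such as "logged on" is not counted.

```python
import re
from collections import Counter


def build_pattern(phrasal_verbs):
    """Compile one case-insensitive pattern that matches any listed phrasal verb."""
    # Longest first, so "Look in on" is tried before "Look in".
    ordered = sorted(set(phrasal_verbs), key=len, reverse=True)
    # re.escape() keeps any regex metacharacters in the list literal.
    alternatives = "|".join(re.escape(pv) for pv in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", flags=re.IGNORECASE)


def count_phrasal_verbs(text, pattern):
    """Return a Counter mapping each lower-cased phrasal verb to its number of hits."""
    return Counter(match.group(0).lower() for match in pattern.finditer(text))


def count_in_files(paths, pattern):
    """Sum the counts over several files, reading each file exactly once."""
    totals = Counter()
    for path in paths:
        # Assumption: the files are UTF-8 text.
        with open(path, encoding="utf-8") as f:
            totals.update(count_phrasal_verbs(f.read(), pattern))
    return totals


# Small in-memory check, using verbs from the sample list in the question.
verbs = ["Log in", "Log on", "Look in", "Look in on"]
pattern = build_pattern(verbs)
sample = "Use your CRN to log on. If you need help while logged on, look in on the Help page."
totals = count_phrasal_verbs(sample, pattern)
print(totals)
```

With real data, the verb list would come from the non-empty stripped lines of phrasalVerbs.txt, and count_in_files() would be given the paths of the 1,600 articles.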

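Wow! That's great, thank you so much! I'll note that when I comment out 'article_content = f.read()' and pass 'f.read()' directly as the string argument to 're.findall()', it does not work, so storing the result of 'f.read()' in a variable is crucial here. Thanks again! – MattR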
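
This observation matches how Python file objects behave: read() with no argument consumes everything up to the end of the file, so any later read() on the same object returns an empty string. In the original loop, only the first phrasal verb was therefore searched against the real article text, which would also explain why starting the loop at the "Log on" line found a match. A minimal sketch of the effect, using io.StringIO as a stand-in for an opened file (an assumption made so the example needs no files on disk):

```python
import io

# io.StringIO stands in for a text file opened with open(); both keep a
# read position that advances as data is read.
f = io.StringIO("use your CRN and password to log on\n")

first = f.read()    # reads everything up to the end of the stream
second = f.read()   # the position is already at the end, so nothing is left

print(repr(first))
print(repr(second))

f.seek(0)           # rewinding to the start makes the content readable again
third = f.read()
print(first == third)
```

Reading once into a variable, as in the answer, avoids both the empty result and the cost of re-reading the file for every phrasal verb.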

Glad to help! :D –
