Compare two text files (order does not matter) and output the words the two files have in common to a third file

I'm just starting out with programming, and I'm trying to compare two files that look like this:

file1: 
tootsie roll 
apple 
in the evening 

file2: 
hello world 
do something 
apple 

output: 
"Apple appears x times in file 1 and file 2"

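I'm really stumped. I've tried creating dictionaries, lists, tuples, and sets, and I can't seem to get the output I want. The closest I've gotten is output of the lines exactly as they appear in file1/file2.

I've tried several code snippets, and I can't get any of them to output what I want. Any help would be greatly appreciated!!

This is the last piece of code I tried; it doesn't write any output to my third file.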

f1 = open("C:\\Users\\Cory\\Desktop\\try.txt", 'r') 
f2 = open("C:\\Users\\Cory\\Desktop\\match.txt", 'r') 
output = open("C:\\Users\\Cory\\Desktop\\output.txt", 'w') 

file1 = set(f1) 
file2 = set(f2) 
file(word,freq) 
for line in f2: 
    word, freq = line.split() 
    if word in words: 
     output.write("Both files have the following words: " + file1.intersection(file2)) 
f1.close() 
f2.close() 
output.close()


2015-11-20 Cory Gottfried

What output do you want, exactly? – vincent

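I want my third file to have a line of output for each word that matches across the files (for example, if apple appears anywhere in file 1 and apple appears anywhere in file 2, I would get the output Apple: x, where x = the number of times apple appears in both files). In other words, I want to know how many times the word occurs in the two files. –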

You don't need all those loops. If the files are small (i.e., smaller than a few hundred MB), you can work with them more directly:

words1 = f1.read().split() 
words2 = f2.read().split() 
words = set(words1) & set(words2)

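After this, words will be a set containing all the words the two files have in common. To ignore case, you can call lower() on the text before splitting it.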
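As a minimal sketch of the lower() suggestion, assuming the file contents are already in memory (the two strings below are made-up stand-ins for f1.read() and f2.read(), not the asker's real files):

```python
# Hypothetical stand-ins for the text returned by f1.read() and f2.read().
text1 = "tootsie roll\nApple\nin the evening\n"
text2 = "hello world\ndo something\napple\n"

# Lowercase before splitting so that "Apple" and "apple" compare equal.
words1 = text1.lower().split()
words2 = text2.lower().split()

# Set intersection keeps only the words that occur in both lists.
words = set(words1) & set(words2)
print(words)  # prints {'apple'}
```

Without the lower() calls, "Apple" and "apple" would be treated as different words and the intersection here would be empty.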

To get the count for each word, as you mentioned in the comments, just use the count() method:

with open('outfile.txt', 'w') as output: 
    for word in words: 
        # One line per shared word, with its count in each file.
        output.write('{} appears {} times in f1 and {} times in f2.\n'.format(word, words1.count(word), words2.count(word)))
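Note that count() rescans the whole list once per word, which is fine for small files. If that ever becomes slow, one possible alternative (not part of the answer above) is collections.Counter, which tallies every word in a single pass. The function name common_word_report and the sample strings are hypothetical, chosen only for illustration:

```python
from collections import Counter

def common_word_report(text1, text2):
    """Return one report line per word that occurs in both texts."""
    # Tally every word once, ignoring case.
    counts1 = Counter(text1.lower().split())
    counts2 = Counter(text2.lower().split())
    # Words present in both counters are the shared words; sort for stable output.
    shared = sorted(counts1.keys() & counts2.keys())
    return ['{} appears {} times in f1 and {} times in f2.'.format(
                word, counts1[word], counts2[word])
            for word in shared]

# Made-up sample text standing in for f1.read() and f2.read().
lines = common_word_report("apple pie\napple\n", "Apple\nhello world\n")
print("\n".join(lines))
```

The returned lines can be written to the output file with output.write('\n'.join(lines) + '\n') inside the same with block as before.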


2015-11-20 02:20:15 TigerhawkT3
