Comparing 2 files in Python


I want to compare two files and append some information to the matching lines.

My file 1 looks like this:

1 234 332  4 
2 345 435  6 
3 546 325  3 
4 984 493  9

My file 2 looks like this:

1 234 332  a b c d 
2 345 435  a b c d 
4 984 493  a b c d

And I want the following output:

1 234 332  4 a b c d 
2 345 435  6 a b c d 
4 984 493  9 a b c d

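In other words: I want to compare columns 1, 2 and 3. If they are equal, I want those three columns, then column 4 of file 1, followed by the remaining columns of file 2.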

I wrote the following code in Python:

with open('file1.txt') as f1, open('file2.txt') as f2, open('output_file.txt', 'w') as outfile:
    for line1, line2 in zip(f1, f2):
        columns1 = line1.rstrip("\n").split("\t")
        columns2 = line2.rstrip("\n").split("\t")
        if columns1[0] == columns2[0] and columns1[1] == columns2[1] and columns1[2] == columns2[2]:
            print >> outfile, columns2[0],columns2[1],columns2[2],columns1[3],columns2[3],columns2[4],columns2[5],columns2[6]

And I get the following result:

1 234 332  4 a b c d 
2 345 435  6 a b c d

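My problem is that my code compares the files line by line:

line1 with line1

line2 with line2

When my code compares line3 with line3, they are not equal and the program stops. How can I compare line3 with line4, and so on, if line 3 does not match?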

Asked 2015-03-13 by Pol

https://docs.python.org/3/library/difflib.html – davidism 2015-03-13 17:50:22

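Your problem is not that your code compares line by line, but that you are not comparing each line of one file against every line of the other. You need nested loops to cross-check every entry. – Josh 2015-03-13 18:21:09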
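
To make the comment above concrete, here is a small sketch (Python 3) that uses the sample rows from the question as in-memory lists instead of files. The list names and the `key` helper are made up for illustration. It suggests why `zip` falls short here: it only ever pairs row i with row i and stops at the shorter input, so once file 2 skips a row, the remaining rows are never lined up.

```python
# Sample rows from the question, already split into columns.
file1_rows = [
    ["1", "234", "332", "4"],
    ["2", "345", "435", "6"],
    ["3", "546", "325", "3"],
    ["4", "984", "493", "9"],
]
file2_rows = [
    ["1", "234", "332", "a", "b", "c", "d"],
    ["2", "345", "435", "a", "b", "c", "d"],
    ["4", "984", "493", "a", "b", "c", "d"],
]


def key(row):
    """Hypothetical helper: the first three columns identify a row."""
    return tuple(row[:3])


# zip pairs rows purely by position and stops at the shorter input.
pairs = [(key(r1), key(r2)) for r1, r2 in zip(file1_rows, file2_rows)]
matches = [k1 == k2 for k1, k2 in pairs]
print(matches)
```

The third pair compares row 3 of file 1 with row 4 of file 2, and row 4 of file 1 is never examined at all because `zip` has already run out of rows from file 2.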

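If you don't expect your files to be too large, why not just iterate over every line of file 2 for each line of file 1, and break when/if you find a match (assuming you expect no more than one match per line)?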

f1 = open('file1.txt')
f2 = open('file2.txt')
outfile = open('output_file.txt', 'w')

for line1 in f1:
    columns1 = line1.rstrip("\n").split("\t")
    for line2 in f2:
        columns2 = line2.rstrip("\n").split("\t")
        if columns1[0] == columns2[0] and columns1[1] == columns2[1] and columns1[2] == columns2[2]:
            print >> outfile, columns2[0],columns2[1],columns2[2],columns1[3],columns2[3],columns2[4],columns2[5],columns2[6]
            break
f1.close()
f2.close()
outfile.close()

Answered 2015-03-13 18:00:23 by Malonge

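Your answer looks correct to me, but it still doesn't work (same problem: the script stops at the last matching line and ignores the lines after it). – Pol 2015-03-16 08:43:44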
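
A likely cause of the behaviour described in this comment: a file object is an iterator that can be consumed only once. When a row of file 1 has no match, the inner `for line2 in f2` loop reads file 2 to the end, so later rows of file 1 have nothing left to compare against. One way around this, sketched below for Python 3, is to index file 2 by its first three columns in a dictionary and then look up each row of file 1. The names `merge_rows` and `merge_files`, and the choice to split on any whitespace rather than on tabs, are assumptions for illustration and not part of the original answer.

```python
def merge_rows(lines1, lines2):
    """Join rows whose first three columns match.

    lines1, lines2: iterables of text lines (open files work too).
    Returns merged lines: the three key columns, column 4 of the
    first input, then the remaining columns of the second input.
    """
    # Index the second input by its three key columns.
    # If a key occurs more than once, the last occurrence wins.
    lookup = {}
    for line in lines2:
        cols = line.split()
        if len(cols) >= 3:
            lookup[tuple(cols[:3])] = cols[3:]

    merged = []
    for line in lines1:
        cols = line.split()
        if len(cols) < 4:
            continue  # skip blank or short lines
        rest = lookup.get(tuple(cols[:3]))
        if rest is not None:
            merged.append(" ".join(cols[:4] + rest))
    return merged


def merge_files(path1, path2, out_path):
    """Hypothetical wrapper: read two files and write the merged rows."""
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        for row in merge_rows(f1, f2):
            out.write(row + "\n")
```

Calling `merge_files('file1.txt', 'file2.txt', 'output_file.txt')` would then read file 2 once and file 1 once. If you would rather keep the nested loops, calling `f2.seek(0)` before each inner loop rewinds file 2, at the cost of rereading it for every row of file 1.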
