Error when editing a text file

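I have a large file like the input below, in which every record is 4 lines long and starts with a line beginning with @. The second line (the one after the @ line) is a string of characters, and for some IDs this line is missing. When that is the case, I want to delete all 4 lines belonging to that ID.
I tried the Python code below, but it gives an error.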

Input:

@M00872:361:000000000-D2GK2:1:1101:16003:1351 1:N:0:1 
ATCCGGCTCGGAGGA 
+ 
1AA?ADDDADDAGGG 
@M00872:361:000000000-D2GK2:1:1101:15326:1352 1:N:0:1 
GCGCAGCGGAAGCGTGCTGGG 
+ 
CCCCBCDCCCCCGGEGGGGGG 
@M00872:361:000000000-D2GK2:1:1101:16217:1352 1:N:0:1 

+

Output:

@M00872:361:000000000-D2GK2:1:1101:16003:1351 1:N:0:1 
ATCCGGCTCGGAGGA 
+ 
1AA?ADDDADDAGGG 
@M00872:361:000000000-D2GK2:1:1101:15326:1352 1:N:0:1 
GCGCAGCGGAAGCGTGCTGGG 
+ 
CCCCBCDCCCCCGGEGGGGGG 


import fileinput

with fileinput.input(files="4415_pool.fastq", inplace=True, backup="file.bak") as f:
    for l in f:
        if l.strip().startswith("@"):
            c = 2
            next_line = f.readline().strip()
            if not next_line:
                while c:
                    c -= 1
                    try:
                        next(f)
                    except StopIteration:
                        break
            else:
                print(l.strip())
                print(next_line.strip())
                while c:
                    c -= 1
                    try:
                        print(next(f).strip())
                    except StopIteration:
                        break

It did not work and gave this error:

Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
AttributeError: FileInput instance has no attribute '__exit__'

Do you know how to fix this?


2017-05-29 ARM

Which Python version are you using? I think older versions do not support using fileinput in a with statement. If so, use f = fileinput.input(files="4415_pool_TCP_Ctrl.fastq", inplace=True, backup="file.bak") instead.

The Python version is 2.7 – ARM

It looks like the fileinput.FileInput class does not implement __exit__(), a method it needs if you want to use it in a with fileinput.input() ... statement.
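One possible workaround that keeps the with statement is contextlib.closing(), which wraps any object that has a close() method and supplies the missing __enter__/__exit__ pair. The sketch below is only an illustration: count_lines is a hypothetical helper, not part of the question's code, and it reads the file without inplace editing.

```python
import contextlib
import fileinput


def count_lines(path):
    """Count the lines of `path` using fileinput inside a with block."""
    # contextlib.closing() calls f.close() when the block exits, so this works
    # even where FileInput itself is not a context manager (e.g. Python 2.7).
    with contextlib.closing(fileinput.input(files=[path])) as f:
        return sum(1 for _ in f)
```

The same wrapper should apply to the inplace call from the question, since closing() only relies on FileInput.close().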


2017-05-29 13:32:15 Shai

I think the problem is the Python version (2.7), which does not support using FileInput in a with statement.

Use

f = fileinput.input(files="4415_pool.fastq", inplace=True, backup="file.bak")

instead of

with fileinput.input(files="4415_pool.fastq", inplace=True, backup="file.bak") as f:
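Dropping the with statement means the FileInput object has to be closed by hand. A minimal sketch of the whole filter written that way follows. It assumes every record is exactly 4 lines; drop_empty_records and its backup_ext parameter are hypothetical names introduced for illustration. A truncated record at the end of the file is dropped rather than written.

```python
import fileinput
import sys


def drop_empty_records(path, backup_ext=".bak"):
    """Rewrite `path` in place, dropping 4-line records with an empty sequence line."""
    f = fileinput.input(files=[path], inplace=True, backup=backup_ext)
    try:
        while True:
            # Pull one record; next() yields "" once the input is exhausted.
            record = [next(f, "") for _ in range(4)]
            if not record[0]:
                break
            # all(record) is False for a truncated final record, which is skipped.
            if all(record) and record[1].strip():
                for line in record:
                    # In inplace mode sys.stdout points at the file being rewritten.
                    sys.stdout.write(line)
    finally:
        # Restores sys.stdout and closes the files, as __exit__ would in Python 3.
        f.close()
```

The try/finally block plays the role the with statement has in Python 3: the original file is kept as path + backup_ext, and stdout is restored even if an exception is raised midway.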


2017-05-29 13:35:22

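Although the with statement was added in 2.5, I don't think fileinput was ever ported to support it (contextlib?).

Your code will work in Python 3, but not in 2.7. To fix this, either use Python 3 or port the code to iterate over the lines, like: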

filename = "4415_pool.fastq"

with open(filename, "r") as f:
    lines = f.readlines()

for line in lines:
    # do whatever you need to do for each line.
    pass


2017-05-29 13:36:59 jnvilo

As for a solution to your problem (in 2.7), I would do something like this:

# Read all the lines into a buffer
with open('input.fastq', 'r') as source:
    source_buff = iter(source.readlines())

with open('output.fastq', 'w') as out_file:
    for line in source_buff:
        if line.strip().startswith('@'):
            prev_line = line
            # the default '' avoids StopIteration on a truncated final record
            line = next(source_buff, '')

            if line.strip():
                # if the 2nd line is not empty, write the whole block to the output file
                out_file.write(prev_line)
                out_file.write(line)
                out_file.write(next(source_buff, ''))
                out_file.write(next(source_buff, ''))
            # otherwise skip the block: its remaining lines do not start
            # with '@', so later iterations ignore them

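I know .fastq files can sometimes be very large, so I would not recommend reading the whole file into a buffer. Instead, put this code in a loop that reads 4 lines (or however many lines a block has) at a time.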
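A rough sketch of that block-at-a-time loop, using itertools.islice to pull 4 lines per iteration so only one record is held in memory. The function name filter_fastq and the lines_per_record parameter are made up for this example, and it assumes strictly 4-line records (no wrapped sequence lines). An incomplete record at the end of the file is dropped.

```python
from itertools import islice


def filter_fastq(src_path, dst_path, lines_per_record=4):
    """Copy records whose sequence line is non-empty; return how many were kept."""
    kept = 0
    with open(src_path, "r") as source, open(dst_path, "w") as out_file:
        while True:
            # Read the next block of lines without loading the rest of the file.
            record = list(islice(source, lines_per_record))
            if not record:
                break
            # Keep only complete records whose 2nd line (the sequence) has content.
            if len(record) == lines_per_record and record[1].strip():
                out_file.writelines(record)
                kept += 1
    return kept
```

For example, filter_fastq('input.fastq', 'output.fastq') would write the filtered records to a new file and leave the input untouched, which also avoids the need for a backup copy.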


2017-05-29 13:45:45 TasosGlrs
