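Writing data between certain strings in a text file (error on the last element)

I have several .txt files, each with more than 500,000 lines. Each of them contains sections that I want to extract into their own .txt files.

For this I use the following code: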

for i, structure in enumerate(structures):
    with open("data.txt", 'r') as f:
        structure_data = open('data_new.txt', 'w')
        copy = False
        for line in f:
            if line.strip() == "Structure: {}".format(structures[i]):
                # Start marker: begin copying, including the header line.
                structure_data.write(line)
                copy = True
            elif line.strip() == "Structure: {}".format(structures[i+1]):
                # End marker: raises IndexError when i is the last index.
                copy = False
            elif copy:
                structure_data.write(line)
        structure_data.close()

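Here structures is a list of the structures I have.

Basically, each .txt file contains lines that say Structure: <some structure in the structures list>. I want to extract the data in the file that lies between the two strings for structures[i] and structures[i+1]. The code above does this, and I get the new .txt files with the data I want, but I also get the following error: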

elif line.strip() == "Structure: {}".format(structures[i+1]): 
IndexError: list index out of range

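The reason, as far as I can tell, is that the last section of the .txt file has no "end" line of the form Structure: <structure>, so copy can never be set back to False.

So I do get the .txt output I want, but as you know, there is nothing worse than code that throws errors. Is there a way to tell it that if there is no such "end line", then reading to the end of the file is fine?

UPDATE: This is roughly what the data in data.txt might look like: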
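
To make the failure concrete, here is a minimal sketch that is not part of the original code. It assumes the three structure names from the sample data, and the helper name next_structure is hypothetical. It shows that the lookup structures[i+1] itself fails once i is the last index, regardless of what the file contains:

```python
structures = ["TR", "SV", "DY"]  # example names taken from the sample data


def next_structure(structures, i):
    """Look up the structure that would serve as the end marker for index i."""
    return structures[i + 1]


for i in range(len(structures)):
    try:
        print(i, next_structure(structures, i))
    except IndexError as exc:
        # Reached for the last index, where no following structure exists.
        print(i, "IndexError:", exc)
```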

Structure: TR 

Dose [cGy] Ratio of Total Structure Volume [%] 
     0      100 
    0.100619      100 
    0.201238      100 
    0.301857      100 
    0.402476      100 
    0.503096      100 
    0.603715      100 
    0.704334      100 
    0.804953      100 
    0.905572      100 

Structure: SV 


Dose [cGy] Ratio of Total Structure Volume [%] 
     0      100 
    0.100619      100 
    0.201238      100 
    0.301857      100 
    0.402476      100 
    0.503096      100 
    0.603715      100 
    0.704334      100 
    0.804953      100 
    0.905572      100 


Structure: DY 

Dose [cGy] Ratio of Total Structure Volume [%] 
     0      100 
    0.100619     88.2441 
    0.201238     76.4882 
    0.301857     64.7324 
    0.402476     52.9765 
    0.503096     41.2206 
    0.603715     29.4647 
    0.704334     17.707 
    0.804953     17.6784 
    0.905572     17.6499

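So in this case the structures list contains the structures TR, SV and DY.

In the for line in f loop I want to take the text/data between the Structure: structures[i] line and the Structure: structures[i+1] line, save it to a text file, and repeat until the loop has gone through the whole structures list. But as mentioned, when I reach the last section there is no closing Structure: structures[i+1] line, so I get an error. That error is what I want to avoid.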

2017-10-18 Denver Dang

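Could you please include some sample input and output? I've read this a few times and I'm not sure I understand what you are trying to do. – roganjosh

Coming up in 2 seconds... –

A simple solution is to append a dummy structure to the end of structures, one that does not appear anywhere in the file. Then you can write your loop like this: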

for structure1, structure2 in zip(structures[:-1], structures[1:]):

This iterates over all pairs of consecutive structures.

Another solution (which avoids the dummy structure) is to replace

elif line.strip() == "Structure: {}".format(structures[i+1]):

with
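
As a rough sketch of how the dummy-structure idea could fit into the original loop: the example below is an illustration under assumptions, not the asker's final code. The function name extract_sections and the DUMMY value are hypothetical, and the function collects lines into a dict instead of writing files so that it can be run without any data on disk.

```python
DUMMY = "__NO_SUCH_STRUCTURE__"  # assumed never to appear in the data file


def extract_sections(lines, structures):
    """Return {structure: [lines]} for each structure in the list.

    `lines` must be re-iterable (for example a list), because it is
    scanned once per structure, as in the original code.
    """
    sections = {}
    padded = list(structures) + [DUMMY]
    # Pair every structure with the one that follows it; the last real
    # structure is paired with the dummy, which never matches a line.
    for current, following in zip(padded[:-1], padded[1:]):
        start = "Structure: {}".format(current)
        end = "Structure: {}".format(following)
        collected = []
        copy = False
        for line in lines:
            stripped = line.strip()
            if stripped == start:
                collected.append(line)
                copy = True
            elif stripped == end:
                copy = False
            elif copy:
                collected.append(line)
        sections[current] = collected
    return sections


sample = [
    "Structure: TR \n", "0 100\n",
    "Structure: SV \n", "1 50\n",
    "Structure: DY \n", "2 25\n",
]
result = extract_sections(sample, ["TR", "SV", "DY"])
```

Because the dummy end marker never matches, copying for the last structure simply continues to the end of the input.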

elif i+1 != len(structures) and line.strip() == "Structure: {}".format(structures[i+1]):

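The second part of the condition (the part that would cause the error) is not evaluated if the first part is false. If you decide to use this version, note that you never actually use the variable structure anywhere, so you may want to replace

for i, structure in enumerate(structures):

with

for i in range(len(structures)):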
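
The short-circuit behaviour of the guarded condition can be sketched as a small helper. The name is_end_marker is hypothetical and only wraps the condition shown in this answer:

```python
def is_end_marker(line, structures, i):
    """True if `line` is the header of the structure following index i.

    The length check comes first, so structures[i + 1] is never
    evaluated when i is the last index.
    """
    return (i + 1 != len(structures)
            and line.strip() == "Structure: {}".format(structures[i + 1]))


structures = ["TR", "SV", "DY"]  # example names taken from the sample data
print(is_end_marker("Structure: SV \n", structures, 0))
print(is_end_marker("Structure: SV \n", structures, 2))
```

With this guard, the last structure never sees an end marker, so copy stays True until the end of the file.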

2017-10-18 19:19:11 Knoep

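Yes, this gets rid of the error, but it does not "pick up" the last structure, that is, the text/data that has no second structure after it... –

@DenverDang Sorry, I missed that. See the update; I hope this time it is what you want :) – Knoep

The dummy approach is excellent, and probably the simplest way to do this, imo :) It works now. Thanks! –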
