Searching for strings in both directions from a line number

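Using Python 2.4, I am reading a large flat file and picking out a particular line number. Now I want to search backwards from that line for a string, e.g. START, and forwards from that line for the string END.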

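How do I get the line numbers of the nearest occurrence of START (before the current line number) and of END (after the current line number)?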

2011-05-30 GPX

How about:

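# Assumes infile is an open file object, target_line_no is 1-based,
# and START / END hold the marker strings to look for.
line_no = 0

# Seek the last START before reaching the target line.
start_line_no = -1
while line_no < target_line_no - 1:
    line = infile.readline()
    if line == "":
        # File is shorter than you think.
        break
    line_no += 1
    if START in line:
        start_line_no = line_no

# Skip the target line itself so END is only searched for after it.
if infile.readline() != "":
    line_no += 1

# Seek the first END after the target line.
end_line_no = -1
while True:
    line = infile.readline()
    if line == "":
        # END could not be found.
        break
    line_no += 1
    if END in line:
        end_line_no = line_no
        break

print("%d %d" % (start_line_no, end_line_no))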
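
The same single-pass scan could also be wrapped in a function, which makes it easier to test on a small sample. This is only a sketch: the name find_markers and its parameters are made up for illustration, and it relies on enumerate with a start argument, so it needs a newer Python than the 2.4 mentioned in the question.

```python
def find_markers(lines, target_line_no, start="START", end="END"):
    """Return (start_line_no, end_line_no) around a 1-based target line.

    Either value is -1 when the corresponding marker is not found.
    """
    start_line_no = -1
    end_line_no = -1
    for line_no, line in enumerate(lines, 1):
        if line_no < target_line_no:
            # Remember the most recent START seen before the target.
            if start in line:
                start_line_no = line_no
        elif line_no > target_line_no and end in line:
            # The first END after the target ends the scan.
            end_line_no = line_no
            break
    return start_line_no, end_line_no


# Hypothetical sample data; any iterable of lines, including an open file, works.
sample = ["a\n", "START\n", "b\n", "START x\n", "target\n", "c\n", "END\n", "END\n"]
result = find_markers(sample, 5)
print(result)
```

Because the function stops at the first END after the target, it never reads the rest of the file, and passing an open file object keeps memory use at one line at a time.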


2011-05-30 11:07:33

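Thanks for the code. But say I'm working with a very large file, and the content I'm looking for is in the middle of the file. Wouldn't searching every line from the beginning be rather expensive? Is there a more efficient way to do this? – GPX 2011-05-30 11:09:36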

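First get it right, then measure, then see whether it needs optimizing. Premature optimization based on guessed inefficiencies is the root of all evil. – msw 2011-05-30 11:14:11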

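If the lines are of arbitrary length, there is no more efficient way than this (other than preprocessing the file for future operations). Sequential file access is a very fast operation. I would try it before looking for tricky optimizations. – 2011-05-30 11:17:51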
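
As a rough illustration of what "preprocessing the file for future operations" might look like, one option is to record the byte offset of every line once, then seek straight to a given line on later runs. This is a sketch under assumptions, not something proposed in the thread: build_line_index and read_line are hypothetical helper names, the file is opened in binary mode so offsets are exact, and the code targets a current Python rather than 2.4.

```python
import os
import tempfile


def build_line_index(path):
    """Return a list where entry i is the byte offset of line i + 1."""
    offsets = []
    position = 0
    with open(path, "rb") as handle:
        for line in handle:
            offsets.append(position)
            position += len(line)
    return offsets


def read_line(path, offsets, line_no):
    """Read the 1-based line `line_no` by seeking straight to its offset."""
    with open(path, "rb") as handle:
        handle.seek(offsets[line_no - 1])
        return handle.readline()


# Tiny demonstration on a temporary file (hypothetical contents).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as handle:
    handle.write(b"one\nSTART\nthree\nEND\n")
demo_offsets = build_line_index(demo_path)
third_line = read_line(demo_path, demo_offsets, 3)
os.remove(demo_path)
```

Building the index still costs one full sequential pass, so it only pays off when the same file is queried many times. With the offsets in hand, a backward search for START can walk the index downwards from the target line instead of rereading the file from the top.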
