2017-05-27 92 views
0

Search for a string and print from one line above it to another search string in Python

I have a file like the following (named temp1):

Basket1 
10 Pens I have in Packet1 
20 Books I have in Packet1 
30 Red pens I have in Packet1 
End here 
Basket1 
10 apples I have in Packet2 
20 Mangos I have in Packet2 
30 oranges I have in Packet2. 
End here 

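I have written the code below. It searches between the start line and the end line and prints everything in between, including the start and end lines themselves.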

import re
import sys

start_line = "Pens I have"
end_line = "End here"
print_lines = False
with open('temp1', 'r') as f:
    for line in f:
        line = line.strip()
        if re.search(start_line, line):
            print_lines = True
        if print_lines:
            # Redirect stdout so that print writes to temp2
            temp = open("temp2", 'a')
            sys.stdout = temp
            print(line)
        if re.search(end_line, line):
            print_lines = False
            temp.close()
            sys.stdout = sys.__stdout__

The output I get:

10 Pens I have in Packet1 
20 Books I have in Packet1 
30 Red pens I have in Packet1 
End here  

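I need help printing to the file temp2 starting from the line above the start line and continuing through the end line. Below is the expected content of temp2.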

Basket1 
10 Pens I have in Packet1 
20 Books I have in Packet1 
30 Red pens I have in Packet1 
End here 

Please state the problem you are facing. – JkShaw

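Hi JkShaw, the problem is that I need to print to a file from the line above the start line through the end line. Right now I can only print from the start line to the end line. – kitty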

You are already printing the lines to 'temp2'. –

Answers

0

You can use a regular expression to search for the string. To read from one file and write the matches to another, you can do this:

import re 

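with open('temp1', 'r') as f1, open('temp2', 'a') as f2:
    # re.DOTALL lets .*? span newlines; the raw string passes \w and \n to re unchanged
    results = re.findall(r'\w+\n10 Pens I.*?End here', f1.read(), re.DOTALL)
    # End each match with a newline so appended blocks do not run together
    f2.writelines(block + '\n' for block in results)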

Example:

import re 

s = '''Basket1
10 Pens I have in Packet1
20 Books I have in Packet1
30 Red pens I have in Packet1
End here
Basket1
10 apples I have in Packet2
20 Mangos I have in Packet2
30 oranges I have in Packet2.
End here'''

# re.search returns only the first match; use re.findall to collect every match
result = re.search(r'\w+\n10 Pens I.*?End here', s, re.DOTALL)

# With re.findall the result is a list of strings, so print(result) directly instead
print(result.group())

# output: 

Basket1 
10 Pens I have in Packet1 
20 Books I have in Packet1 
30 Red pens I have in Packet1 
End here 
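
The same idea can be generalized so that the marker strings are not hard-coded into the pattern. What follows is only a minimal sketch, not the answerer's code: the helper name `extract_blocks` and its parameters are hypothetical, and it assumes each block should begin at the line directly above a line containing `start` and stop at the next line containing `end`. If the start line is the very first line of the text, there is no line above it and the sketch returns nothing for that block.

```python
import re


def extract_blocks(text, start, end):
    """Return blocks running from the line above a `start` line through the next `end` line.

    Hypothetical helper for illustration only; `start` and `end` are treated
    as literal text, not as regular expressions.
    """
    pattern = re.compile(
        r"^[^\n]*\n"                              # the line directly above the start line
        + r"[^\n]*" + re.escape(start) + r".*?"   # the start line, then as little as possible
        + re.escape(end) + r"[^\n]*",             # up to and including the end line
        re.MULTILINE | re.DOTALL,
    )
    return pattern.findall(text)


sample = (
    "Basket1\n"
    "10 Pens I have in Packet1\n"
    "20 Books I have in Packet1\n"
    "End here\n"
    "Basket1\n"
    "10 apples I have in Packet2\n"
    "End here"
)

for block in extract_blocks(sample, "Pens I have", "End here"):
    print(block)
```

Using `re.escape` means characters such as `.` in the markers are matched literally, which is probably what is wanted when the markers come from user input.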
0

Since you need to print Basket1, your start_line must be Basket1. Since you also need "Pens I have" right after it, I have used that as the mid_line:

import re

start_line = "Basket1"
mid_line = "Pens I have"
end_line = "End here"
print_lines = False

start_index = None
start_data = None
temp = None

with open('temp1', 'r') as f:
    for index, line in enumerate(f):
        line = line.strip()

        # Search for start_line, and store its index and value
        if re.search(start_line, line):
            start_data = line
            start_index = index

        # If "Pens I have" is found directly under start_line, write start_line first
        if re.search(mid_line, line):
            if start_index is not None and start_index + 1 == index:
                temp = open("temp2", 'a')
                temp.write(start_data + "\n")
                print_lines = True
        if print_lines:
            temp.write(line + "\n")
        if re.search(end_line, line):
            print_lines = False
            # temp is still None if the pattern was never found
            if temp is not None:
                temp.close()
                temp = None
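
An alternative that avoids tracking indices is to remember only the previous line. This is a rough sketch rather than the answer's code: the function name `copy_blocks` and its parameters are hypothetical, and it assumes plain substring matching is enough for the markers.

```python
def copy_blocks(src_path, dst_path, start, end):
    """Append to dst_path each block running from the line above a line
    containing `start` through the next line containing `end`.

    Hypothetical helper for illustration; uses plain substring tests.
    """
    prev = None        # the line read just before the current one
    printing = False   # True while we are inside a block
    with open(src_path) as src, open(dst_path, "a") as dst:
        for line in src:
            line = line.rstrip("\n")
            if not printing and start in line:
                # Emit the line above the start line, if there is one
                if prev is not None:
                    dst.write(prev + "\n")
                printing = True
            if printing:
                dst.write(line + "\n")
                if end in line:
                    printing = False
            prev = line
```

For the files in the question this would be called as `copy_blocks('temp1', 'temp2', 'Pens I have', 'End here')`. Because both files are opened in a single `with` statement, they are closed even when no start line is ever found.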

I got this at temp.close(): NameError: name 'temp' is not defined – kitty

That happens when the pattern is not found. I have updated the solution to handle that case; please see the updated solution. – JkShaw