Python for loop lookahead

I have a Python for loop in which I need to look ahead one item to see whether an action needs to be performed before processing.

for line in file:
    if the start of the next line == "0":
        perform pre-processing
        ...
    continue with normal processing
    ...

Is there an easy way to do this in Python? My current approach is to buffer the file into an array, but that is not ideal because the file is rather large.


2010-11-16 Mike

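You can prefetch the next item of any iterable with this recipe:

from itertools import tee, islice, zip_longest  # izip_longest on Python 2

def get_next(some_iterable, window=1):
    # tee gives two independent iterators over the same underlying data
    items, nexts = tee(some_iterable, 2)
    # advance the second iterator by `window` items
    nexts = islice(nexts, window, None)
    # pad with None once the lookahead runs off the end
    return zip_longest(items, nexts)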

Usage example:

for line, next_line in get_next(myfile):
    if next_line and next_line.startswith("0"):
        ...  # do stuff

The code lets you pass a larger value for the window parameter if you want to look two or more lines ahead.
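To make the effect of window concrete, here is a small self-contained sketch that runs the recipe on a short string instead of a file. It assumes Python 3 (where izip_longest is named zip_longest), and the sample input is made up for illustration.

```python
from itertools import tee, islice, zip_longest

def get_next(some_iterable, window=1):
    # Same recipe as above, written for Python 3.
    items, nexts = tee(some_iterable, 2)
    nexts = islice(nexts, window, None)
    return zip_longest(items, nexts)

# Illustrative input: each character stands in for one line of a file.
pairs_one = list(get_next("abcd"))            # look one item ahead
pairs_two = list(get_next("abcd", window=2))  # look two items ahead
```

The trailing items are paired with None, which is why the usage example checks next_line for truthiness before calling startswith.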


2010-11-16 18:55:00 nosklo

Will this cause the file to be read twice, or does it buffer the line somehow? – Mike 2010-11-16 19:09:12

It reads the file only once. See 'teedataobject_getitem' in 'itertoolsmodule.c' (http://svn.python.org/projects/python/branches/release27-maint/Modules/itertoolsmodule.c) – 2010-11-16 19:14:44

Your 'get_next' is 'pairwise' in the itertools recipes (http://docs.python.org/library/itertools.html#recipes) – 2010-11-16 19:18:46
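One way to check the "reads only once" claim without reading the C source is to wrap the data in a generator that records every item it hands out. The counting_source helper and the reads list below are hypothetical names used only for this sketch.

```python
from itertools import tee

reads = []  # every item pulled from the underlying source is logged here

def counting_source(values):
    # Hypothetical stand-in for a file: records each item as it is produced.
    for value in values:
        reads.append(value)
        yield value

a, b = tee(counting_source(["x", "y", "z"]))
next(b, None)            # shift the second iterator one item ahead
pairs = list(zip(a, b))  # consume both iterators in lockstep
```

Each source item is logged once even though it is seen through both iterators, because tee buffers items that one iterator has consumed and the other has not.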

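You can keep a prev_line variable in which you store the previous line, and process it each time you read a new line, but only when your condition holds.

Something like: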

prev_line = None
for line in file:
    # here `line` plays the role of the "next line" relative to prev_line
    if prev_line is not None and line.startswith("0"):
        # perform pre-processing on prev_line
        ...
    # continue with normal processing
    ...
    prev_line = line

You may need to do extra processing for the last line, depending on your logic.
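As a rough sketch of what that extra handling of the last line might look like, the function below processes each line one step late and then flushes the final line after the loop. The function name and the (line, flag) return shape are assumptions made for illustration, not part of the answer.

```python
def flag_lines_before_zero(lines):
    """Return (line, next_line_starts_with_zero) pairs for every line."""
    result = []
    prev_line = None
    for line in lines:
        if prev_line is not None:
            # `line` is the lookahead for prev_line
            result.append((prev_line, line.startswith("0")))
        prev_line = line
    if prev_line is not None:
        # The last line has no successor, so it is handled after the loop.
        result.append((prev_line, False))
    return result
```

Only one line is held in memory at a time, so this works on a file object as well as on a list.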


2010-11-16 18:51:01

You only need to buffer one line.

prevLine = None
for line in file:
    if prevLine is not None:
        # test line as the lookahead, then act on prevLine
        ...
    prevLine = line


2010-11-16 18:52:31 unholysampler

-2

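I'm not a Python expert, but I'd imagine you need two loops for this. The first pass of the for loop should build a list of the indices at which you need to perform the special action. Then, on the second pass, you can compare the current index against that list to determine whether you need to perform the special action.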
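For comparison, a minimal sketch of this two-pass idea might look like the following. It assumes the input is a seekable file-like object (io.StringIO stands in for a real file here), and two_pass is a hypothetical name.

```python
import io

def two_pass(f):
    """Return (line, needs_preprocessing) pairs using two passes over f."""
    # Pass 1: remember the index of every line that precedes a "0" line.
    special = {i - 1 for i, line in enumerate(f)
               if i > 0 and line.startswith("0")}
    f.seek(0)  # rewind; this requires a seekable file
    # Pass 2: compare the current index against the recorded indices.
    return [(line.rstrip("\n"), i in special) for i, line in enumerate(f)]

sample = io.StringIO("A\nB\n0\nD\n")
flags = two_pass(sample)
```

This reads the file twice and cannot work on a non-seekable stream, which is the main cost compared with the single-pass approaches in the other answers.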


2010-11-16 18:54:27 Stephen

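It would be hard to find a less efficient way, although one probably exists :) Cheers – Morlock 2010-11-16 18:59:43

This should work too. I always prefer calling next over setting something = None for the first round.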

prev_line = next(the_file)
for current_line in the_file:
    if current_line.startswith('0'):
        do_stuff(prev_line)
    # continue with normal processing
    # ...
    prev_line = current_line
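Two details are worth spelling out with this pattern: next only works on an iterator (a file object is one, a list is not), and it raises StopIteration on empty input. The sketch below handles both; the function name and the choice to collect the matching lines into a list are assumptions for illustration.

```python
def lines_before_zero(source):
    """Collect every line that is immediately followed by a "0" line."""
    it = iter(source)  # no-op for a file object, required for a list
    try:
        prev_line = next(it)  # prime the loop with the first line
    except StopIteration:
        return []  # empty input: nothing to process
    found = []
    for current_line in it:
        if current_line.startswith("0"):
            found.append(prev_line)
        prev_line = current_line
    return found
```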


2010-11-16 19:15:00

Along the lines of nosklo's answer, I tend to use the following pattern:

The pairwise function from the excellent itertools recipes is ideal for this:

from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)  # use itertools.izip(a, b) on Python 2

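Using it in your code gives us:

for line, next_line in pairwise(file):
    if next_line.startswith("0"):
        pass  # perform pre-processing
        # ...
    pass  # continue with normal processing

Generally, for this type of processing (lookahead in an iterable), I tend to use a window function. Pairwise is a special case of a window of size 2.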
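The answer does not show its window function, so the following is only one possible sketch of a general sliding window, built on collections.deque. The name window and its size parameter are assumptions.

```python
from collections import deque
from itertools import islice

def window(iterable, size=2):
    """Yield overlapping tuples of `size` consecutive items."""
    it = iter(iterable)
    # Fill the first window; maxlen makes later appends drop the oldest item.
    win = deque(islice(it, size), maxlen=size)
    if len(win) == size:
        yield tuple(win)
    for item in it:
        win.append(item)
        yield tuple(win)

pairs = list(window("abcd", 2))    # same pairs that pairwise produces
triples = list(window("abcd", 3))  # look two items ahead
```

With size=2 this yields the same pairs as pairwise, and like pairwise it never yields the last item in the first position, so the final line still needs separate handling if it must be processed.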


2010-11-16 19:21:11

What makes you think your solution is better? What is wrong with my approach (using 'izip_longest' and 'islice')? My solution allows larger windows more easily. – nosklo 2010-11-16 20:08:59

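@nosklo: Try it. I'm sure you can do better than my link, since I was only explaining the concept. I could do better too. Note that it is not as trivial as you might think. I'm not sure what I did to upset you, but I hope we can produce the best single answer for SO. If you edit yours to get there, I will be more than happy. – 2010-11-16 20:21:43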

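+1 for not arguing. – Danosaure 2010-11-16 20:40:22

more_itertools has several lookahead tools. Here we demonstrate a few of those tools and an abstracted function for processing the lines of a file. Given: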

f = """\
A
B
C
0
D\
"""
lines = f.splitlines()

Code

import more_itertools as mit


def iter_lookahead(iterable, pred):
    # Option 1
    p = mit.peekable(iterable)
    try:
        while True:
            line = next(p)
            next_line = p.peek()
            if pred(next_line):
                # Do something
                pass
            else:
                print(line)
    except StopIteration:
        return


pred = lambda x: x.startswith("0")
iter_lookahead(lines, pred)

Output

A 
B 
0

Here are other options using the same library, including the pairwise and windowed tools mentioned by @Muhammad Alkarouri.

# Option 2
for line, next_line in mit.pairwise(lines):
    if pred(next_line):
        # Do something
        pass
    else:
        print(line)

# Option 3
for line, next_line in mit.windowed(lines, 2):
    if pred(next_line):
        # Do something
        pass
    else:
        print(line)

The latter options can be run independently or substituted for the logic in the prior function.
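If installing a third-party package is not an option, the peek idea can be approximated with the standard library alone. The Peekable class below is a simplified sketch written for this thread, not the more_itertools implementation, and lines_followed_by_zero is a hypothetical helper.

```python
class Peekable:
    """Iterator wrapper that can look at the next item without consuming it."""
    _MISSING = object()  # sentinel meaning "nothing cached" / "no default"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = self._MISSING

    def __iter__(self):
        return self

    def __next__(self):
        if self._cache is not self._MISSING:
            item, self._cache = self._cache, self._MISSING
            return item
        return next(self._it)

    def peek(self, default=_MISSING):
        if self._cache is self._MISSING:
            try:
                self._cache = next(self._it)
            except StopIteration:
                if default is self._MISSING:
                    raise
                return default  # nothing left: hand back the fallback
        return self._cache


def lines_followed_by_zero(lines):
    p = Peekable(lines)
    # peek("") returns "" after the last line, so startswith is always safe
    return [line for line in p if p.peek("").startswith("0")]
```

Passing a default to peek avoids the try/except around the loop, at the cost of choosing a fallback value that cannot be mistaken for real data.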


2017-08-08 01:01:54 pylang
