2017-04-19 60 views
0

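Python generator that returns groups of items

I'm trying to write a generator that returns several consecutive items from a list, with the window moving forward by one index each time, similar to a moving-average filter in DSP. For example, if I have the list: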

l = [1,2,3,4,5,6,7,8,9] 

I want output like this:

[(1,2,3),(2,3,4),(3,4,5),(4,5,6),(5,6,7),(6,7,8),(7,8,9)] 

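I wrote the code below, but it does not work with filters, generators, and the like. I'm also afraid it will run out of memory if I need to feed it a large list of words.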

The function gen:

def gen(enumobj, n):
    for idx, val in enumerate(enumobj):
        try:
            yield tuple(enumobj[i] for i in range(idx, idx + n))
        except:
            break

And the example code:

words = ['aaa','bb','c','dddddd','eeee','ff','g','h','iiiii','jjj','kk','lll','m','m','ooo'] 
w = filter(lambda x: len(x) > 1, words) 

# It's working with list 
print('\nList:') 
g = gen(words, 4) 
for i in g: print(i) 

# It's not working with filters/generators etc.
print('\nFilter:') 
g = gen(w, 4) 
for i in g: print(i) 

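The filter case does not produce anything. The code is bound to break there, because a filter object cannot be indexed. Of course, one answer is to force a list: list(w). However, I'm looking for better code for the function itself. How can I change it so that it also accepts filters and similar objects? I'm worried about holding a huge amount of data in a list in memory.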

Thanks

Answers

1

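With an iterator, you need to keep track of the values you have already read. A list of size n does the trick: append the next value to the list, and drop the oldest item after each yield.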

import itertools 

def gen(enumobj, n):
    # we need an iterator for the `next` call below. this creates
    # an iterator from an iterable such as a list, but leaves
    # iterators alone.
    enumobj = iter(enumobj)
    # cache the first n objects (fewer if iterator is exhausted)
    cache = list(itertools.islice(enumobj, n))
    # while we still have something in the cache...
    while cache:
        # yield a tuple snapshot, so later changes to the cache
        # do not alter groups that were already handed out
        yield tuple(cache)
        # drop stale item
        cache.pop(0)
        # try to get one new item, stopping when iterator is done
        try:
            cache.append(next(enumobj))
        except StopIteration:
            # use `pass` here instead to emit progressively smaller groups
            #pass
            # `break` stops as soon as fewer than `n` items remain
            break

words = ['aaa','bb','c','dddddd','eeee','ff','g','h','iiiii','jjj','kk','lll','m','m','ooo'] 
w = filter(lambda x: len(x) > 1, words) 

# It's working with list 
print('\nList:') 
g = gen(words, 4) 
for i in g: print(i) 

# now it works with iterators 
print('\nFilter:') 
g = gen(w, 4) 
for i in g: print(i) 
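
As a possible variation on the same idea, the cache can be a `collections.deque` with `maxlen=n`, which drops its oldest item automatically on each append. This is only a sketch of an alternative technique, not the code from the answer above: the name `sliding_window` is made up for illustration, `n` is assumed to be at least 1, and inputs shorter than `n` are assumed to yield nothing.

```python
from collections import deque
from itertools import islice

def sliding_window(iterable, n):
    """Yield overlapping n-tuples from any iterable (hypothetical helper)."""
    it = iter(iterable)
    # Fill the window with the first n items; maxlen=n makes the deque
    # discard its oldest item whenever a new one is appended.
    window = deque(islice(it, n), maxlen=n)
    if len(window) < n:
        return  # assumption: fewer than n items means no output at all
    yield tuple(window)
    for item in it:
        window.append(item)
        yield tuple(window)

words = ['aaa', 'bb', 'c', 'dddddd', 'eeee', 'ff']
for group in sliding_window(filter(lambda x: len(x) > 1, words), 3):
    print(group)
```

Only `n` items are held in memory at a time, so the input can be a filter object or any other lazy iterator.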

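Hi. I need to ask about a few things. Am I right in saying that the first two lines initialise the generator and only run once, when the generator starts working? It looks like the iteration needs them to slice the object the first time. In the line ending in 'data:', what condition does 'data' actually check? As I understand it, the function returns a value through 'yield data' and pauses, and then the code resumes at the next line. Is that right? Sorry, I'm new to writing generators. Thanks – Celdor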
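
A small sketch may help with the question above. It assumes nothing beyond standard Python generator behaviour; the names `demo` and `events` are made up for illustration. It shows that no code in the generator body runs until the first `next()` call, that execution pauses at `yield` and resumes on the following line, and that a list used as a loop condition is true while it is non-empty.

```python
events = []  # hypothetical log of what the generator body has executed

def demo():
    events.append("setup")        # runs once, on the first next() call
    data = [1, 2]
    while data:                   # a list is truthy while it is non-empty
        yield tuple(data)         # pause here and hand the value to the caller
        events.append("resumed")  # execution continues here on the next request
        data.pop(0)

g = demo()
print(events)    # [] : creating the generator runs none of its body
print(next(g))   # (1, 2)
print(events)    # ['setup']
print(list(g))   # [(2,)]
print(events)    # ['setup', 'resumed', 'resumed']
```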

I have changed 'pass' to 'break'. Otherwise the function returns extra items of length n-1, n-2, ..., 1. – Celdor

This could also be a problem if the iterator starts out with fewer than 'n' items. You could add a check after the 'islice' and return immediately. – tdelaney
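
The check suggested in the last comment could look roughly like this. It is a sketch rather than the answerer's own code: the name `gen_checked` is hypothetical, and `n` is assumed to be at least 1.

```python
import itertools

def gen_checked(enumobj, n):
    """Sliding groups of exactly n items; yields nothing for short input."""
    it = iter(enumobj)
    cache = list(itertools.islice(it, n))
    # Guard: if the source had fewer than n items, stop before yielding.
    if len(cache) < n:
        return
    while True:
        yield tuple(cache)
        cache.pop(0)                 # drop the oldest item
        try:
            cache.append(next(it))   # pull in one new item
        except StopIteration:
            return                   # source exhausted, no partial groups

print(list(gen_checked(['a', 'b'], 4)))
print(list(gen_checked(range(5), 3)))
```

With the guard in place, the two-item input produces an empty result instead of a single group shorter than `n`.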