Memory error when calling a function with large arguments

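This is part of a program that creates a gibberish sentence, given a list of words that can start a sentence (seedBank) and a dictionary of word pairs (pairs) that records, from a text file, which words follow which.

For example, a text.txt file containing 'This is a cat. He is a dog.' would mean we would input the following: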

seedBank = ['This', 'He'] 

pairs = { 'This':['is'],'is':['a','a'],'a':['cat','dog'],'He':['is'] }

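The function then uses these inputs to create a randomly generated sentence that makes vague sense, because it follows a semi-grammatically-correct format.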

import random

def gibberish_sentence(seedBank, pairs):
    gibSentence = []
    gibSentence.append(random.choice(seedBank))  # random seed
    x = gibSentence[0]
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        gibSentence.append(y)  # the chosen word is added to the sentence
        x = y  # key x is reset to y
    return ' '.join(gibSentence)  # string

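The problem:

This program works fine for small inputs like the one above with a fixed random.seed(value), but it raises a memory error when given a very large set of inputs (seedBank and pairs). My question is: what is wrong with this program that could cause trouble when it handles larger arguments?

Note that these arguments are not actually that large. I don't have the text document, but it won't be so big that there isn't enough memory.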

Error output:

[screenshot of the error; image not available]

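Thank you very much.

Solved: Thanks! The problem has been solved. It was in fact a missing condition that caused it: the loop kept running through the entire text instead of ending when it reached a word with a full stop or question mark. Essentially, this caused it to overload memory. Thanks everyone for helping!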
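The fix described above can be sketched as follows. This is a minimal illustration, not the asker's actual code: it assumes the words in pairs keep their trailing punctuation (for example 'dog.'), and the names SENTENCE_ENDINGS and gibberish_sentence_until_punctuation are made up for this example.

```python
import random

# Assumption: words keep their trailing punctuation, e.g. 'dog.'
SENTENCE_ENDINGS = ('.', '?', '!')

def gibberish_sentence_until_punctuation(seedBank, pairs):
    x = random.choice(seedBank)  # random seed word
    words = [x]
    # Stop as soon as the current word ends a sentence,
    # or when it has no recorded followers.
    while not x.endswith(SENTENCE_ENDINGS) and x in pairs:
        x = random.choice(pairs[x])
        words.append(x)
    return ' '.join(words)
```

With this check, a word such as 'dog.' ends the sentence even if it also appears as a key in pairs, so the loop no longer wanders through the whole text.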

Source

2015-03-19 Finn

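How big is the text file? On the order of KB? MB? GB? Also, I think we need to see the calling code. I'd bet you are accidentally making copies that take up a lot of memory. – 2015-03-19 06:36:38

Unfortunately this is an automated testing system, but I have emailed the person who runs the tests so that I can check it manually. I think the problem may be related to the infinite loop mentioned below, but I will take this into account. The text file is only 9.5KB, so something is very wrong! – Finn 2015-03-19 06:53:03

Thanks! The problem has been solved. It was in fact a missing condition that caused it: the loop kept running through the entire text instead of ending when it reached a word with a full stop, question mark, etc. Essentially, this caused it to overload memory. Thanks everyone here for helping! – Finn 2015-03-19 08:48:02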

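Without the actual pairs it's hard to say, but there is the possibility of an infinite loop if all the words refer to each other at some point: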

pairs = { 'someone':['thinks'],'thinks':['that','how'],'that':['someone','anyone'],'how':['someone'], 'anyone': ['thinks'] }

will never finish.
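One way to check whether a given pairs dictionary contains such a loop is a depth-first search over the "word follows word" graph. The sketch below is only an illustration and has_cycle is a hypothetical helper name. A cycle does not guarantee that every run loops forever, since a random choice may still leave it, but it does mean the sentence length has no upper bound.

```python
def has_cycle(pairs):
    """Return True if following words in `pairs` can lead back to an earlier word."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    for start in pairs:
        if state.get(start, UNSEEN) != UNSEEN:
            continue
        state[start] = IN_PROGRESS
        # Each stack entry holds a word and an iterator over its followers.
        stack = [(start, iter(pairs.get(start, ())))]
        while stack:
            word, followers = stack[-1]
            for nxt in followers:
                nxt_state = state.get(nxt, UNSEEN)
                if nxt_state == IN_PROGRESS:
                    return True  # nxt is on the current path: a cycle
                if nxt_state == UNSEEN:
                    state[nxt] = IN_PROGRESS
                    stack.append((nxt, iter(pairs.get(nxt, ()))))
                    break
            else:
                # All followers explored without finding a cycle.
                state[word] = DONE
                stack.pop()
    return False
```

For the dictionary above this returns True, while for the small example in the question it returns False.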

Source

2015-03-19 06:37:48

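That's a valid point that I hadn't thought of. Since the function is only meant to generate one sentence, it should finish when the word has a . or ? or ! as its last character, so I may need to add a test case for that. Unfortunately this is the output of an automated testing system and I don't have the word_pairs_bank myself :/ Thanks for the reply! – Finn 2015-03-19 06:51:50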

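Joining a list of strings isn't the worst approach, but it isn't the best in terms of space efficiency.

Consider using something like StringIO (untested, of course):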

from io import StringIO  # Python 3; on Python 2 this was cStringIO.StringIO
import random

def gibberish_sentence(seedBank, pairs):
    seed = random.choice(seedBank)
    gibSentence = StringIO()
    gibSentence.write(seed)  # random seed
    x = seed
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        gibSentence.write(' ')  # separator goes before each word, so no trailing space
        gibSentence.write(y)  # the chosen word is added to the sentence
        x = y  # key x is reset to y
    return gibSentence.getvalue()  # string

Here's a comparison of different string concatenation methods, in terms of operations per second and memory footprint.

Source

2015-03-19 06:43:58 jedwards

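Hi, thanks for your response! Sorry, I'm very new to Python syntax, so I'm really just learning the ropes. I don't think this particular problem is caused by the concatenation, but thanks for the efficiency tip! – Finn 2015-03-19 06:49:08

As Tim Pietzcker said, if there is a cycle in pairs, your code can loop forever. Here is a minimal example: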

>>> seedBank = ['and'] 
>>> pairs = {'and': ['on'], 'on': ['and']} 
>>> gibberish_sentence(seedBank, pairs) # just keeps going

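You can ensure that your generated sentences (eventually) end by modifying the pairs dictionary so that it contains a sentinel value wherever a word occurs as the last word of a sentence. For example, for a source text like "You and me and the dog.":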

seedBank = ['You'] 

pairs = { 
    'You': ['and'], 
    'and': ['me', 'the'], 
    'me': ['and'], 
    'the': ['dog'], 
    'dog': ['.'], 
}

...and add a check for the sentinel in gibberish_sentence():

def gibberish_sentence(seedBank, pairs):
    gibSentence = []
    gibSentence.append(random.choice(seedBank))  # random seed
    x = gibSentence[0]
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        if y == '.':  # sentinel reached: the sentence is finished
            break
        gibSentence.append(y)  # the chosen word is added to the sentence
        x = y  # key x is reset to y
    return ' '.join(gibSentence)  # string

...which gives the sentence a chance to terminate:

>>> gibberish_sentence(seedBank, pairs) 
'You and the dog' 
>>> gibberish_sentence(seedBank, pairs) 
'You and me and me and me and me and me and the dog' 
>>> gibberish_sentence(seedBank, pairs) 
'You and me and the dog'
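The answer does not show how such a pairs dictionary would be built. One possible sketch, assuming the source text is tokenized so that '.', '?' and '!' become separate tokens, is below; build_pairs_with_sentinel is a hypothetical helper, not part of the original program.

```python
import re

SENTINELS = {'.', '?', '!'}

def build_pairs_with_sentinel(text):
    """Hypothetical helper: build (seedBank, pairs) with punctuation as sentinels."""
    # Words and sentence-ending punctuation become separate tokens.
    tokens = re.findall(r"[\w']+|[.?!]", text)
    seed_bank, pairs = [], {}
    at_sentence_start = True
    for current, following in zip(tokens, tokens[1:] + [None]):
        if current in SENTINELS:
            at_sentence_start = True  # the next word starts a new sentence
            continue
        if at_sentence_start:
            seed_bank.append(current)
            at_sentence_start = False
        if following is not None:
            pairs.setdefault(current, []).append(following)
    return seed_bank, pairs

seedBank, pairs = build_pairs_with_sentinel("You and me and the dog.")
```

For this text the result matches the seedBank and pairs shown above. Note that the check in gibberish_sentence() only tests for '.', so it would need to be widened if '?' and '!' are also used as sentinels.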

Source

2015-03-19 06:58:56

Building a list can be avoided by using a generator, which is very memory-efficient.

import random

def gibberish_sentence(seedBank, pairs):
    x = random.choice(seedBank)  # random seed
    yield x
    while pairs.get(x) is not None:  # loop while x is a key in the dictionary
        y = random.choice(pairs.get(x))  # random follower of key x
        yield y
        x = y  # key x is reset to y

print(' '.join(gibberish_sentence(seedBank, pairs)))  # string

Or, if the function has to return a string, the generator can be built inside it like this:

def gibberish_sentence(seedBank, pairs):
    def words():
        x = random.choice(seedBank)  # random seed
        yield x
        while pairs.get(x) is not None:  # loop while x is a key in the dictionary
            y = random.choice(pairs.get(x))  # random follower of key x
            yield y
            x = y  # key x is reset to y
    return ' '.join(words())  # string
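Note that ' '.join() consumes the whole generator before it returns, so a generator on its own does not protect against a cycle in pairs. A possible safeguard, sketched here with a hypothetical max_words parameter that is not in the original code, is to cap the generator with itertools.islice:

```python
import random
from itertools import islice

def gibberish_words(seedBank, pairs):
    x = random.choice(seedBank)  # random seed word
    yield x
    while x in pairs:  # loop while x has recorded followers
        x = random.choice(pairs[x])
        yield x

def gibberish_sentence_limited(seedBank, pairs, max_words=50):
    # islice stops pulling words after max_words, even if the
    # underlying generator would otherwise never finish.
    return ' '.join(islice(gibberish_words(seedBank, pairs), max_words))
```

With the cyclic example {'and': ['on'], 'on': ['and']}, this returns a sentence of exactly max_words words instead of running until memory is exhausted.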

Source

2015-03-19 07:17:33
