2016-12-06 96 views
0

我有一个需要大量单词的循环,将每个单词分解为字母并将它们附加到一个大列表中。减少循环的运行时间并提高效率

然后我检查出现最信,如果未出现在字符串中,我把它存储在具有两个空间的列表:

list[0] =字母出现的最

list[1] =多少次出现

这个循环是超级低效。它可以工作,但返回一个值需要大约25-30秒。在此之前,它会继续下去,不会返回任何值。

如何提高我编写的代码的效率?

def choose_letter(words, pattern): 
    list_of_letters = [] 
    first_letter = [] # first spot is the letter, second is how many times it appears 
    second_letter =[] # first spot is letter, second how many times it appears 
    max_appearances = ["letter", 0] 
    for i in range(len(words)): # splits up every word into letters 
     list_of_letters.append(list(words[i])) 
    list_of_letters = sum(list_of_letters, []) # concatenates the lists within the list 
    first_letter = list_of_letters.count(0) 
    for j in list_of_letters: 
     second_letter = list_of_letters.count(j) 
     if second_letter >= max_appearances[1] and j not in pattern: 
      max_appearances[0] = j 
      max_appearances[1] = second_letter 
     else: 
      list_of_letters.remove(j) 
    return max_appearances[0] 
+2

这可能是http://codereview.stackexchange.com/ – Blorgbeard

+1

更好的候选人,当你移到codereview时,他们会要求查看你对该代码运行的分析器的输出。 –

+2

看起来像['collections.Counter']的工作(https://docs.python.org/3.5/library/collections.html#collections.Counter)。 – bereal

回答

0

使其更快的一种方法是选择更好的数据结构。下面是使用collections.Counter一个例子:

from collections import Counter 

def choose_letter(words, pattern): 
    pattern = set(pattern) 
    letters = (letter 
       for word in words 
       for letter in word 
       if letter not in pattern) 
    letters = Counter(letters) 
    return letters.most_common(1)[0][0] 


mywords = 'a man a plan a canal panama'.split() 
vowels = 'aeiou' 
assert choose_letter(mywords, vowels) == 'n' 

下面是一个使用collections.defaultdict

from collections import defaultdict 

def choose_letter(words, pattern): 
    pattern = set(pattern) 
    counts = defaultdict(int) 
    for word in words: 
     for letter in word: 
      if letter not in pattern: 
       counts[letter] += 1 
    return max(counts, key=counts.get) 

mywords = 'a man a plan a canal panama'.split() 
vowels = 'aeiou' 
assert choose_letter(mywords, vowels) == 'n' 
0

你做了很多循环&,你不需要操作列表。每当您执行countnot in时,您都会强制程序循环查看列表/字符串以找到要查找的内容。从列表中删除所有这些项目也非常昂贵。一个更优雅的解决方案是循环访问一次单词/字母列表,然后使用字典来计算每个字母的出现次数。从那里,你有一个字符/字符对&你可以从那里得到键/值,排序清单&看前两个值。

from collections import defaultdict 
from itertools import chain 

def choose_letter(words, pattern=""): 
    count_dict = defaultdict(int) # all unknown values default to 0 
    for c in chain(*words): 
     count_dict[c] += 1 
    # you could replace this "not in" with something more efficient 
    filtered = [(char, count) for (char,count) in count_dict.items() if char not in pattern] 
    filtered.sort(lambda a,b: -cmp(a[0], b[0])) 
    print filtered 
    return filtered[0][0] 

如果你不希望深入论证拆包,itertools & defaultdicts,你可以只说:

count_dict = {} 
for word in words: 
    for char in word: 
     count_dict[char] = count_dict.get(char, 0) + 1 

...如果你不想尝试挖掘到的参数拆包然而。