2016-11-16 82 views

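Counting the number of characters in a list in Python

I'm a beginner looking for some help. I'm trying to write a Python program that reads a .txt file into a list and reports how many words there are of different character lengths. For example: "There are five words in the list with three or fewer characters."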

This is what I have so far:

def count_lengths(text): 

    up_to_three = 0 
    four_or_five = 0 
    six_to_nine = 0 
    ten_or_more = 0 
    newtext = text.split(' ') 

def main(): 

    filename = "gb.txt" 
    text = readfile(filename) 
    word_lengths = count_lengths(text) 
    print(word_lengths) 

After converting the .txt file into a list, I'm pretty much lost. Can someone help me with this?

Answers


Perhaps the simplest way is to use Counter:

from collections import Counter

text = 'Some text from your file that you have read into this variable'

# All word lengths, sorted ascending
print(sorted(map(len, text.split())))

word_lengths = {}

# Accumulate the number of words, shortest length first
total = 0
for k, v in sorted(Counter(map(len, text.split())).items()):
    total += v
    word_lengths[k] = total

print(word_lengths)
# {3: 1, 4: 11, 8: 12}
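
The same running total can also be written with itertools.accumulate. This is only a sketch of the idea above, not a different method: the function name cumulative_word_lengths is made up for illustration, and the printed key order assumes Python 3.7 or later, where dicts keep insertion order.

```python
from collections import Counter
from itertools import accumulate


def cumulative_word_lengths(text):
    """Map each word length to the number of words that long or shorter."""
    counts = Counter(map(len, text.split()))
    lengths = sorted(counts)  # distinct word lengths, ascending
    # Running total of the per-length counts, in the same ascending order
    running_totals = accumulate(counts[length] for length in lengths)
    return dict(zip(lengths, running_totals))


text = 'Some text from your file that you have read into this variable'
print(cumulative_word_lengths(text))
# {3: 1, 4: 11, 8: 12}
```

Reading the result: 1 word has 3 characters or fewer, 11 words have 4 or fewer, and all 12 have 8 or fewer.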

You know better :P You need to sort first :P –


@JoranBeasley Thanks. Sorting added. – Marcin


Using collections.Counter gives you a dict-like object whose keys are word lengths and whose values are the number of words of each length.

>>> s = 'hello this is a sentence with words of varying lengths' 

First, collect all the word lengths:

>>> lengths = [len(word) for word in s.split()] 
>>> lengths 
[5, 4, 2, 1, 8, 4, 5, 2, 7, 7] 

Then count how many words in the string above occur at each length:

>>> from collections import Counter 
>>> word_lengths = Counter(lengths) 
>>> word_lengths 
Counter({2: 2, 4: 2, 5: 2, 7: 2, 1: 1, 8: 1}) 
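
To make it clearer how the resulting Counter can be read, here is a small standalone sketch using the same example string. It only relies on standard Counter behaviour: indexing by a key returns its count, and a key that was never seen returns 0 instead of raising KeyError.

```python
from collections import Counter

s = 'hello this is a sentence with words of varying lengths'
word_lengths = Counter(len(word) for word in s.split())

print(word_lengths[2])             # number of two-character words: 2
print(word_lengths[10])            # no ten-character words, so this is 0
print(sum(word_lengths.values()))  # total number of words: 10
```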

Edit: since you want a cumulative total, try this:

def count_lengths(text, n): 
    lengths = [len(word) for word in text.split()] 
    word_lengths = Counter(lengths) 
    # count the total number of words with lengths less than or equal to n 
    n_and_less_chars = sum(value for key, value in word_lengths.items() if key <= n) 
    return n_and_less_chars 

Try it out:

>>> print(count_lengths(s, 5)) 
7 

Looking at the example string above, we can see that there are indeed 7 words with 5 characters or fewer.
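
Putting this together for the four ranges named in the question, one possible sketch is below. The dictionary keys reuse the variable names from the question. The readfile helper is hypothetical, since the question calls it without showing it; the UTF-8 encoding is also an assumption. Note that text.split() leaves punctuation attached to words, so "end." counts as four characters.

```python
from collections import Counter


def readfile(filename):
    # Hypothetical helper: the question calls readfile() but never shows it.
    # UTF-8 is an assumption about the file's encoding.
    with open(filename, encoding='utf-8') as f:
        return f.read()


def count_lengths(text):
    """Count the words that fall into each of the question's four ranges."""
    word_lengths = Counter(len(word) for word in text.split())
    buckets = {
        'up_to_three': 0,
        'four_or_five': 0,
        'six_to_nine': 0,
        'ten_or_more': 0,
    }
    for length, count in word_lengths.items():
        if length <= 3:
            buckets['up_to_three'] += count
        elif length <= 5:
            buckets['four_or_five'] += count
        elif length <= 9:
            buckets['six_to_nine'] += count
        else:
            buckets['ten_or_more'] += count
    return buckets


s = 'hello this is a sentence with words of varying lengths'
print(count_lengths(s))
# {'up_to_three': 3, 'four_or_five': 4, 'six_to_nine': 3, 'ten_or_more': 0}
```

With a file on disk, main() from the question could then call count_lengths(readfile("gb.txt")) and print the result.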


The OP needs a cumulative total, i.e. there are 5 words of length less than or equal to 5. – Marcin