2014-11-24 54 views

Answer

4

Just pass the whole string to a collections.Counter() object and it will count every character.

It may be more efficient to do this line by line, so that you don't need as much memory:

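from collections import Counter

counts = Counter()

# Read line by line so the whole file never has to sit in memory at once.
with open('inputtextfilename') as infh:
    for line in infh:
        # Drop the trailing newline, then count each remaining character.
        counts.update(line.strip())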

The str.strip() call removes any leading and trailing whitespace (such as the newline).
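To see why the strip() call matters, here is a small sketch comparing the counts with and without it. The `line` value is a made-up example, not taken from your file:

```python
from collections import Counter

# Hypothetical line as it would come out of a file, newline included.
line = "TCCA\n"

with_newline = Counter(line)
without_newline = Counter(line.strip())

print(with_newline["\n"])     # 1: the newline is counted like any other character
print(without_newline["\n"])  # 0: a Counter returns 0 for keys it has not seen
```

Note that strip() only trims the ends of the line, so whitespace in the middle of a line would still be counted.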

A quick demo using your sample input:

>>> from collections import Counter
>>> sample = '''\
... TCCATCTACT
... GGGCCTTCCT
... TCCATCTACC
... '''.splitlines(True)
>>> counts = Counter()
>>> for line in sample:
...     counts.update(line.strip())
...
>>> for letter, count in counts.most_common():
...     print(letter, count)
...
C 13
T 10
A 4
G 3

I used the Counter.most_common() method to get a list of (letter, count) pairs sorted from most to least common.
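If the input is small enough to hold in memory, the one-shot approach of passing the whole string to Counter can be sketched as follows. The name `count_chars` is only something I picked for illustration, and this version skips all whitespace rather than just the line endings:

```python
from collections import Counter


def count_chars(text):
    """Count every non-whitespace character in text (illustrative helper)."""
    return Counter(ch for ch in text if not ch.isspace())


counts = count_chars("TCCATCTACT\nGGGCCTTCCT\nTCCATCTACC\n")

# most_common(n) limits the result to the n most frequent entries.
print(counts.most_common(2))  # [('C', 13), ('T', 10)]
```

Passing a number to most_common() is handy when you only care about the top few characters.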

+0

I'm getting an error: ImportError: cannot import name Counter – MedaUser 2014-11-24 18:55:47

+0

Never mind, it's solved. Thanks – MedaUser 2014-11-24 18:58:36