2013-05-06 98 views
2

Python dictionary count of unique values

I have a problem counting the distinct values for each key in Python.

I have a dictionary d like

[{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}] 

I need to print the number of distinct values for each key separately.

That means I want to print

abc 3 
xyz 1 
pqr 4 

Please help.

Thanks

+2

Do you mean you have a list of dictionaries? Or was it copied incorrectly? – thegrinner 2013-05-06 20:05:39

+1

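This is not a dictionary; at best it is a list of dictionaries (each containing just one key/value pair) - really? What kind of data structure is this? I guess it is actually `[{"abc": "movies"}, ...`, right? – 2013-05-06 20:05:55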

+0

@TimPietzcker That's right. Sorry, that was a mistake in how I represented it. – user1189851 2013-05-06 20:06:55

Answers

7

Use a collections.Counter() instance, together with some chaining:

from collections import Counter 
from itertools import chain 

counts = Counter(chain.from_iterable(e.keys() for e in d)) 

This ensures that dictionaries with more than one key in your input list are counted correctly.

Demo:

>>> from collections import Counter 
>>> from itertools import chain 
>>> d = [{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}] 
>>> Counter(chain.from_iterable(e.keys() for e in d))
Counter({'pqr': 5, 'abc': 3, 'xyz': 1})

Or with multiple keys in the input dictionaries:

>>> d = [{"abc":"movies", 'xyz': 'music', 'pqr': 'music'}, {"abc": "sports", 'pqr': 'movies'}, {"abc": "music", 'pqr': 'sports'}, {"pqr":"news"}, {"pqr":"sports"}] 
>>> Counter(chain.from_iterable(e.keys() for e in d))
Counter({'pqr': 5, 'abc': 3, 'xyz': 1})

Counter() has additional, helpful functionality, such as the .most_common() method, which lists the elements ordered by their counts, highest first:

for key, count in counts.most_common():
    print('{}: {}'.format(key, count))

# prints
# pqr: 5
# abc: 3
# xyz: 1
+0

Note that the [Counter](http://docs.python.org/2/library/collections.html#collections.Counter) class was introduced in Python 2.7. There is a [backport](http://code.activestate.com/recipes/576611-counter-class/). I guess you [knew about this](http://stackoverflow.com/a/13311111/566644), Martijn. – 2013-05-06 20:14:50

+0

@LauritzV.Thaulow: Otherwise available as a [backport for 2.5 and 2.6](http://code.activestate.com/recipes/576611-counter-class/). – 2013-05-06 20:17:20

+0

...or you could use a `defaultdict` with int. – 2013-05-06 20:18:23
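The `defaultdict` suggestion in the comment above could look roughly like the sketch below. It assumes the same input list `d` as in the question; the names `counts` and `entry` are only illustrative. Like the Counter version, it counts how often each key occurs, not how many distinct values it has.

```python
from collections import defaultdict

d = [{"abc": "movies"}, {"abc": "sports"}, {"abc": "music"},
     {"xyz": "music"}, {"pqr": "music"}, {"pqr": "movies"},
     {"pqr": "sports"}, {"pqr": "news"}, {"pqr": "sports"}]

# Missing keys start at int() == 0, so no existence check is needed.
counts = defaultdict(int)
for entry in d:
    for key in entry:
        counts[key] += 1

# Sort the keys so the output order is predictable.
for key in sorted(counts):
    print('{}: {}'.format(key, counts[key]))
```

This works on Python versions that predate Counter, since `defaultdict` has been available since Python 2.5.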

2
>>> d = [{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, 
... {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, 
... {"pqr":"sports"}] 
>>> from collections import Counter 
>>> counts = Counter(key for dic in d for key in dic.keys()) 
>>> counts 
Counter({'pqr': 5, 'abc': 3, 'xyz': 1}) 
>>> for key in counts: 
...  print (key, counts[key]) 
... 
xyz 1 
abc 3 
pqr 5 
3

What you're describing, a list with multiple values for each key, would be better visualized as something like this:

{'abc': ['movies', 'sports', 'music'], 
'xyz': ['music'], 
'pqr': ['music', 'movies', 'sports', 'news'] 
} 

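In that case, you have to do a bit more work to insert:

  1. Look up the key to see whether it already exists
    • If it does not exist, create a new key with the value [] (an empty list)
  2. Retrieve the value (the list associated with the key)
  3. Use if value in to check whether the value being added already exists in the list
  4. If the new value is not in the list, .append() it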
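The insertion steps above can be sketched as follows. This is only one possible reading of them: the helper name `add_value` is hypothetical, and `d` is assumed to be the list of dictionaries from the question.

```python
def add_value(myDict, key, value):
    """Append value to the list stored under key, skipping duplicates."""
    # Steps 1 and 2: look up the key, creating an empty list if it is
    # missing, and retrieve the list associated with it.
    values = myDict.setdefault(key, [])
    # Steps 3 and 4: append only if the value is not already in the list.
    if value not in values:
        values.append(value)


d = [{"abc": "movies"}, {"abc": "sports"}, {"abc": "music"},
     {"xyz": "music"}, {"pqr": "music"}, {"pqr": "movies"},
     {"pqr": "sports"}, {"pqr": "news"}, {"pqr": "sports"}]

myDict = {}
for entry in d:
    for key, value in entry.items():
        add_value(myDict, key, value)

print(myDict)
```

`dict.setdefault()` folds the "create if missing" and "retrieve" steps into one call; an explicit `if key not in myDict` check would work just as well.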

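This also leads to a simple way of counting the number of elements stored under each key:

# Pseudo-code: assumes myDict maps each key to its list of distinct values
for myKey in myDict.keys():
    print("{0}: {1}".format(myKey, len(myDict[myKey])))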
1

Use collections.Counter. Assuming you have a list of one-item dictionaries...

from collections import Counter 
listOfDictionaries = [{'abc':'movies'}, {'abc':'sports'}, {'abc':'music'}, 
    {'xyz':'music'}, {'pqr':'music'}, {'pqr':'movies'}, 
    {'pqr':'sports'}, {'pqr':'news'}, {'pqr':'sports'}] 
# list(item)[0] is the single key of each one-item dictionary
Counter(list(item)[0] for item in listOfDictionaries)
4

There is no need to use Counter. You can do it this way:

# input dictionary 
d=[{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}] 

# fetch keys 
b=[j[0] for i in d for j in i.items()] 

# print output 
for k in set(b):
    print("{0}: {1}".format(k, b.count(k)))
+0

This is faster than using Counter. – akashdeep 2013-05-08 08:49:36

+0

Yes, Counter has some performance issues http://stackoverflow.com/questions/27801945/surprising-results-with-python-timeit-counter-vs-defaultdict-vs-dict – sashab 2015-07-09 10:45:10

1

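This builds on @akashdeep's solution, which uses a set but gives a wrong result because it does not account for the "distinct" requirement mentioned in the question (pqr should be 4, not 5).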

# dictionary 
d=[{"abc":"movies"}, {"abc": "sports"}, {"abc": "music"}, {"xyz": "music"}, {"pqr":"music"}, {"pqr":"movies"},{"pqr":"sports"}, {"pqr":"news"}, {"pqr":"sports"}] 

# merged dictionary 
c = {} 
for i in d:
    for k, v in i.items():
        try:
            c[k].append(v)
        except KeyError:
            c[k] = [v]

# counting and printing
for k, v in c.items():
    print("{0}: {1}".format(k, len(set(v))))

This gives the correct result:

xyz: 1 
abc: 3 
pqr: 4
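
A more compact variant of the same idea, offered as a sketch and not as part of the answer above, collects the values into sets directly with `defaultdict(set)`. It assumes the values are hashable, as the strings in the question are; the names `distinct` and `entry` are illustrative.

```python
from collections import defaultdict

d = [{"abc": "movies"}, {"abc": "sports"}, {"abc": "music"},
     {"xyz": "music"}, {"pqr": "music"}, {"pqr": "movies"},
     {"pqr": "sports"}, {"pqr": "news"}, {"pqr": "sports"}]

# Each key maps to a set, which discards repeated values automatically.
distinct = defaultdict(set)
for entry in d:
    for key, value in entry.items():
        distinct[key].add(value)

# The size of each set is the number of distinct values for that key.
for key, values in distinct.items():
    print('{}: {}'.format(key, len(values)))
```

Because duplicates never enter the sets, the second pass only needs `len()` and no `set()` conversion.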