2015-12-14 231 views

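Group by dictionary keys in a nested list

I'm having a hard time grouping id(s) by dictionary keys that live inside a nested list, in Python.

The code below works for me; it groups the id and st values by location: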

import itertools

null = ''
dataset={"users": [ 
    {"id": 20, "loc": "Chicago", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]}, 
    {"id": 21, "loc": "Frankfurt", "st":"4", "sectors": [{"sname": null}]}, 
    {"id": 22, "loc": "Berlin", "st":"6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"},{"sname": "Agri"}]}, 
    {"id": 23, "loc": "Chicago", "st":"2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]}, 
    {"id": 24, "loc": "Bern", "st":"1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]}, 
    {"id": 25, "loc": "Bern", "st":"4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]} 
    ]} 

byloc = lambda x: x['loc'] 

it = (
    (loc, list(user_grp)) 
    for loc, user_grp in itertools.groupby(
     sorted(dataset['users'], key=byloc), key=byloc 
    ) 
) 
fs_loc = [ 
    {'loc': loc, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)} 
    for loc, grp in it 
] 

print(fs_loc) 

fs_loc now gives me a list of ids and their respective st values (along with the id count), as follows:

[ 
    {"loc": "Chicago","count":2,"ids": [{"id":"20","st":"4"}, {"id":"23","st":"2"}]}, 
    {"loc": "Bern","count":2,"ids": [{"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"loc": "Frankfurt","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"loc": "Berlin","count":1,"ids": [{"id":"22","st":"6"}]}  
] 

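Now I'm trying to group by sname from sectors. I tried the code further down and it fails; I can't figure out how to get the result below.

Desired result: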

[ 
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"}, {"id":"24","st":"1"}, {"id":"25","st":"4"}]}, 
    {"sname": "Manufacturing","count":2,"ids": [{"id":"20","st":"4"}, {"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":3,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"25","st":"4"}]}, 
    {"sname": "Agri","count":4,"ids": [{"id":"22","st":"6"},{"id":"23","st":"2"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}  
] 

I tried the code below, but it doesn't work for dictionary keys inside a nested list:

bysname = lambda x: x['sectors'][0]['sname'] 

it = (
    (sname, list(user_grp)) 
    for sname, user_grp in itertools.groupby(
     sorted(dataset['users'], key=bysname), key=bysname 
    ) 
) 
fs_sname= [ 
    {'sname': sname, 'ids': [{'id':x['id'],'st':x['st']} for x in grp], 'count': len(grp)} 
    for sname, grp in it 
] 

print(fs_sname) 

Edit: the code above runs, but it only considers the first item in each sectors list. That is, it gives the following result:

[ 
    {"sname": "","count":1,"ids": [{"id":"21","st":"4"}]}, 
    {"sname": "Manufacturing","count":1,"ids": [{"id":"22","st":"6"}]}, 
    {"sname": "Banking","count":1,"ids": [{"id":"23","st":"2"}]}, 
    {"sname": "Retail","count":3,"ids": [{"id":"20","st":"4"},{"id":"24","st":"1"},{"id":"25","st":"4"}]}  
] 

How can I achieve the desired result?

I don't understand what result you expect when adapting the summarize function. Can you provide a minimal example? – timgeb

I've added it. Could you check the highlighted part? –

What is 'null'? – timgeb

Answers

1

This should work as required:

allsectornames = set(sec['sname'] for record in dataset['users'] for sec in record['sectors']) 

summarize = lambda record: record[ 'id' ] # customize this to return whatever details you want (even just return the whole record itself if you prefer) 

result = [ 
    { 
     'sname':sname, 
     'count':len(matches), 
     'matches':[ summarize(match) for match in matches ] 
    } 
    for sname in allsectornames 
    for matches in [[ 
     record for record in dataset['users'] if sname in [ sec['sname'] for sec in record['sectors'] ] 
    ]] 
] 

print(result) 
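
The answer above rescans every user once per sector name. For comparison, here is a minimal single-pass sketch using collections.defaultdict. The function name group_by_sname and its skip_empty parameter are hypothetical, not part of the original answer, and dropping the empty ('') sector name is an assumption based on the desired result in the question. Output order follows first appearance in the data, which relies on dict insertion order (Python 3.7+).

```python
from collections import defaultdict

null = ''  # same placeholder the question uses for a missing sector name
dataset = {"users": [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]}


def group_by_sname(users, skip_empty=True):
    """Group users under every sector name they list (hypothetical helper)."""
    groups = defaultdict(list)
    for user in users:
        seen = set()  # avoid counting a user twice if a sector name repeats
        for sector in user["sectors"]:
            sname = sector["sname"]
            if skip_empty and not sname:
                continue  # assumption: empty names are not wanted in the result
            if sname in seen:
                continue
            seen.add(sname)
            groups[sname].append({"id": user["id"], "st": user["st"]})
    return [
        {"sname": sname, "count": len(ids), "ids": ids}
        for sname, ids in groups.items()
    ]


fs_sname = group_by_sname(dataset["users"])
print(fs_sname)
```

Passing skip_empty=False keeps the '' group, which matches what the set-based answer returns.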

Thanks a lot! I added the st field to get the collection of (id, st) pairs. –
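
The exact change the asker made is not shown in the thread. A sketch of what it might look like follows, assuming summarize returns a small dict and the output key is renamed to 'ids' to match the desired result in the question. Sorting the sector names is an extra assumption, added only to make the output order deterministic.

```python
null = ''  # same placeholder the question uses for a missing sector name
dataset = {"users": [
    {"id": 20, "loc": "Chicago", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Manufacturing"}, {"sname": null}]},
    {"id": 21, "loc": "Frankfurt", "st": "4", "sectors": [{"sname": null}]},
    {"id": 22, "loc": "Berlin", "st": "6", "sectors": [{"sname": "Manufacturing"}, {"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 23, "loc": "Chicago", "st": "2", "sectors": [{"sname": "Banking"}, {"sname": "Agri"}]},
    {"id": 24, "loc": "Bern", "st": "1", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}]},
    {"id": 25, "loc": "Bern", "st": "4", "sectors": [{"sname": "Retail"}, {"sname": "Agri"}, {"sname": "Banking"}]},
]}


def summarize(record):
    # Assumed variant of the answer's summarize: keep both id and st.
    return {'id': record['id'], 'st': record['st']}


# Sorted so the result order is stable; a plain set has no defined order.
allsectornames = sorted(
    set(sec['sname'] for record in dataset['users'] for sec in record['sectors'])
)

result = [
    {
        'sname': sname,
        'count': len(matches),
        'ids': [summarize(match) for match in matches],
    }
    for sname in allsectornames
    # Single-element list: binds `matches` once per sname inside the comprehension.
    for matches in [[
        record for record in dataset['users']
        if sname in [sec['sname'] for sec in record['sectors']]
    ]]
]

print(result)
```

Note that the empty name '' still appears as its own group here; filter it out of allsectornames if it is not wanted.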