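How to insert dictionaries as values into a dictionary using a loop in Python

I'm currently having trouble turning my CSV data into a dictionary.

There are 3 columns in the file that I want to use: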

userID, placeID, rating 
U1000, 12222, 3 
U1000, 13333, 2 
U1001, 13333, 4

The result I want looks like this:

{'U1000': {'12222': 3, '13333': 2}, 
'U1001': {'13333': 4}}

In other words, I want my data structure to look like this:

sample = {} 
sample["U1000"] = {} 
sample["U1001"] = {} 
sample["U1000"]["12222"] = 3 
sample["U1000"]["13333"] = 2 
sample["U1001"]["13333"] = 4

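But I have a lot of data to process. I want to build the result with a loop, but I've been trying for 2 hours and failed..

---The following code may confuse you---

My result currently looks like this: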

{'U1000': ['12222', 3], 
'U1001': ['13333', 4]}

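The value in the dictionary is a list rather than a dictionary.
User "U1000" appears multiple times in the data, but only once in my result.

I think my code has a lot of mistakes.. If you don't mind, please take a look: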

reader = np.array(pd.read_csv("rating_final.csv")) 
included_cols = [0, 1, 2] 

sample= {} 
target=[] 
target1 =[] 
for row in reader: 
     content = list(row[i] for i in included_cols) 
     target.append(content[0]) 
     target1.append(content[1:3]) 

sample = dict(zip(target, target1))

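How can I improve the code? I've looked through Stack Overflow, but I couldn't work it out on my own. Could anyone please help me?

Thank you very much!

Source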

2016-03-02 Leigh Tsai

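It seems you want dictionaries as _values_, not _keys_. Maybe correct the title to match? – ShadowRanger

Thanks for the reminder. I've corrected the title as well as the content! –

Also, your example has {'U1000': {'12222': 3}, {'1333': 2}, 'U1001': {'13333': 4}}, but that has the keys 'U1000' and 'U1001' and no key (or no value) associated with {'1333': 2}. You could have {'U1000': {'12222': 3, '1333': 2}, 'U1001': {'13333': 4}} or {'U1000': [{'12222': 3}, {'1333': 2}], 'U1001': [{'13333': 4}]}, but not what you provided. – ShadowRanger

This should do what you want: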

import collections 

reader = ... 
sample = collections.defaultdict(dict) 

for user_id, place_id, rating in reader: 
    rating = int(rating) 
    sample[user_id][place_id] = rating 

print(dict(sample))  # convert first so it prints as a plain dict
# -> {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}}

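defaultdict is a convenient tool that supplies a default value whenever you try to access a key that isn't in the dictionary. If you don't like that behavior (for example, because you want sample['non-existent-user-id'] to fail with a KeyError), use: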

reader = ... 
sample = {} 

for user_id, place_id, rating in reader: 
    rating = int(rating) 
    if user_id not in sample: 
        sample[user_id] = {} 
    sample[user_id][place_id] = rating
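
As a self-contained illustration of the plain-dict loop above, here is a minimal sketch that reads the rows from the question with the standard csv module. The CSV_TEXT string and the load_ratings helper are hypothetical names made up for this example, and skipinitialspace=True is an assumption about how the real file is formatted (it strips the blank after each comma).

```python
import csv
import io

# Hypothetical stand-in for rating_final.csv, using the rows from the question.
CSV_TEXT = """userID, placeID, rating
U1000, 12222, 3
U1000, 13333, 2
U1001, 13333, 4
"""


def load_ratings(fileobj):
    """Build {userID: {placeID: rating}} from a CSV file object."""
    # skipinitialspace drops the blank that follows each comma
    reader = csv.reader(fileobj, skipinitialspace=True)
    next(reader)  # skip the header row
    sample = {}
    for user_id, place_id, rating in reader:
        if user_id not in sample:
            sample[user_id] = {}  # first time we see this user
        sample[user_id][place_id] = int(rating)
    return sample


ratings = load_ratings(io.StringIO(CSV_TEXT))
print(ratings)
# -> {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}}
```

For a real file, passing open("rating_final.csv", newline="") instead of the StringIO object should behave the same way, assuming the file has exactly these three columns.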

Source

2016-03-02 18:16:35

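Thanks for the clarification, it really helps! –

The expected output in your example isn't possible, because {'1333': 2} wouldn't be associated with a key. You can get {'U1000': {'12222': 3, '1333': 2}, 'U1001': {'13333': 4}} though, with a dict of dicts: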

sample = {} 
for row in reader: 
    userID, placeID, rating = row[:3] 
    sample.setdefault(userID, {})[placeID] = rating # Possibly int(rating)?
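
To see why the one-liner above works, here is a small sketch of what dict.setdefault returns on the first and on later calls for the same key. The variable names inner and same_inner are only for illustration.

```python
sample = {}

# First call: 'U1000' is missing, so setdefault stores the new {} and returns it.
inner = sample.setdefault("U1000", {})
inner["12222"] = 3

# Second call: the key exists, so the fresh {} argument is thrown away
# and the dict already stored under 'U1000' is returned instead.
same_inner = sample.setdefault("U1000", {})
same_inner["13333"] = 2

print(same_inner is inner)  # -> True
print(sample)               # -> {'U1000': {'12222': 3, '13333': 2}}
```

Note that the {} argument is built on every call, even when it ends up unused.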

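Alternatively, use collections.defaultdict(dict) to avoid the need for setdefault (or for other approaches that require a try/except KeyError or an if userID in sample: check, which give up setdefault's atomicity in exchange for not creating empty dicts unnecessarily):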

import collections 

sample = collections.defaultdict(dict) 
for row in reader: 
    userID, placeID, rating = row[:3] 
    sample[userID][placeID] = rating 

# Optional conversion back to plain dict 
sample = dict(sample)

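Converting back to a plain dict ensures that future lookups don't autovivify keys and instead raise KeyError as usual, and it looks like a normal dict if you print it.

If included_cols matters (because names or column indices might change), you can use operator.itemgetter to speed up and simplify extracting all the desired columns at once: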
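
A short sketch of the difference, using made-up user IDs ('U9999', 'U8888') that are not in the question's data:

```python
from collections import defaultdict

sample = defaultdict(dict)
sample["U1000"]["12222"] = 3

# Merely reading a missing key on a defaultdict silently inserts an empty dict.
sample["U9999"]
print("U9999" in sample)   # -> True

# A plain dict copy keeps the existing entries but drops the default factory.
plain = dict(sample)
try:
    plain["U8888"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)       # -> True
print("U8888" in plain)    # -> False
```

Entries that were already autovivified before the conversion (here 'U9999') are copied over as well, so it is worth converting before doing any exploratory lookups.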

from collections import defaultdict 
from operator import itemgetter 

included_cols = (0, 1, 2) 
# If columns in data were actually: 
# rating, foo, bar, userID, placeID 
# we'd do this instead, itemgetter will handle all the rest: 
# included_cols = (3, 4, 0) 
get_cols = itemgetter(*included_cols) # Create function to get needed indices at once 

sample = defaultdict(dict) 
# map(get_cols, ...) efficiently converts each row to a tuple of just 
# the three desired values as it goes, which also lets us unpack directly 
# in the for loop, simplifying code even more by naming all variables directly 
for userID, placeID, rating in map(get_cols, reader): 
    sample[userID][placeID] = rating # Possibly int(rating)?
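
For illustration, here is a runnable sketch of the reordered-column case mentioned in the comments above. The rows list is invented sample data laid out as rating, foo, bar, userID, placeID, and converting rating with int() is an assumption about the desired value type.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical rows laid out as: rating, foo, bar, userID, placeID
rows = [
    ("3", "x", "y", "U1000", "12222"),
    ("2", "x", "y", "U1000", "13333"),
    ("4", "x", "y", "U1001", "13333"),
]

# Pick userID, placeID, rating (in that order) out of each row.
get_cols = itemgetter(3, 4, 0)
print(get_cols(rows[0]))  # -> ('U1000', '12222', '3')

sample = defaultdict(dict)
for user_id, place_id, rating in map(get_cols, rows):
    sample[user_id][place_id] = int(rating)

print(dict(sample))
# -> {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}}
```

Only the index tuple passed to itemgetter changes when the column layout changes; the loop body stays the same.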

Source

2016-03-02 18:22:06 ShadowRanger

Thanks for your answer, it really helps! –
