特定动态嵌套字典，自动实现实现

我试图以特定方式实现嵌套字典结构。我正在阅读一长串单词。这些词最终将需要通过经常有效的搜索，所以这是我想如何设置我的字典：特定动态嵌套字典，自动实现实现

我想做一个嵌套的字典结构，其中第一个键值是该词是一个字典，其中键是该词的第一个字母，该值是一个词典，该键是该词的第二个字母，该值是该词作为该词的第三个字母的词典等。 ..

所以如果我读了 “汽车”， “可以” 和 “乔”

我得到

{3: {c: {a: {r: car, n: can}}},j: {o: {e: joe}}}

虽然我需要为此处理大约100,000个字，并且它们的长度从2到27个字母不等。

我已经通过What is the best way to implement nested dictionaries? 和 Dynamic nested dictionaries看去。

但没有任何运气搞清楚这一点。

我肯定能得到我的话了使用

for word in text_file.read().split()

我的文本文件的，我可以用

for char in word

或

for i in range(len(word)): 
    word[i]

我只是”打入每个字符弄清楚如何让这个结构下来。任何帮助将不胜感激。

来源

2016-12-07 J. Smith

在你的例子中长度不应该是3吗？那么乔词典的关键是什么？ – Iluvatar

@Ivvatar是的！编辑显示3，谢谢。乔字典的关键是3，因为它是一个3个字母的单词。 –

你不能有两个相同的键。另外，你的目标是什么？像词建议？ – Iluvatar

并非是确保您的这种结构的目的是什么，这是一个使用递归来生成你描述的结构的解决方案：

from collections import defaultdict 
d = defaultdict(list) 
words = ['hello', 'world', 'hi'] 


def nest(d, word): 
    if word == "": 
     return d 
    d = {word[-1:]: word if d is None else d} 
    return nest(d, word[:-1]) 


for word in words: 
    l = len(word) 
    d[l].append(nest(None, word)) 

print(d)

来源

2016-12-07 01:36:00 antonagestam

这建立了字典列表的字典，不只是嵌套字典 - 这是什么OP想要。 – martineau

@martineau这个问题被编辑了，在这个问题上有一个列表。 – antonagestam

也许你应该[编辑]你的答案，并指出你正在回答问题的早期版本（并显示产生的字典/输出）。 – martineau

下面是关于如何实现与建立在defaultdict自动激活特里一个简短的例子。对于终止单词的每个节点，它都会存储额外的密钥term来指示它。

from collections import defaultdict 

trie = lambda: defaultdict(trie) 

def add_word(root, s): 
    node = root 
    for c in s: 
     node = node[c] 
    node['term'] = True 

def list_words(root, length, prefix=''): 
    if not length: 
     if 'term' in root: 
      yield prefix 
     return 

    for k, v in root.items(): 
     if k != 'term': 
      yield from list_words(v, length - 1, prefix + k) 

WORDS = ['cars', 'car', 'can', 'joe'] 
root = trie() 
for word in WORDS: 
    add_word(root, word) 

print('Length {}'.format(3)) 
print('\n'.join(list_words(root, 3))) 
print('Length {}'.format(4)) 
print('\n'.join(list_words(root, 4)))

输出：

Length 3 
joe 
can 
car 
Length 4 
cars

来源

2016-12-07 01:54:51 niemmi

这里有一个办法做到这一点不使用collections.defaultdict或创建的dict自己的自定义子类 - 所以产生的字典仅仅是一个普通的dict对象：

import pprint 

def _build_dict(wholeword, chars, val, dic): 
    if len(chars) == 1: 
     dic[chars[0]] = wholeword 
     return 
    new_dict = dic.get(chars[0], {}) 
    dic[chars[0]] = new_dict 
    _build_dict(wholeword, chars[1:], val, new_dict) 

def build_dict(words): 
    dic = {} 
    for word in words: 
     root = dic.setdefault(len(word), {}) 
     _build_dict(word, list(word), word[1:], root) 
    return dic 

words = ['a', 'ox', 'car', 'can', 'joe'] 
data_dict = build_dict(words) 
pprint.pprint(data_dict)

输出：

{1: {'a': 'a'}, 
2: {'o': {'x': 'ox'}}, 
3: {'c': {'a': {'n': 'can', 'r': 'car'}}, 'j': {'o': {'e': 'joe'}}}}

它基于python.org中的消息中所示的递归算法Python列表标题为Building and Transvering multi-level dictionaries的归档主题。

来源

2016-12-07 03:21:04 martineau

特定动态嵌套字典，自动实现实现

回答

相关问题