Python: Creating a Nested Dictionary from a Flattened Dictionary - CSDN Blog

Article link: https://blog.csdn.net/weixin_39559071/article/details/111453501

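This post discusses how to convert a flat dictionary into a nested dictionary in Python. The problem is how to turn a flat dictionary whose keys look like X_a_one into a nested structure such as {'X': {'a': {'one': 10}}}. Several solutions are given, both recursive and non-recursive, each with code.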


Python: creating a nested dictionary from a flattened dictionary

I have a flattened dictionary that I want to turn into a nested dictionary. It has the form

flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}

I want to convert it to the form

nested = {'X': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}},
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}

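The flat dictionary is structured so that ambiguities should not arise. I would like this to work for dictionaries of arbitrary depth, but performance is not really an issue. I have seen many methods for flattening a nested dictionary, but essentially none for nesting a flattened one. The values stored in the dictionary are either scalars or strings, never iterables.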
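
To make the "no ambiguity" requirement concrete: a flat dictionary is ambiguous when one key is also the path prefix of another, for example 'X_a' together with 'X_a_one', because 'a' would have to be both a value and a sub-dictionary. A minimal sketch of a check, assuming '_' as the separator; the helper name find_conflicts is hypothetical and not part of the question:

```python
def find_conflicts(flat, sep='_'):
    """Return sorted (short_key, long_key) pairs where short_key is a path prefix of long_key."""
    keys = set(flat)
    conflicts = []
    for key in keys:
        parts = key.split(sep)
        # Check every proper prefix of the key path, e.g. 'X' and 'X_a' for 'X_a_one'.
        for i in range(1, len(parts)):
            prefix = sep.join(parts[:i])
            if prefix in keys:
                conflicts.append((prefix, key))
    return sorted(conflicts)


print(find_conflicts({'X_a_one': 10, 'X_a_two': 20}))   # []
print(find_conflicts({'X_a': 1, 'X_a_one': 10}))        # [('X_a', 'X_a_one')]
```

Running a check like this before nesting makes it clear whether any of the approaches discussed here could silently overwrite data.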

So far I have something that takes the input

test_dict = {'X_a_one': '10',
             'X_b_one': '10',
             'X_c_one': '10'}

to the output

test_out = {'X': {'a_one': '10',
                  'b_one': '10',
                  'c_one': '10'}}

using the code

def nest_once(inp_dict):
    out = {}
    if isinstance(inp_dict, dict):
        for key, val in inp_dict.items():
            if '_' in key:
                head, tail = key.split('_', 1)
                if head not in out.keys():
                    out[head] = {tail: val}
                else:
                    out[head].update({tail: val})
            else:
                out[key] = val
    return out

test_out = nest_once(test_dict)

But I am having trouble working out how to turn this into something that recursively creates all levels of the dictionary.

Any help would be appreciated!

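(As for why I want to do this: I have a file whose structure is equivalent to a nested dictionary, and I want to store its contents in the attributes dictionary of a NetCDF file and retrieve them later. However, NetCDF only lets you store flat dictionaries as attributes, so I want to unflatten the dictionary I previously stored in the NetCDF file.)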
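
For context, the flattening direction (what has to happen before the data can go into a flat attribute dictionary) can be sketched as follows. This is an illustrative helper, not code from the question; the name flatten and the '_' separator are assumptions, and it only round-trips cleanly if no nested key itself contains the separator.

```python
def flatten(nested, sep='_', prefix=''):
    """Flatten a nested dict of scalars into a single-level dict with joined keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f'{prefix}{sep}{key}' if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the key path along.
            flat.update(flatten(value, sep=sep, prefix=full_key))
        else:
            flat[full_key] = value
    return flat


nested = {'X': {'a': {'one': 10, 'two': 20}}, 'Y': {'b': {'one': 10}}}
print(flatten(nested))
# {'X_a_one': 10, 'X_a_two': 20, 'Y_b_one': 10}
```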

7 answers

25 votes

Here is my take:

def nest_dict(flat):
    result = {}
    for k, v in flat.items():
        _nest_dict_rec(k, v, result)
    return result

def _nest_dict_rec(k, v, out):
    k, *rest = k.split('_', 1)
    if rest:
        _nest_dict_rec(rest[0], v, out.setdefault(k, {}))
    else:
        out[k] = v

flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}

nested = {'X': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}},
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}

print(nest_dict(flat) == nested)
# True

jdehesa answered 2020-02-04T19:20:27Z

24 votes

# source is the flat input dictionary
output = {}
for k, v in source.items():
    # always start at the root.
    current = output
    # This is the part you're struggling with.
    pieces = k.split('_')
    # iterate from the beginning until the second to last place
    for piece in pieces[:-1]:
        if piece not in current:
            # if a dict doesn't exist at an index, then create one
            current[piece] = {}
        # as you walk into the structure, update your current location
        current = current[piece]
    # The reason you're using the second to last is because the last place
    # represents the place you're actually storing the item
    current[pieces[-1]] = v
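
The loop above reads from a variable named source and writes into output. A minimal sketch that wraps the same walk in a reusable function; the name nest_iterative and the sep parameter are assumptions added for illustration, and dict.setdefault is swapped in for the explicit membership test:

```python
def nest_iterative(source, sep='_'):
    """Build a nested dict from a flat one by walking each key path from the root."""
    output = {}
    for key, value in source.items():
        current = output
        *parents, leaf = key.split(sep)
        for piece in parents:
            # setdefault creates the intermediate dict only when it is missing.
            current = current.setdefault(piece, {})
        current[leaf] = value
    return output


flat = {'X_a_one': 10, 'X_a_two': 20, 'Y_b_one': 10}
print(nest_iterative(flat))
# {'X': {'a': {'one': 10, 'two': 20}}, 'Y': {'b': {'one': 10}}}
```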

cwallenpoole answered 2020-02-04T19:20:43Z

13 votes

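Here is one way using collections.defaultdict, borrowing heavily from a previous answer. There are 3 steps:

1. Create a nested defaultdict of defaultdict objects.

2. Iterate over the items of the flat input dictionary.

3. Build the defaultdict result according to the structure obtained by splitting each key on '_', using getFromDict to walk the result dictionary.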

Here is a complete example:

from collections import defaultdict
from functools import reduce
from operator import getitem

def getFromDict(dataDict, mapList):
    """Iterate nested dictionary"""
    return reduce(getitem, mapList, dataDict)

# instantiate nested defaultdict of defaultdicts
tree = lambda: defaultdict(tree)
d = tree()

# iterate input dictionary
for k, v in flat.items():
    *keys, final_key = k.split('_')
    getFromDict(d, keys)[final_key] = v

# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
#  'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}

As a final step, you can convert the defaultdict to a regular dict, though this is usually not necessary.

def default_to_regular_dict(d):
    """Convert nested defaultdict to regular dict of dicts."""
    if isinstance(d, defaultdict):
        d = {k: default_to_regular_dict(v) for k, v in d.items()}
    return d

# convert back to regular dict
res = default_to_regular_dict(d)
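
One situation where the conversion can still be worthwhile: a nested defaultdict creates keys on lookup, so merely reading a missing key changes the structure. A small sketch of that behavior, reusing the tree definition from this answer:

```python
from collections import defaultdict

tree = lambda: defaultdict(tree)

d = tree()
d['X']['a'] = 10

# Reading a key that does not exist silently inserts an empty subtree.
_ = d['Y']
print(sorted(d))        # ['X', 'Y']

# A plain dict raises KeyError instead of growing.
plain = {'X': {'a': 10}}
try:
    plain['Y']
except KeyError:
    print('KeyError')   # KeyError
print(sorted(plain))    # ['X']
```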

jpp answered 2020-02-04T19:21:25Z

4 votes

The other answers are cleaner, but since you mentioned recursion, there are other options as well.

def nest(d):
    _ = {}
    for k in d:
        i = k.find('_')
        if i == -1:
            _[k] = d[k]
            continue
        s, t = k[:i], k[i+1:]
        if s in _:
            _[s][t] = d[k]
        else:
            _[s] = {t:d[k]}
    return {k:(nest(_[k]) if type(_[k])==type(d) else _[k]) for k in _}

Hans Musgrave answered 2020-02-04T19:21:45Z

4 votes

You can use itertools.groupby:

import itertools, json

flat = {'Y_a_two': 20, 'Y_a_one': 10, 'X_b_two': 20, 'X_b_one': 10, 'X_a_one': 10, 'X_a_two': 20, 'Y_b_two': 20, 'Y_b_one': 10}

_flat = [[*a.split('_'), b] for a, b in flat.items()]

def create_dict(d):
    _d = {a:list(b) for a, b in itertools.groupby(sorted(d, key=lambda x:x[0]), key=lambda x:x[0])}
    return {a:create_dict([i[1:] for i in b]) if len(b) > 1 else b[0][-1] for a, b in _d.items()}

print(json.dumps(create_dict(_flat), indent=3))

Output:

{
   "Y": {
      "b": {
         "two": 20,
         "one": 10
      },
      "a": {
         "two": 20,
         "one": 10
      }
   },
   "X": {
      "b": {
         "two": 20,
         "one": 10
      },
      "a": {
         "two": 20,
         "one": 10
      }
   }
}
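
One caveat worth checking before relying on this approach: the len(b) > 1 test treats any group with a single row as a leaf, so a branch that holds only one entry loses its intermediate levels. A sketch of that behavior and one possible adjustment, which looks at the row length instead of the group size; the name create_dict_safe is hypothetical and not from the answer:

```python
import itertools


def create_dict(d):
    # Function as given in the answer above.
    _d = {a: list(b) for a, b in itertools.groupby(sorted(d, key=lambda x: x[0]), key=lambda x: x[0])}
    return {a: create_dict([i[1:] for i in b]) if len(b) > 1 else b[0][-1] for a, b in _d.items()}


def create_dict_safe(rows):
    """Variant that recurses while a row still has key parts before its value."""
    rows = sorted(rows, key=lambda row: row[0])
    result = {}
    for head, group in itertools.groupby(rows, key=lambda row: row[0]):
        group = list(group)
        if len(group[0]) == 2:
            # Row is [key, value]: this is a leaf.
            result[head] = group[0][1]
        else:
            result[head] = create_dict_safe([row[1:] for row in group])
    return result


rows = [[*k.split('_'), v] for k, v in {'X_a_one': 10}.items()]
print(create_dict(rows))       # {'X': 10}
print(create_dict_safe(rows))  # {'X': {'a': {'one': 10}}}
```

For the eight-key example in the question every branch has at least two entries, so both versions give the same nested result there.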

Ajax1234 answered 2020-02-04T19:22:09Z

4 votes

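Another non-recursive solution with no imports. It splits the logic into two parts: inserting a single key-value pair of the flat dictionary, and mapping over all of the flat dictionary's key-value pairs.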

def insert(dct, lst):
    """
    dct: a dict to be modified inplace.
    lst: list of elements representing a hierarchy of keys
    followed by a value.

    dct = {}
    lst = [1, 2, 3]

    resulting value of dct: {1: {2: 3}}
    """
    for x in lst[:-2]:
        dct[x] = dct = dct.get(x, dict())

    dct.update({lst[-2]: lst[-1]})

def unflat(dct):
    # empty dict to store the result
    result = dict()

    # create an iterator of lists representing hierarchical indices followed by the value
    lsts = ([*k.split("_"), v] for k, v in dct.items())

    # insert each list into the result
    for lst in lsts:
        insert(result, lst)

    return result

result = unflat(flat)
# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
#  'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}
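
The line dct[x] = dct = dct.get(x, dict()) relies on Python assigning chained targets from left to right: the right-hand side is evaluated once, stored under dct[x] in the current dictionary, and only then is the name dct rebound to the child. A small sketch that spells out the same step in separate statements; the helper name descend is hypothetical:

```python
def descend(dct, key):
    """Written-out equivalent of the chained assignment used in insert()."""
    child = dct.get(key, dict())  # existing sub-dict, or a fresh one
    dct[key] = child              # attach it to the current level first
    return child                  # the caller then continues from the child


root = {}
level = root
for key in ['X', 'a']:
    level = descend(level, key)
level['one'] = 10
print(root)   # {'X': {'a': {'one': 10}}}

# The chained form from the answer gives the same result.
root2 = {}
dct = root2
for x in ['X', 'a']:
    dct[x] = dct = dct.get(x, dict())
dct['one'] = 10
print(root2 == root)   # True
```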

hilberts_drinking_problem answered 2020-02-04T19:22:29Z

1 votes

Here is a reasonably readable recursive solution:

def unflatten_dict(a, result=None, sep='_'):
    if result is None:
        result = dict()
    for k, v in a.items():
        k, *rest = k.split(sep, 1)
        if rest:
            unflatten_dict({rest[0]: v}, result.setdefault(k, {}), sep=sep)
        else:
            result[k] = v
    return result

flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}

print(unflatten_dict(flat))
# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
#  'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}

This is based on several of the answers above, uses no imports, and has only been tested in Python 3.

makeyourownmaker answered 2020-02-04T19:22:54Z