PEP 584 – Add Union Operators To dict

PEP 584 – Add Union Operators To dict

PEP 584 – 给dict添加合并操作符

原文地址:https://www.python.org/dev/peps/pep-0584/

PEP:584
Title:Add Union Operators To dict
Author:Steven D’Aprano

Contents

Abstract

摘要

This PEP proposes adding merge (|) and update (|=) operators to the built-in dict class.

本PEP建议在内置的dict类中增加合并(|)和更新(|=)操作符。

Note

注意

After this PEP was accepted, the decision was made to also implement the new operators for several other standard library mappings.

在这个PEP被接受之后,我们决定也为其他几个标准库的映射实现新的操作符。

Motivation

动机

The current ways to merge two dicts have several disadvantages:

目前合并两个字典的方法有几个缺点。

dict.update

d1.update(d2) modifies d1 in-place. e = d1.copy(); e.update(d2) is not an expression and needs a temporary variable.

d1.update(d2)就地修改d1e = d1.copy(); e.update(d2)不是一个表达式,需要一个临时变量。

{**d1, **d2}

Dict unpacking looks ugly and is not easily discoverable. Few people would be able to guess what it means the first time they see it, or think of it as the “obvious way” to merge two dicts.

Dict解包看起来很丑,而且不容易被发现。很少有人能在第一次看到它的时候就猜到它的意思,或者认为它是合并两个字典的 “明显方法”。

As Guido said:

就像Guido说的那样:

I’m sorry for PEP 448, but even if you know about **d in simpler contexts, if you were to ask a typical Python user how to combine two dicts into a new one, I doubt many people would think of {**d1, **d2}. I know I myself had forgotten about it when this thread started!

我为 PEP 448 感到抱歉,但即使你在更简单的上下文中知道 **d,如果你要问一个典型的 Python 用户如何将两个 dicts 合并成一个新的,我怀疑是否有人会想到{**d1, **d2}。我知道当这个主题开始的时候,我自己已经忘记了它!

{**d1, **d2} ignores the types of the mappings and always returns a dict. type(d1)({**d1, **d2}) fails for dict subclasses such as defaultdict that have an incompatible __init__ method.

{**d1, **d2}忽略映射的类型,并且总是返回一个dict.type(d1)({**d1, **d2})类型,如果包含的是dict的子类,比如是defaultdict这样有不兼容的__init__方法的子类,将会失败。

collections.ChainMap

ChainMap is unfortunately poorly-known and doesn’t qualify as “obvious”. It also resolves duplicate keys in the opposite order to that expected (“first seen wins” instead of “last seen wins”). Like dict unpacking, it is tricky to get it to honor the desired subclass. For the same reason, type(d1)(ChainMap(d2, d1)) fails for some subclasses of dict.

不幸的是,ChainMap鲜为人知,不符合 "明显 "的条件。它还以与预期相反的顺序解决重复的键("最先看到的赢 "而不是 “最后看到的赢”)。就像dict解包一样,要让它尊重所需的子类是很棘手的。出于同样的原因,type(d1)(ChainMap(d2, d1))对于dict的某些子类来说是失败的。

Further, ChainMaps wrap their underlying dicts, so writes to the ChainMap will modify the original dict:

此外,ChainMaps封装了它们的底层数据,所以对ChainMap的写入将修改原始数据:

>>> d1 = {'spam': 1}
>>> d2 = {'eggs': 2}
>>> merged = ChainMap(d2, d1)
>>> merged['eggs'] = 999
>>> d2
{'eggs': 999}

dict(d1, **d2)

This “neat trick” is not well-known, and only works when d2 is entirely string-keyed:

这个 "巧妙的技巧 "并不为人所知,而且只在d2完全是字符串键的情况下才有效:

>>> d1 = {"spam": 1}
>>> d2 = {3665: 2}
>>> dict(d1, **d2)
Traceback (most recent call last):
  ...
TypeError: keywords must be strings

Rationale

理论依据

The new operators will have the same relationship to the dict.update method as the list concatenate (+) and extend (+=) operators have to list.extend. Note that this is somewhat different from the relationship that |/|= have with set.update; the authors have determined that allowing the in-place operator to accept a wider range of types (as list does) is a more useful design, and that restricting the types of the binary operator’s operands (again, as list does) will help avoid silent errors caused by complicated implicit type casting on both sides.

新的操作符与dict.update方法的关系,与list concatenate (+)和extend (+=)操作符与list.extend的关系相同。请注意,这与|/|=set.update的关系有些不同;作者认为,允许就地操作符接受更广泛的类型(就像list那样)是一个更有用的设计,而限制二进制操作符的操作数的类型(同样,就像list那样)将有助于避免因双方复杂的隐式类型转换而导致的沉默错误。

Key conflicts will be resolved by keeping the rightmost value. This matches the existing behavior of similar dict operations, where the last seen value always wins:

键的冲突将通过保留最右边的值来解决。这与现有的类似dict操作的行为相匹配,即最后看到的值总是获胜。

{'a': 1, 'a': 2}
{**d, **e}
d.update(e)
d[k] = v
{k: v for x in (d, e) for (k, v) in x.items()}

All of the above follow the same rule. This PEP takes the position that this behavior is simple, obvious, usually the behavior we want, and should be the default behavior for dicts. This means that dict union is not commutative; in general d | e != e | d.

上述所有的行为都遵循同样的规则。本PEP的立场是,这种行为是简单、明显的,通常是我们想要的行为,并且应该是dict的默认行为。这意味着dict联盟不是换元的;一般来说,d | e != e | d

Similarly, the iteration order of the key-value pairs in the dictionary will follow the same semantics as the examples above, with each newly added key (and its value) being appended to the current sequence.

同样,字典中键值对的迭代顺序将遵循与上面的例子相同的语义,每一个新添加的键(和它的值)都被追加到当前的序列中。

因为Python中字典的实现基于散列,随着字典容量的扩充很可能会触发重建机制,进而改变字典的迭代顺序,所以应该将字典视为一种无序容器,不能依赖于字典的迭代顺序,所以这里不应对字典的迭代顺序纠结。想了解原因和散列结构的,可以阅读Python学习笔记20:字典与集合,译者注。

Specification

定义

Dict union will return a new dict consisting of the left operand merged with the right operand, each of which must be a dict (or an instance of a dict subclass). If a key appears in both operands, the last-seen value (i.e. that from the right-hand operand) wins:

Dict 合并运算将返回一个由左操作数和右操作数组成的新的dict,每个操作数必须是一个dict(或dict子类的一个实例)。如果一个键出现在两个操作数中,最后看到的值(即来自右侧操作数的值)获胜。

>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> d | e
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> e | d
{'cheese': 3, 'aardvark': 'Ethel', 'spam': 1, 'eggs': 2}

The augmented assignment version operates in-place:

增强的赋值版本在原地操作:

>>> d |= e
>>> d
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}

Augmented assignment behaves identically to the update method called with a single positional argument, so it also accepts anything implementing the Mapping protocol (more specifically, anything with the keys and __getitem__ methods) or iterables of key-value pairs. This is analogous to list += and list.extend, which accept any iterable, not just lists. Continued from above:

增强赋值的行为与使用单一位置参数调用的 update 方法相同,所以它也接受任何实现 Mapping 协议的东西(更具体地说,任何具有 keys__getitem__ 方法的东西)或键值对的迭代物。这类似于 list +=list.extend,它们接受任何可迭代的东西,而不仅仅是列表。继续上面的内容:

>>> d | [('spam', 999)]
Traceback (most recent call last):
  ...
TypeError: can only merge dict (not "list") to dict

>>> d |= [('spam', 999)]
>>> d
{'spam': 999, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}

When new keys are added, their order matches their order within the right-hand mapping, if any exists for its type.

当新的键被添加时,它们的顺序与它们在右侧映射中的顺序一致,如果它的类型存在的话。

Reference Implementation

参考实现

One of the authors has written a C implementation.

其中一位作者已经写了一个C实现

An approximate pure-Python implementation is:

一个近似的纯Python实现是:

def __or__(self, other):
    if not isinstance(other, dict):
        return NotImplemented
    new = dict(self)
    new.update(other)
    return new

def __ror__(self, other):
    if not isinstance(other, dict):
        return NotImplemented
    new = dict(other)
    new.update(self)
    return new

def __ior__(self, other):
    dict.update(self, other)
    return self

Major Objections

主要反对意见

Dict Union Is Not Commutative

字典的合并运算不是换元的

Union is commutative, but dict union will not be (d | e != e | d).

合并运算是换元的,但字典合并运算不是(d | e != e | d)。

Response

回复

There is precedent for non-commutative unions in Python:

在Python中,有非交换性合并运算的先例。

>>> {0} | {False}
{0}
>>> {False} | {0}
{False}

While the results may be equal, they are distinctly different. In general, a | b is not the same operation as b | a.

虽然结果可能是相等的,但它们是明显不同的。一般来说,a | bb | a不是同一个操作。

Dict Union Will Be Inefficient

字典合并运算将是低效的

Giving a pipe operator to mappings is an invitation to writing code that doesn’t scale well. Repeated dict union is inefficient: d | e | f | g | h creates and destroys three temporary mappings.

给映射提供一个管道操作符是对写代码的一种诱惑,它不能很好地扩展。重复的dict合并操作是低效的:d | e | f | g | h创建并销毁了三个临时映射。

Response

回复

The same argument applies to sequence concatenation.

同样的论点也适用于序列连接。

Sequence concatenation grows with the total number of items in the sequences, leading to O(N**2) (quadratic) performance. Dict union is likely to involve duplicate keys, so the temporary mappings will not grow as fast.

序列连接会随着序列中项目总数的增加而增长,导致O(N**2)(平方)的性能。Dict合并操作可能涉及重复的键,所以临时映射不会增长得那么快。

Just as it is rare for people to concatenate large numbers of lists or tuples, the authors of this PEP believe that it will be rare for people to merge large numbers of dicts. collections.Counter is a dict subclass that supports many operators, and there are no known examples of people having performance issues due to combining large numbers of Counters. Further, a survey of the standard library by the authors found no examples of merging more than two dicts, so this is unlikely to be a performance problem in practice… “Everything is fast for small enough N”.

就像人们很少连接大量的列表或图元一样,本PEP的作者认为人们很少会合并大量的dict。collections.Counter是一个支持许多操作符的dict子类,而且没有已知的例子表明人们会因为合并大量的Counter而产生性能问题。此外,作者对标准库的调查没有发现合并两个以上的dict的例子,所以这在实践中不太可能成为一个性能问题… “对于足够小的N来说,一切都很快速”。

If one expects to be merging a large number of dicts where performance is an issue, it may be better to use an explicit loop and in-place merging:

如果一个人预计要合并大量的数据,而性能又是一个问题,那么使用一个显式循环和就地合并可能会更好:

new = {}
for d in many_dicts:
    new |= d

Dict Union Is Lossy

字典合并操作是有损失的

Dict union can lose data (values may disappear); no other form of union is lossy.

Dict合并操作可能会丢失数据(值可能会消失);没有其他形式的结合操作是有损失的。

Response

回复

It isn’t clear why the first part of this argument is a problem. dict.update() may throw away values, but not keys; that is expected behavior, and will remain expected behavior regardless of whether it is spelled as update() or |.

dict.update()可以丢弃值,但不能丢弃键;这是预期的行为,而且无论它被拼成update()还是|,都是预期的行为。

Other types of union are also lossy, in the sense of not being reversible; you cannot get back the two operands given only the union. a | b == 365… what are a and b?

其他类型的合并操作也是有损失的,也就是不可逆的;你不能只通过合并操作来取回两个操作数。a | b == 365… 哪个是 a 哪个是 b

Only One Way To Do It

只有一种方法可以做到

Dict union will violate the Only One Way koan from the Zen.

Dict合并操作将违反Python之禅的 "唯一途径 "原则。

Response

回复

There is no such koan. “Only One Way” is a calumny about Python originating long ago from the Perl community.

没有这样的原则。"只有一条路 "是对 Python 的诽谤,很久以前来自 Perl 社区。

More Than One Way To Do It

不止一种方法可以做到这一点

Okay, the Zen doesn’t say that there should be Only One Way To Do It. But it does have a prohibition against allowing “more than one way to do it”.

好吧,Python之禅并没有说应该有唯一的方法。但它确实禁止允许 “不止一种方法”。

Response

回复

There is no such prohibition. The “Zen of Python” merely expresses a preference for “only one obvious way”:

没有这样的禁条。Python之禅 "只是表达了对 "只有一种明显的方式 "的偏好:

There should be one-- and preferably only one --obvious way to do
it.

The emphasis here is that there should be an obvious way to do “it”. In the case of dict update operations, there are at least two different operations that we might wish to do:

这里强调的是,应该有一个明显的方法来做 “它”。在dict更新操作的情况下,至少有两种不同的操作是我们可能希望做的:

  • Update a dict in place: The Obvious Way is to use the update() method. If this proposal is accepted, the |= augmented assignment operator will also work, but that is a side-effect of how augmented assignments are defined. Which you choose is a matter of taste.

    就地更新一个dict。明显的方法是使用update()方法。如果这个建议被接受,|=增强的赋值运算符也会起作用,但这只是在增强赋值被定义后提供了一种额外途径。你选择哪一个是一个品味的问题。

  • Merge two existing dicts into a third, new dict: This PEP proposes that the Obvious Way is to use the | merge operator.

    将两个现有的数据合并成第三个新的数据:本PEP提出,最明显的方法是使用|合并操作符。

In practice, this preference for “only one way” is frequently violated in Python. For example, every for loop could be re-written as a while loop; every if block could be written as an if/ else block. List, set and dict comprehensions could all be replaced by generator expressions. Lists offer no fewer than five ways to implement concatenation:

在实践中,这种对 "只有一种方式 "的偏好在Python中经常被违反。例如,每个 for 循环都可以被改写成 while循环;每个 if块都可以写成 if/else块。List、set和dict的表达式都可以被生成器表达式所取代。列表提供了不少于五种实现连接的方法:

  • Concatenation operator: a + b
  • In-place concatenation operator: a += b
  • Slice assignment: a[len(a):] = b
  • Sequence unpacking: [*a, *b]
  • Extend method: a.extend(b)

We should not be too strict about rejecting useful functionality because it violates “only one way”.

我们不应过于严格地拒绝有用的功能,只是因为它违反了 “只有一种方式”。

Dict Union Makes Code Harder To Understand

字典合并操作使得代码更难以理解

Dict union makes it harder to tell what code means. To paraphrase the objection rather than quote anyone in specific: “If I see spam | eggs, I can’t tell what it does unless I know what spam and eggs are”.

Dict合并操作使人们更难分辨代码的含义。解释一下这个反对意见,而不是具体引用谁的话:“如果我看到spam | eggs,我无法知道它的作用,除非我知道spameggs是什么”。

Response

回复

This is very true. But it is equally true today, where the use of the | operator could mean any of:

这是非常正确的。但今天也同样如此,在这里,使用|运算符可能意味着任何:

  • int/bool bitwise-or

    int/bool位或

  • set/frozenset union

    set/frozenset并集

  • any other overloaded operation

    任何其他被重载了的操作

Adding dict union to the set of possibilities doesn’t seem to make it harder to understand the code. No more work is required to determine that spam and eggs are mappings than it would take to determine that they are sets, or integers. And good naming conventions will help:

将dict 合并操作添加到可能性的集合中,似乎并没有使代码更难理解。确定 spameggs是映射的工作并不比确定它们是集合或整数的工作多。好的命名规则会有帮助:

flags |= WRITEABLE  # Probably numeric bitwise-or.
DO_NOT_RUN = WEEKENDS | HOLIDAYS  # Probably set union.
settings = DEFAULT_SETTINGS | user_settings | workspace_settings  # Probably dict union.

What About The Full set API?

关于完整的setAPI?

dicts are “set like”, and should support the full collection of set operators: |, &, ^, and -.

dicts是 "类似于集合 "的,应该支持合并运算符的全部合并运算:|, &, ^, 和 -.

Response

回应

This PEP does not take a position on whether dicts should support the full collection of set operators, and would prefer to leave that for a later PEP (one of the authors is interested in drafting such a PEP). For the benefit of any later PEP, a brief summary follows.

本PEP并不对dict是否应该支持全部的合并运算符进行表态,而是希望把这个问题留给以后的PEP(作者之一有兴趣起草这样一个PEP)。为了方便以后的PEP,下面是一个简短的总结。

Set symmetric difference (^) is obvious and natural. For example, given two dicts:

集合对称差分(^)是很明显且自然的。例如,给定两个数据:

d1 = {"spam": 1, "eggs": 2}
d2 = {"ham": 3, "eggs": 4}

the symmetric difference d1 ^ d2 would be {"spam": 1, "ham": 3}.

对称差值 d1 ^ d2将是 {"spam": 1, "ham": 3}.

Set difference (-) is also obvious and natural, and an earlier version of this PEP included it in the proposal. Given the dicts above, we would have d1 - d2 be {"spam": 1} and d2 - d1 be {"ham": 3}.

集合差分(-)也是很明显和自然的,本PEP的早期版本将其纳入建议。考虑到上面的数据,我们会让d1 - d2成为{"spam": 1}d2 - d1{"ham": 3}.

Set intersection (&) is a bit more problematic. While it is easy to determine the intersection of keys in two dicts, it is not clear what to do with the values. Given the two dicts above, it is obvious that the only key of d1 & d2 must be "eggs". “Last seen wins”, however, has the advantage of consistency with other dict operations (and the proposed union operators).

集合相交(&)的问题更大一些。虽然很容易确定两个数据集的键的交集,但不清楚该如何处理这些值。鉴于上面的两个数据集,很明显,d1 & d2的唯一键必须是 "eggs"。然而,"最后看到的胜出者 "具有与其他dict操作(以及提议的合并操作)一致的优势。

What About Mapping And MutableMapping?

如何处理MappingMutableMapping

collections.abc.Mapping and collections.abc.MutableMapping should define | and |=, so subclasses could just inherit the new operators instead of having to define them.

collections.abc.Mappingcollections.abc.MutableMapping应该定义||=,所以子类可以直接继承新的操作符,而不必定义它们。

Response

回应

There are two primary reasons why adding the new operators to these classes would be problematic:

有两个主要的原因,将新的运算符添加到这些类中会有问题:

  • Currently, neither defines a copy method, which would be necessary for | to create a new instance.

    目前,两者都没有定义copy方法,而那是使用|创建新实例的必要条件。

  • Adding |= to MutableMapping (or a copy method to Mapping) would create compatibility issues for virtual subclasses.

    MutableMapping中添加|=(或在Mapping中添加copy方法)会给虚拟子类带来兼容性问题。

Rejected Ideas

被拒绝的想法

Rejected Semantics

拒绝的语义

There were at least four other proposed solutions for handling conflicting keys. These alternatives are left to subclasses of dict.

至少还有其他四种处理冲突键的建议方案。这些方案留给dict的子类。

Raise

作为异常抛出

It isn’t clear that this behavior has many use-cases or will be often useful, but it will likely be annoying as any use of the dict union operator would have to be guarded with a try/except clause.

目前还不清楚这种行为是否有很多用处,或者是否经常有用,但它可能会很烦人,因为任何对dict合并运算符的使用都必须用try/except子句来嵌套。

Add The Values (As Counter Does, with +)

值相加(如Counter用+做的那样)

Too specialised to be used as the default behavior.

太过特殊,不能作为默认行为使用。

Leftmost Value (First-Seen) Wins

让左边的值(首先看到的)赢

It isn’t clear that this behavior has many use-cases. In fact, one can simply reverse the order of the arguments:

这种行为是否有很多用处并不清楚。事实上,我们可以简单地扭转参数的顺序:

d2 | d1  # d1 merged with d2, keeping existing values in d1

Concatenate Values In A List

连接列表中的值

This is likely to be too specialised to be the default. It is not clear what to do if the values are already lists:

这可能太特殊了,不能作为默认情况。如果值已经是列表,不清楚该怎么做。

{'a': [1, 2]} | {'a': [3, 4]}

Should this give {'a': [1, 2, 3, 4]} or {'a': [[1, 2], [3, 4]]}?

结果应该是 {'a': [1, 2, 3, 4]} 还是 {'a': [[1, 2], [3, 4]]}?

Rejected Alternatives

被拒绝的替代方案

Use The Addition Operator

使用加法运算符

This PEP originally started life as a proposal for dict addition, using the + and += operator. That choice proved to be exceedingly controversial, with many people having serious objections to the choice of operator. For details, see previous versions of the PEP and the mailing list discussions.

这个PEP最初是作为一个使用 ++=操作符的字典加法的提案。这一选择被证明是非常有争议的,许多人对这个操作符的选择有强烈的反对意见。详情请见PEP的先前版本和邮件列表讨论

Use The Left Shift Operator

使用左移运算符

The << operator didn’t seem to get much support on Python-Ideas, but no major objections either. Perhaps the strongest objection was Chris Angelico’s comment

在Python-Ideas上,<<操作符似乎没有得到多少支持,但也没有大的反对意见。也许最强烈的反对意见是Chris Angelico的评论

The “cuteness” value of abusing the operator to indicate information flow got old shortly after C++ did it.

滥用运算符来表示信息流的 "可爱 "价值在C++做完后不久就过时了。

Use A New Left Arrow Operator

使用一个新的左箭头运算符

Another suggestion was to create a new operator <-. Unfortunately this would be ambiguous, d <- e could mean d merge e or d less-than minus e.

另一个建议是创建一个新的操作符<-。不幸的是,这将会产生歧义,d <- e可能意味着d merge ed less-than minus e

Use A Method

使用一个方法

A dict.merged() method would avoid the need for an operator at all. One subtlety is that it would likely need slightly different implementations when called as an unbound method versus as a bound method.

As an unbound method, the behavior could be similar to:

一个dict.merged()方法将完全避免对操作符的需求。一个微妙的问题是,当作为一个非绑定方法和一个绑定方法被调用时,它可能需要稍微不同的实现。

def merged(cls, *mappings, **kw):
    new = cls()  # Will this work for defaultdict?
    for m in mappings:
        new.update(m)
    new.update(kw)
    return new

As a bound method, the behavior could be similar to:

作为一个绑定方法,其行为可能类似于:

def merged(self, *mappings, **kw):
    new = self.copy()
    for m in mappings:
        new.update(m)
    new.update(kw)
    return new
Advantages

有点

  • Arguably, methods are more discoverable than operators.

    可以说,方法比运算符更容易识别。

  • The method could accept any number of positional and keyword arguments, avoiding the inefficiency of creating temporary dicts.

    方法可以接受任何数量的位置参数和关键字参数,避免了创建临时数据的低效率。

  • Accepts sequences of (key, value) pairs like the update method.

    update方法一样接受(key, value)对的序列。

  • Being a method, it is easy to override in a subclass if you need alternative behaviors such as “first wins”, “unique keys”, etc.

    作为一个方法,如果你需要其他的行为,如 “first wins”、"unique keys "等,它很容易在子类中被覆盖。

Disadvantages

缺点

  • Would likely require a new kind of method decorator which combined the behavior of regular instance methods and classmethod. It would need to be public (but not necessarily a builtin) for those needing to override the method. There is a proof of concept.

    这可能需要一种新的方法装饰器,结合常规实例方法和类方法的行为。对于那些需要覆盖该方法的人来说,它需要是公开的(但不一定是内置的)。这里有一个概念证明。

  • It isn’t an operator. Guido discusses why operators are useful. For another viewpoint, see Nick Coghlan’s blog post.

    它不是一个操作符。Guido讨论了为什么运算符是有用的。另一个观点,见Nick Coghlan的博文。

Use a Function

使用一个函数

Instead of a method, use a new built-in function merged(). One possible implementation could be something like this:

用一个新的内置函数merged()来代替方法。一个可能的实现可以是这样的:

def merged(*mappings, **kw):
    if mappings and isinstance(mappings[0], dict):
        # If the first argument is a dict, use its type.
        new = mappings[0].copy()
        mappings = mappings[1:]
    else:
        # No positional arguments, or the first argument is a
        # sequence of (key, value) pairs.
        new = dict()
    for m in mappings:
        new.update(m)
    new.update(kw)
    return new

An alternative might be to forgo the arbitrary keywords, and take a single keyword parameter that specifies the behavior on collisions:

另一种方法是放弃任意的关键字,而采取一个单一的关键字参数来指定碰撞的行为:

def merged(*mappings, on_collision=lambda k, v1, v2: v2):
    # implementation left as an exercise to the reader
Advantages

优点:

  • Most of the same advantages of the method solutions above.

    上述方法解决方案的大部分优点相同。

  • Doesn’t require a subclass to implement alternative behavior on collisions, just a function.

    不需要一个子类来实现碰撞时的替代行为,只需要一个函数。

Disadvantages

缺点:

  • May not be important enough to be a builtin.

    可能没有重要到足以成为一个内建程序。

  • Hard to override behavior if you need something like “first wins”, without losing the ability to process arbitrary keyword arguments.

    如果你需要像 "first win"这样的行为,在不失去处理任意关键字参数的能力的情况下,很难重写行为。

Examples

The authors of this PEP did a survey of third party libraries for dictionary merging which might be candidates for dict union.

This is a cursory list based on a subset of whatever arbitrary third-party packages happened to be installed on one of the authors’ computers, and may not reflect the current state of any package. Also note that, while further (unrelated) refactoring may be possible, the rewritten version only adds usage of the new operators for an apples-to-apples comparison. It also reduces the result to an expression when it is efficient to do so.

IPython/zmq/ipkernel.py

Before:

aliases = dict(kernel_aliases)
aliases.update(shell_aliases)

After:

aliases = kernel_aliases | shell_aliases

IPython/zmq/kernelapp.py

Before:

kernel_aliases = dict(base_aliases)
kernel_aliases.update({
    'ip' : 'KernelApp.ip',
    'hb' : 'KernelApp.hb_port',
    'shell' : 'KernelApp.shell_port',
    'iopub' : 'KernelApp.iopub_port',
    'stdin' : 'KernelApp.stdin_port',
    'parent': 'KernelApp.parent',
})
if sys.platform.startswith('win'):
    kernel_aliases['interrupt'] = 'KernelApp.interrupt'

kernel_flags = dict(base_flags)
kernel_flags.update({
    'no-stdout' : (
            {'KernelApp' : {'no_stdout' : True}},
            "redirect stdout to the null device"),
    'no-stderr' : (
            {'KernelApp' : {'no_stderr' : True}},
            "redirect stderr to the null device"),
})

After:

kernel_aliases = base_aliases | {
    'ip' : 'KernelApp.ip',
    'hb' : 'KernelApp.hb_port',
    'shell' : 'KernelApp.shell_port',
    'iopub' : 'KernelApp.iopub_port',
    'stdin' : 'KernelApp.stdin_port',
    'parent': 'KernelApp.parent',
}
if sys.platform.startswith('win'):
    kernel_aliases['interrupt'] = 'KernelApp.interrupt'

kernel_flags = base_flags | {
    'no-stdout' : (
            {'KernelApp' : {'no_stdout' : True}},
            "redirect stdout to the null device"),
    'no-stderr' : (
            {'KernelApp' : {'no_stderr' : True}},
            "redirect stderr to the null device"),
}

matplotlib/backends/backend_svg.py

Before:

attrib = attrib.copy()
attrib.update(extra)
attrib = attrib.items()

After:

attrib = (attrib | extra).items()

matplotlib/delaunay/triangulate.py

Before:

edges = {}
edges.update(dict(zip(self.triangle_nodes[border[:,0]][:,1],
             self.triangle_nodes[border[:,0]][:,2])))
edges.update(dict(zip(self.triangle_nodes[border[:,1]][:,2],
             self.triangle_nodes[border[:,1]][:,0])))
edges.update(dict(zip(self.triangle_nodes[border[:,2]][:,0],
             self.triangle_nodes[border[:,2]][:,1])))

Rewrite as:

edges = {}
edges |= zip(self.triangle_nodes[border[:,0]][:,1],
             self.triangle_nodes[border[:,0]][:,2])
edges |= zip(self.triangle_nodes[border[:,1]][:,2],
             self.triangle_nodes[border[:,1]][:,0])
edges |= zip(self.triangle_nodes[border[:,2]][:,0],
             self.triangle_nodes[border[:,2]][:,1])

matplotlib/legend.py

Before:

hm = default_handler_map.copy()
hm.update(self._handler_map)
return hm

After:

return default_handler_map | self._handler_map

numpy/ma/core.py

Before:

_optinfo = {}
_optinfo.update(getattr(obj, '_optinfo', {}))
_optinfo.update(getattr(obj, '_basedict', {}))
if not isinstance(obj, MaskedArray):
    _optinfo.update(getattr(obj, '__dict__', {}))

After:

_optinfo = {}
_optinfo |= getattr(obj, '_optinfo', {})
_optinfo |= getattr(obj, '_basedict', {})
if not isinstance(obj, MaskedArray):
    _optinfo |= getattr(obj, '__dict__', {})

praw/internal.py

Before:

data = {'name': six.text_type(user), 'type': relationship}
data.update(kwargs)

After:

data = {'name': six.text_type(user), 'type': relationship} | kwargs

pygments/lexer.py

Before:

kwargs.update(lexer.options)
lx = lexer.__class__(**kwargs)

After:

lx = lexer.__class__(**(kwargs | lexer.options))

requests/sessions.py

Before:

merged_setting = dict_class(to_key_val_list(session_setting))
merged_setting.update(to_key_val_list(request_setting))

After:

merged_setting = dict_class(to_key_val_list(session_setting)) | to_key_val_list(request_setting)

sphinx/domains/init.py

Before:

self.attrs = self.known_attrs.copy()
self.attrs.update(attrs)

After:

self.attrs = self.known_attrs | attrs

sphinx/ext/doctest.py

Before:

new_opt = code[0].options.copy()
new_opt.update(example.options)
example.options = new_opt

After:

example.options = code[0].options | example.options

sphinx/ext/inheritance_diagram.py

Before:

n_attrs = self.default_node_attrs.copy()
e_attrs = self.default_edge_attrs.copy()
g_attrs.update(graph_attrs)
n_attrs.update(node_attrs)
e_attrs.update(edge_attrs)

After:

g_attrs |= graph_attrs
n_attrs = self.default_node_attrs | node_attrs
e_attrs = self.default_edge_attrs | edge_attrs

sphinx/highlighting.py

Before:

kwargs.update(self.formatter_args)
return self.formatter(**kwargs)

After:

return self.formatter(**(kwargs | self.formatter_args))

sphinx/quickstart.py

Before:

d2 = DEFAULT_VALUE.copy()
d2.update(dict(("ext_"+ext, False) for ext in EXTENSIONS))
d2.update(d)
d = d2

After:

d = DEFAULT_VALUE | dict(("ext_"+ext, False) for ext in EXTENSIONS) | d

sympy/abc.py

Before:

clash = {}
clash.update(clash1)
clash.update(clash2)
return clash1, clash2, clash

After:

return clash1, clash2, clash1 | clash2

sympy/parsing/maxima.py

Before:

dct = MaximaHelpers.__dict__.copy()
dct.update(name_dict)
obj = sympify(str, locals=dct)

After:

obj = sympify(str, locals=MaximaHelpers.__dict__|name_dict)

sympy/printing/ccode.py and sympy/printing/fcode.py

Before:

self.known_functions = dict(known_functions)
userfuncs = settings.get('user_functions', {})
self.known_functions.update(userfuncs)

After:

self.known_functions = known_functions | settings.get('user_functions', {})

sympy/utilities/runtests.py

Before:

globs = globs.copy()
if extraglobs is not None:
    globs.update(extraglobs)

After:

globs = globs | (extraglobs if extraglobs is not None else {})

The above examples show that sometimes the | operator leads to a clear increase in readability, reducing the number of lines of code and improving clarity. However other examples using the | operator lead to long, complex single expressions, possibly well over the PEP 8 maximum line length of 80 columns. As with any other language feature, the programmer should use their own judgement about whether | improves their code.

Related Discussions

相关讨论

Mailing list threads (this is by no means an exhaustive list):

Ticket on the bug tracker

Merging two dictionaries in an expression is a frequently requested feature. For example:

https://stackoverflow.com/questions/38987/how-to-merge-two-dictionaries-in-a-single-expression

https://stackoverflow.com/questions/1781571/how-to-concatenate-two-dictionaries-to-create-a-new-one-in-python

https://stackoverflow.com/questions/6005066/adding-dictionaries-together-python

Occasionally people request alternative behavior for the merge:

https://stackoverflow.com/questions/1031199/adding-dictionaries-in-python

https://stackoverflow.com/questions/877295/python-dict-add-by-valuedict-2

…including one proposal to treat dicts as sets of keys.

Ian Lee’s proto-PEP, and discussion in 2015. Further discussion took place on Python-Ideas.

(Observant readers will notice that one of the authors of this PEP was more skeptical of the idea in 2015.)

Adding a full complement of operators to dicts.

Discussion on Y-Combinator.

https://treyhunner.com/2016/02/how-to-merge-dictionaries-in-python/

https://code.tutsplus.com/tutorials/how-to-merge-two-python-dictionaries–cms-26230

In direct response to an earlier draft of this PEP, Serhiy Storchaka raised a ticket in the bug tracker to replace the copy(); update() idiom with dict unpacking.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.

Source: https://github.com/python/peps/blob/master/pep-0584.rst

版权声明:本文由 icexmoon 翻译,遵循CC 4.0 BY-SA版权协议。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值