Python's super() Explained in Detail (Part 1)

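I have long been fuzzy about super(), so let's start with one of the most detailed explanations of it. (A translation follows later in this post.)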

 

If you aren’t wowed by Python’s super() builtin, chances are you don’t really know what it is capable of doing or how to use it effectively.

Much has been written about super() and much of that writing has been a failure. This article seeks to improve on the situation by:

  • providing practical use cases
  • giving a clear mental model of how it works
  • showing the tradecraft for getting it to work every time
  • concrete advice for building classes that use super()
  • favoring real examples over abstract ABCD diamond diagrams.

The examples for this post are available in both Python 2 syntax and Python 3 syntax.

Using Python 3 syntax, let’s start with a basic use case, a subclass for extending a method from one of the builtin classes:

class LoggingDict(dict):
    def __setitem__(self, key, value):
        logging.info('Setting %r to %r' % (key, value))
        super().__setitem__(key, value)

This class has all the same capabilities as its parent, dict, but it extends the __setitem__ method to make log entries whenever a key is updated. After making a log entry, the method uses super() to delegate the work for actually updating the dictionary with the key/value pair.

Before super() was introduced, we would have hardwired the call with dict.__setitem__(self, key, value). However, super() is better because it is a computed indirect reference.

One benefit of indirection is that we don’t have to specify the delegate class by name. If you edit the source code to switch the base class to some other mapping, the super() reference will automatically follow. You have a single source of truth:

class LoggingDict(SomeOtherMapping):            # new base class
    def __setitem__(self, key, value):
        logging.info('Setting %r to %r' % (key, value))
        super().__setitem__(key, value)         # no change needed

In addition to isolating changes, there is another major benefit to computed indirection, one that may not be familiar to people coming from static languages. Since the indirection is computed at runtime, we have the freedom to influence the calculation so that the indirection will point to some other class.

The calculation depends on both the class where super is called and on the instance’s tree of ancestors. The first component, the class where super is called, is determined by the source code for that class. In our example, super() is called in the LoggingDict.__setitem__ method. That component is fixed. The second and more interesting component is variable (we can create new subclasses with a rich tree of ancestors).

Let’s use this to our advantage to construct a logging ordered dictionary without modifying our existing classes:

class LoggingOD(LoggingDict, collections.OrderedDict):
    pass

The ancestor tree for our new class is: LoggingOD, LoggingDict, OrderedDict, dict, object. For our purposes, the important result is that OrderedDict was inserted after LoggingDict and before dict! This means that the super() call in LoggingDict.__setitem__ now dispatches the key/value update to OrderedDict instead of dict.

Think about that for a moment. We did not alter the source code for LoggingDict. Instead we built a subclass whose only logic is to compose two existing classes and control their search order.
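The redirection can be observed directly. The following is a minimal sketch, not part of the original article: `RecordingOD` and the `calls` list are hypothetical names introduced only to record which `__setitem__` implementations run, and the logging call is replaced by that bookkeeping.

```python
import collections

calls = []  # records which class's __setitem__ ran, in order

class LoggingDict(dict):
    def __setitem__(self, key, value):
        calls.append('LoggingDict')
        super().__setitem__(key, value)

class RecordingOD(collections.OrderedDict):
    # Hypothetical helper: an OrderedDict that records when it is reached.
    def __setitem__(self, key, value):
        calls.append('RecordingOD')
        super().__setitem__(key, value)

class LoggingOD(LoggingDict, RecordingOD):
    pass

d = LoggingOD()
d['a'] = 1
print(calls)        # ['LoggingDict', 'RecordingOD']

plain_calls_before = len(calls)
p = LoggingDict()
p['b'] = 2          # same method, but super() now goes straight to dict
print(calls[plain_calls_before:])   # ['LoggingDict']
```

The same `super()` line in LoggingDict reaches a different class depending on the MRO of the instance it is running on.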


Search Order

What I’ve been calling the search order or ancestor tree is officially known as the Method Resolution Order or MRO. It’s easy to view the MRO by printing the __mro__ attribute:

>>> pprint(LoggingOD.__mro__)
(<class '__main__.LoggingOD'>,
 <class '__main__.LoggingDict'>,
 <class 'collections.OrderedDict'>,
 <class 'dict'>,
 <class 'object'>)

If our goal is to create a subclass with an MRO to our liking, we need to know how it is calculated. The basics are simple. The sequence includes the class, its base classes, and the base classes of those bases and so on until reaching object which is the root class of all classes. The sequence is ordered so that a class always appears before its parents, and if there are multiple parents, they keep the same order as the tuple of base classes.

The MRO shown above is the one order that follows from those constraints:

  • LoggingOD precedes its parents, LoggingDict and OrderedDict
  • LoggingDict precedes OrderedDict because LoggingOD.__bases__ is (LoggingDict, OrderedDict)
  • LoggingDict precedes its parent which is dict
  • OrderedDict precedes its parent which is dict
  • dict precedes its parent which is object

The process of solving those constraints is known as linearization. There are a number of good papers on the subject, but to create subclasses with an MRO to our liking, we only need to know the two constraints: children precede their parents and the order of appearance in __bases__ is respected.
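When the two constraints contradict each other, Python refuses to create the class. A small sketch with hypothetical classes A, B, C and D (not from the article) shows both the failing and the working order:

```python
class A:
    pass

class B(A):
    pass

error = None
try:
    # The bases tuple asks for A before B, yet B is a child of A
    # and children must precede their parents. No order satisfies both.
    class C(A, B):
        pass
except TypeError as exc:
    error = str(exc)
    print(error)

# Listing the child first satisfies both constraints.
class D(B, A):
    pass

mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)    # ['D', 'B', 'A', 'object']
```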


Practical Advice

super() is in the business of delegating method calls to some class in the instance’s ancestor tree. For reorderable method calls to work, the classes need to be designed cooperatively. This presents three easily solved practical issues:

  • the method being called by super() needs to exist
  • the caller and callee need to have a matching argument signature
  • and every occurrence of the method needs to use super()

First

Let’s first look at strategies for getting the caller’s arguments to match the signature of the called method. This is a little more challenging than traditional method calls where the callee is known in advance. With super(), the callee is not known at the time a class is written (because a subclass written later may introduce new classes into the MRO).

One approach is to stick with a fixed signature using positional arguments. This works well with methods like __setitem__ which have a fixed signature of two arguments, a key and a value. This technique is shown in the LoggingDict example where __setitem__ has the same signature in both LoggingDict and dict.

A more flexible approach is to have every method in the ancestor tree cooperatively designed to accept keyword arguments and a keyword-arguments dictionary, to remove any arguments that it needs, and to forward the remaining arguments using **kwds, eventually leaving the dictionary empty for the final call in the chain.

Each level strips off the keyword arguments that it needs so that the final empty dict can be sent to a method that expects no arguments at all (for example, object.__init__ expects zero arguments):

class Shape:
    def __init__(self, shapename, **kwds):
        self.shapename = shapename
        super().__init__(**kwds)        

class ColoredShape(Shape):
    def __init__(self, color, **kwds):
        self.color = color
        super().__init__(**kwds)

cs = ColoredShape(color='red', shapename='circle')
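To see why the dictionary has to be empty by the end of the chain, here is a short sketch that repeats the two classes above and then passes a keyword that no level consumes. The `sides` argument is a hypothetical extra used only for illustration.

```python
class Shape:
    def __init__(self, shapename, **kwds):
        self.shapename = shapename
        super().__init__(**kwds)

class ColoredShape(Shape):
    def __init__(self, color, **kwds):
        self.color = color
        super().__init__(**kwds)

# Every keyword is consumed by some level, so object.__init__ gets none.
cs = ColoredShape(color='red', shapename='circle')
print(cs.color, cs.shapename)   # red circle

leftover_error = None
try:
    # 'sides' is not stripped by any class, so it reaches object.__init__.
    ColoredShape(color='red', shapename='circle', sides=4)
except TypeError as exc:
    leftover_error = exc
    print('TypeError:', exc)
```

An unconsumed keyword surfaces as a TypeError at the end of the chain, which is usually a helpful signal that a class is missing from the hierarchy or an argument name is misspelled.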

Second

Having looked at strategies for getting the caller/callee argument patterns to match, let’s now look at how to make sure the target method exists.

The above example shows the simplest case. We know that object has an __init__ method and that object is always the last class in the MRO chain, so any sequence of calls to super().__init__ is guaranteed to end with a call to the object.__init__ method. In other words, the target of the super() call is guaranteed to exist and won’t fail with an AttributeError.

For cases where object doesn’t have the method of interest (a draw() method for example), we need to write a root class that is guaranteed to be called before object. The responsibility of the root class is simply to eat the method call without making a forwarding call using super().

Root.draw can also employ defensive programming using an assertion to ensure it isn’t masking some other draw() method later in the chain. This could happen if a subclass erroneously incorporates a class that has a draw() method but doesn’t inherit from Root:

class Root:
    def draw(self):
        # the delegation chain stops here
        assert not hasattr(super(), 'draw')

class Shape(Root):
    def __init__(self, shapename, **kwds):
        self.shapename = shapename
        super().__init__(**kwds)
    def draw(self):
        print('Drawing.  Setting shape to:', self.shapename)
        super().draw()

class ColoredShape(Shape):
    def __init__(self, color, **kwds):
        self.color = color
        super().__init__(**kwds)
    def draw(self):
        print('Drawing.  Setting color to:', self.color)
        super().draw()

cs = ColoredShape(color='blue', shapename='square')
cs.draw()

If subclasses want to inject other classes into the MRO, those other classes also need to inherit from Root so that no path for calling draw() can reach object without having been stopped by Root.draw. This should be clearly documented so that someone writing new cooperating classes will know to subclass from Root. This restriction is not much different than Python’s own requirement that all new exceptions must inherit from BaseException.
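The assertion in Root.draw can be seen catching exactly this mistake. The sketch below uses a trimmed-down Shape (no __init__) and two hypothetical classes, `Sprite` and `BadShape`, that are not part of the article. It assumes Python is not run with -O, since that flag disables assert statements.

```python
class Root:
    def draw(self):
        # the delegation chain stops here
        assert not hasattr(super(), 'draw')

class Shape(Root):
    def draw(self):
        print('Drawing shape')
        super().draw()

class Sprite:
    # Hypothetical class: has draw() but does not inherit from Root.
    def draw(self):
        print('Drawing sprite')

class BadShape(Shape, Sprite):
    pass

# Sprite lands after Root in the MRO, so Root.draw would silently mask it.
print([cls.__name__ for cls in BadShape.__mro__])
# ['BadShape', 'Shape', 'Root', 'Sprite', 'object']

caught = False
try:
    BadShape().draw()
except AssertionError:
    caught = True
    print('Root.draw detected a masked draw() method')
```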

Third

The techniques shown above assure that super() calls a method that is known to exist and that the signature will be correct; however, we’re still relying on super() being called at each step so that the chain of delegation continues unbroken. This is easy to achieve if we’re designing the classes cooperatively – just add a super() call to every method in the chain.

The three techniques listed above provide the means to design cooperative classes that can be composed or reordered by subclasses.
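A quick sketch of what goes wrong when the third rule is broken: the classes `Outline`, `Fill`, `Shadow` and `Widget` below are hypothetical, and `Fill` deliberately omits its super() call.

```python
drawn = []  # records which draw() methods actually ran

class Root:
    def draw(self):
        assert not hasattr(super(), 'draw')

class Outline(Root):
    def draw(self):
        drawn.append('Outline')
        super().draw()

class Fill(Root):
    def draw(self):
        drawn.append('Fill')
        # Missing super().draw(): the chain is cut here.

class Shadow(Root):
    def draw(self):
        drawn.append('Shadow')
        super().draw()

class Widget(Outline, Fill, Shadow):
    pass

Widget().draw()
print(drawn)    # ['Outline', 'Fill']
```

Shadow.draw never runs even though Shadow itself is written correctly, because the class ahead of it in the MRO did not forward the call.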


How to Incorporate a Non-cooperative Class

Occasionally, a subclass may want to use cooperative multiple inheritance techniques with a third-party class that wasn’t designed for it (perhaps its method of interest doesn’t use super() or perhaps the class doesn’t inherit from the root class). This situation is easily remedied by creating an adapter class that plays by the rules.

For example, the following Moveable class does not make super() calls, and it has an __init__() signature that is incompatible with object.__init__, and it does not inherit from Root:

class Moveable:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def draw(self):
        print('Drawing at position:', self.x, self.y)

If we want to use this class with our cooperatively designed ColoredShape hierarchy, we need to make an adapter with the requisite super() calls:

class MoveableAdapter(Root):
    def __init__(self, x, y, **kwds):
        self.movable = Moveable(x, y)
        super().__init__(**kwds)
    def draw(self):
        self.movable.draw()
        super().draw()

class MovableColoredShape(ColoredShape, MoveableAdapter):
    pass

MovableColoredShape(color='red', shapename='triangle',
                    x=10, y=20).draw()

Complete Example – Just for Fun

In Python 2.7 and 3.2, the collections module has both a Counter class and an OrderedDict class. Those classes are easily composed to make an OrderedCounter:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
     'Counter that remembers the order elements are first seen'
     def __repr__(self):
         return '%s(%r)' % (self.__class__.__name__,
                            OrderedDict(self))
     def __reduce__(self):
         return self.__class__, (OrderedDict(self),)

oc = OrderedCounter('abracadabra')

Notes and References

  • When subclassing a builtin such as dict(), it is often necessary to override or extend multiple methods at a time. In the above examples, the __setitem__ extension isn’t used by other methods such as dict.update, so it may be necessary to extend those also. This requirement isn’t unique to super(); rather, it arises whenever builtins are subclassed.
  • If a class relies on one parent class preceding another (for example, LoggingOD depends on LoggingDict coming before OrderedDict which comes before dict), it is easy to add assertions to validate and document the intended method resolution order:
position = LoggingOD.__mro__.index
assert position(LoggingDict) < position(OrderedDict)
assert position(OrderedDict) < position(dict)
  • Good write-ups for linearization algorithms can be found at Python MRO documentation and at Wikipedia entry for C3 Linearization.
  • The Dylan programming language has a next-method construct that works like Python’s super(). See Dylan’s class docs for a brief write-up of how it behaves.
  • The Python 3 version of super() is used in this post. The full working source code can be found at: Recipe 577720. The Python 2 syntax differs in that the type and object arguments to super() are explicit rather than implicit. Also, the Python 2 version of super() only works with new-style classes (those that explicitly inherit from object or other builtin type). The full working source code using Python 2 syntax is at Recipe 577721.
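The first note above, about dict.update bypassing an overridden __setitem__, can be checked with a small sketch. `CountingDict` and its `set_calls` attribute are hypothetical names, and the result shown is the behavior of CPython's builtin dict.

```python
class CountingDict(dict):
    # Hypothetical subclass that counts calls to its own __setitem__.
    def __init__(self, *args, **kwds):
        self.set_calls = 0
        super().__init__(*args, **kwds)

    def __setitem__(self, key, value):
        self.set_calls += 1
        super().__setitem__(key, value)

d = CountingDict()
d['a'] = 1          # goes through the override
d.update(b=2)       # CPython's dict.update does not call the override
print(d)            # {'a': 1, 'b': 2}
print(d.set_calls)  # 1
```

To log or count every write, update() (and possibly setdefault() and the constructor) would need the same treatment as __setitem__.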

Acknowledgements

Several Pythonistas did a pre-publication review of this article. Their comments helped improve it quite a bit.

They are: Laura Creighton, Alex Gaynor, Philip Jenvey, Brian Curtin, David Beazley, Chris Angelico, Jim Baker, Ethan Furman, and Michael Foord. Thanks one and all.

Translation:

 

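If you have never been amazed by Python's super(), then either you don't know what it can do or you don't know how to use it effectively.

Many articles have been written about super(). This one differs from the others in that it:

  • provides practical use cases
  • explains its working model
  • shows techniques for making it work in every situation
  • gives concrete advice for building classes that use super()
  • favors real examples over abstract ABCD diamond diagrams

Here, using Python 3 syntax, is a subclass that extends a method of the builtin type dict: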

import pprint
import logging
import collections

class LoggingDict(dict):

    def __setitem__(self, key, value):
        logging.info('Setting %r to %r' % (key, value))
        super().__setitem__(key, value)

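LoggingDict inherits every capability of its parent, dict, and extends the __setitem__ method to log each key that is set. After making the log entry, the method uses super() to delegate the actual update to its parent class.

We could get the same effect with dict.__setitem__(self, key, value), but super() is better because it is a computed indirect reference.

One benefit of indirection is that we don't have to name the delegate class. If you switch the base class to some other mapping type, the super() reference follows automatically. You maintain a single copy of the code: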

class LoggingDict(SomeOtherMapping):                     # new base class

    def __setitem__(self, key, value):
        logging.info('Setting %r to %r' % (key, value))
        super().__setitem__(key, value)                  # no change needed

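Besides isolating changes, a computed indirect reference has another benefit that relies on Python's dynamism: the class it points to can be changed at runtime.

The calculation depends on the class where super() is called and on the instance's inheritance tree. Where super() is called is fixed by the class's source code; in the example above it is called in the LoggingDict.__setitem__ method. The instance's inheritance tree is covered in detail below.

First, let's build a logging ordered dictionary: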

class LoggingOD(LoggingDict, collections.OrderedDict):
    pass

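The inheritance tree of the new class is: LoggingOD, LoggingDict, OrderedDict, dict, object. Surprisingly, OrderedDict sits after LoggingDict and before dict, which means the super() call in LoggingDict.__setitem__ delegates the key/value update to OrderedDict rather than dict.

In this example we did not modify the source of LoggingDict. We only created a subclass whose sole logic is to compose two existing classes and control their search order.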

Search Order

The search order, or inheritance tree, mentioned above is officially called the Method Resolution Order, or MRO. The __mro__ attribute makes it easy to print a class's MRO:

pprint.pprint(LoggingOD.__mro__)
(<class '__main__.LoggingOD'>,
 <class '__main__.LoggingDict'>,
 <class 'collections.OrderedDict'>,
 <class 'dict'>,
 <class 'object'>)

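If we want to create a subclass whose MRO is to our liking, we need to know how it is calculated. The basics are simple. The sequence includes the class, its base classes, the base classes of those bases, and so on until reaching object, which is the root class of all classes. In this sequence a class always appears before its parents, and if a class has several parents, they keep the order of the tuple of base classes in the class definition.

The MRO in the example above follows from these constraints:

  • LoggingOD precedes its parents, LoggingDict and OrderedDict
  • LoggingDict precedes OrderedDict because LoggingOD.__bases__ is (LoggingDict, OrderedDict)
  • LoggingDict precedes its parent, dict
  • OrderedDict precedes its parent, dict
  • dict precedes its parent, object

The process of solving these constraints is called linearization. To create subclasses with the MRO we want, we only need to know two constraints: children precede their parents, and the order in __bases__ is respected.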

Practical Advice

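super() delegates method calls to some class in the instance's inheritance tree. For super() to work properly, the classes need to be designed cooperatively. That raises three easily solved practical issues:

  • the method being called by super() needs to exist
  • the caller and callee need to have matching argument signatures
  • every occurrence of the method needs to use super()

1) Let's first look at strategies for matching the caller's arguments to the callee's signature. With an ordinary method call the callee is known in advance. With super(), the callee is not known until runtime (because a subclass defined later may introduce new classes into the MRO).

One approach is to fix the signature using positional arguments. This works for methods like __setitem__, which has a fixed signature of two arguments. In the LoggingDict example, __setitem__ has the same signature as in dict.

A more flexible approach is to agree that every method in the inheritance tree accepts keyword arguments plus a keyword-arguments dictionary, keeps the arguments it needs, and forwards the rest to the next method in the MRO using **kwds, so that the dictionary is empty by the last call in the chain (that is, the arguments are stripped layer by layer along the inheritance tree: each layer keeps what it needs and passes the remainder on).

Each level strips off the arguments it needs, which guarantees that an empty dictionary finally reaches a method that takes no arguments (for example, object.__init__ takes none):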

class Shape:

    def __init__(self, shapename, **kwds):
        self.shapename = shapename
        super().__init__(**kwds)


class ColoredShape(Shape):

    def __init__(self, color, **kwargs):
        self.color = color
        super().__init__(**kwargs)

cs = ColoredShape(color='red', shapename='circle')

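2) Now let's look at how to make sure the target method exists.

The example above shows only the simplest case. We know that object has an __init__ method and that object is always the last class in the MRO chain, so any sequence of super().__init__ calls ends with a call to object.__init__. In other words, a super().__init__ call anywhere in the inheritance tree is guaranteed not to fail with an AttributeError.

For methods that object does not have (a draw() method, for example), we need to write a root class that is guaranteed to be called before object. The root class's only job is to absorb the method call without forwarding it any further with super().

Root.draw can also use defensive programming, with an assertion, to ensure it isn't masking some other draw() method later in the chain: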

class Root:

    def draw(self):
        # the delegation chain stops here
        assert not hasattr(super(), 'draw')


class Shape(Root):

    def __init__(self, shapename, **kwds):
        self.shapename = shapename
        super().__init__(**kwds)

    def draw(self):
        print('Drawing. Setting shape to:', self.shapename)
        super().draw()


class ColoredShape(Shape):

    def __init__(self, color, **kwds):
        self.color = color
        super().__init__(**kwds)

    def draw(self):
        print('Drawing. Setting color to:', self.color)
        super().draw()

cs = ColoredShape(color='blue', shapename='square')
cs.draw()
Drawing. Setting color to: blue
Drawing. Setting shape to: square

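If subclasses want to inject other classes into the MRO, those classes also need to inherit from Root, so that no path for calling draw() can reach object (and raise an AttributeError) without being stopped by Root.draw. This rule should be clearly documented so that anyone writing new cooperating classes knows to inherit from Root. The restriction is no different from Python's own requirement that all exceptions inherit from BaseException.

3) The two points above guarantee that the method exists and that the signature is correct, but we still have to make sure super() is called at every step of the delegation chain. This is easy to achieve when the classes are designed cooperatively: just add a super() call at each step of the chain.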

How to Incorporate a Non-cooperative Class

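Occasionally a subclass may want to use cooperative multiple inheritance with a third-party class that wasn't designed for it (perhaps the method of interest doesn't use super(), or the class doesn't inherit from the root class). This is easily solved with an adapter class.

For example, the following Moveable class does not call super(), its __init__() signature is incompatible with object.__init__, and it does not inherit from Root: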

class Moveable:

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def draw(self):
        print('Drawing at position:', self.x, self.y)

To use this class with the cooperatively designed ColoredShape hierarchy, we need to create an adapter that makes the required super() calls:

class MoveableAdapter(Root):

    def __init__(self, x, y, **kwds):
        self.moveable = Moveable(x, y)
        super().__init__(**kwds)

    def draw(self):
        self.moveable.draw()
        super().draw()


class MoveableColoredShape(ColoredShape, MoveableAdapter):
    pass


MoveableColoredShape(color='red', shapename='triangle', x=10, y=20).draw()
Drawing. Setting color to: red
Drawing. Setting shape to: triangle
Drawing at position: 10 20

Complete Example - Just for Fun

In Python 2.7 and 3.2, the collections module has both a Counter class and an OrderedDict class. The two are easily composed into an OrderedCounter class:

from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):

    'Counter that remembers the order elements are first seen'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__,
                           OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

oc = OrderedCounter('abracadabra')
pprint.pprint(oc)
OrderedCounter(OrderedDict([('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]))

Notes and References

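  • When subclassing a builtin such as dict(), it is often necessary to override or extend several methods at once. In the examples above, the __setitem__ extension isn't used by other methods such as dict.update, so those may need extending as well. This requirement isn't unique to super(); it arises whenever builtins are subclassed.

  • If a class relies on its parents appearing in a particular order (for example, LoggingOD needs LoggingDict before OrderedDict, which in turn comes before dict), assertions can validate and document the intended method resolution order: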

position = LoggingOD.__mro__.index
assert position(LoggingDict) < position(collections.OrderedDict)
assert position(collections.OrderedDict) < position(dict)

References:

https://github.com/SimonXming/my-blog/issues/9

https://harveyqing.gitbooks.io/python-read-and-write/content/python_advance/python_super_considered_super.html

https://fuhm.net/super-harmful/
