How to Understand Python's Descriptors

Reposting an old article:

>>> class MyInt(int):
...     def square(self):
...         return self*self
...
>>> n = MyInt(2)
>>> n.name = 'two'
>>> n.square()
4
>>> n.name
'two'

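Quick quiz: in the last four lines of the code above, how many objects' __dict__ does Python search for 'square' and for 'name' when evaluating n.square and n.name?

One? Two? The answer is 2 and 4. n.square searches MyInt and n; n.name searches MyInt, int, object, and n, in exactly the order listed here.

We know that an instance attribute overrides a class attribute: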

>>> MyInt.name = 'MyInt'
>>> n.name
'two'

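If so, why isn't the lookup order the other way around: search n first, and since n.__dict__['name'] exists, skip the three classes altogether? The reason is an API introduced in Python 2.2: the descriptor.

In Python, not only classes but also instances can have their own methods: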

>>> def hello():
...     print('hello')
...
>>> n.hello = hello
>>> n.hello()
hello

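MyInt.square takes one more parameter (self) than hello, so why can both be called in the form n.foo()? Is it because the former is of a "method" type and the latter of a "function" type? No: we already know that the class keyword does not change the semantics of def: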

>>> type(MyInt.__dict__['square'])
<class 'function'>
>>> type(n.__dict__['hello'])
<class 'function'>

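Python plays a small trick here: when the attribute square is not found in n but is found in n.__class__ (that is, MyInt), and MyInt.square is a function, Python does not return the function directly but creates a wrapper: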

>>> n.square
<bound method MyInt.square of 2>
>>> type(n.square)
<class 'method'>
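The wrapper can also be inspected directly. The following is a minimal check, assuming Python 3, where a bound method exposes the wrapped function as `__func__` and the bound instance as `__self__`. `MyInt` and `hello` are redefined (with `hello` returning a string instead of printing) so that the snippet runs on its own:

```python
class MyInt(int):
    def square(self):
        return self * self

def hello():
    return 'hello'

n = MyInt(2)
n.hello = hello

bound = n.square
# The bound method remembers both the plain function and the instance.
assert bound.__func__ is MyInt.__dict__['square']
assert bound.__self__ is n

# Each attribute access builds a new wrapper; the wrappers compare equal.
first, second = n.square, n.square
assert first is not second
assert first == second

# A function stored on the instance is returned as-is, with no wrapping.
assert n.hello is hello
print(bound(), n.hello())
```

This also shows why `n.hello()` works without a self parameter: nothing is bound, so nothing is passed.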

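The wrapper holds a reference to n; in other words, the self parameter of square is bound to n. Before new-style classes existed (for example in Python 1.5.2), the lookup went roughly like this (the real code is written in C):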

import types

def instance_getattr(obj, name):
    'Look for attribute /name/ in object /obj/.'
    v = obj.__dict__.get(name)
    if v is not None:
        # found in object
        return v
    v, cls = class_lookup(obj.__class__, name)
    # found v in class cls
    if isinstance(v, types.FunctionType): # Note this line
        # function type. build method wrapper
        # (BoundMethod stands for the interpreter's method wrapper type)
        return BoundMethod(v, obj, cls)
    if v is not None:
        # data attribute
        return v
    raise AttributeError(obj.__class__, name)

def class_lookup(cls, name):
    'Look for attribute /name/ in class /cls/ and bases.'
    v = cls.__dict__.get(name)
    if v is not None:
        # found in this class
        return v, cls
    # search in base classes
    for i in cls.__bases__:
        v, c = class_lookup(i, name)
        if v is not None:
            return v, c
    # not found
    return None, None

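This mechanism is simple and effective. But when the Python developers set out to tidy up the type system with new-style classes, the following lines looked a little jarring: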

    if isinstance(v, types.FunctionType):
        # function type. build method wrapper
        return BoundMethod(v, obj, cls)

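Python style tends to discourage isinstance, because it goes against the spirit of duck typing: don't ask what I am, ask what I can do. That function attributes get a wrapper and data attributes do not is part of Python's basic design; it neither needs to change nor can change. But we can generalize the rule:

Create a wrapper for "function-like" attributes, and none for "data-like" attributes. Any software problem can be solved by adding another layer of indirection.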

    if v.like_a_function():
        # function-like type. build method wrapper
        return BoundMethod(v, obj, cls)

In for a penny, in for a pound: why not let the object itself decide how to create the wrapper?

    ...
    if hasattr(v, '__get__'):
        # anything with a '__get__' attribute is
        # a function-like descriptor
        return v.__get__(obj, obj.__class__)
    ...

class FunctionType(object):
    ...
    def __get__(self, obj, cls):
        # the function binds itself (self), not some outer variable
        return BoundMethod(self, obj, cls)
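Functions in Python 3 do carry a `__get__` method, so the binding step sketched above can be reproduced by hand. The example below is only an illustration; the `Greeter` class and the `shout` function are hypothetical names that do not appear elsewhere in this article:

```python
class Greeter:
    def greet(self):
        return 'hello from %s' % type(self).__name__

g = Greeter()
func = Greeter.__dict__['greet']      # the plain function stored in the class

# Calling __get__ by hand yields the same kind of bound method as g.greet.
bound = func.__get__(g, Greeter)
assert bound.__self__ is g
assert bound() == g.greet()

# Any function can be bound this way, even one defined outside a class.
def shout(self):
    return self.greet().upper()

assert shout.__get__(g, Greeter)() == 'HELLO FROM GREETER'
```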

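Good, we now have the prototype of the descriptor. Any object can now imitate the behavior of a function, even when it is used as a method. But Pandora's box has been opened, and the developers will not stop here. The pursuit of flexibility never ends...

For example, staticmethod changes the way a function is bound to "not bound at all":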

class StaticMethod(object):
    def __init__(self, f):
        self.f = f
    def __get__(self, obj, cls):
        return self.f

class C(object):
    @StaticMethod
    def f(): # no self param
        pass
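To see the "not bound" behavior in action, here is a small runnable variant of the class above. The method name `add` and its body are made up for illustration:

```python
class StaticMethod(object):
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, cls=None):
        # Ignore both the instance and the class: hand back the raw function.
        return self.f

class C(object):
    @StaticMethod
    def add(a, b):      # no self parameter
        return a + b

assert C.add(1, 2) == 3        # accessed through the class
assert C().add(3, 4) == 7      # accessed through an instance
# Both routes return the very same function object, with nothing bound to it.
assert C().add is C.__dict__['add'].f
```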

Or, log every method call (strictly speaking, every lookup of the method, which happens just before the call):

>>> import types
>>> class Log(object):
...     def __init__(self, f):
...         self.f = f
...     def __get__(self, obj, cls):
...         print(self.f.__name__, 'called')
...         return types.MethodType(self.f, obj)
...
>>> class C(object):
...     @Log
...     def f(self):
...         pass
...
>>> c = C()
>>> c.f()
f called

Descriptors are not limited to functions either. The first use that comes to mind is implementing properties. But __get__ alone can only give a read-only property, so let's add __set__:

>>> class Property(object):
...     def __init__(self, fget, fset):
...         self.fget = fget
...         self.fset = fset
...     def __get__(self, obj, cls):
...         return self.fget(obj)
...     def __set__(self, obj, val):
...         self.fset(obj, val)
...
>>> class C(object):
...     def fget(self):
...         print('fget called')
...     def fset(self, val):
...         print('fset called with', val)
...     f = Property(fget, fset)
...
>>> c = C()
>>> c.f
fget called
>>> c.f = 1
fset called with 1

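But wait: for the code above to work, one more obstacle has to be overcome. Assignment always acts on the instance and never looks in the class at all: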

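>>> c = C()
>>> c.n
Traceback (most recent call last):
  ...
AttributeError: 'C' object has no attribute 'n'
>>> c.n = 1
>>> c.n
1
>>> C.n
Traceback (most recent call last):
  ...
AttributeError: type object 'C' has no attribute 'n'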

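Consequently, the operation c.f = 1 would never find the property f that we defined in the class, and __set__ would have no chance to act. So the only option is to change the semantics of assignment, so that a descriptor defined in the class can intercept assignment to an instance attribute. Python now first searches the class and its base classes for a descriptor named 'f' that defines __set__, and assigns in the instance only when none is found.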
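The changed assignment rule can be sketched in Python as follows. This is only an approximation of what the interpreter does in C. The helper `object_setattr`, the descriptor `Positive`, and the class `Box` are hypothetical names introduced for this illustration, and `class_lookup` is rewritten here to walk `__mro__` so that the snippet is self-contained:

```python
def class_lookup(cls, name):
    'Look for attribute name in cls and its base classes (MRO order).'
    for klass in cls.__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return None

def object_setattr(obj, name, value):
    'Assign obj.name = value, letting class-level descriptors intercept.'
    v = class_lookup(type(obj), name)
    if v is not None and hasattr(type(v), '__set__'):
        # A descriptor with __set__ lives in the class: it handles the assignment.
        type(v).__set__(v, obj, value)
    else:
        # No such descriptor: store the value in the instance itself.
        obj.__dict__[name] = value

class Positive(object):
    'Hypothetical data descriptor that rejects non-positive values.'
    def __get__(self, obj, cls=None):
        return obj.__dict__['_size']

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError('size must be positive')
        obj.__dict__['_size'] = value

class Box(object):
    size = Positive()

b = Box()
object_setattr(b, 'size', 3)     # routed through Positive.__set__
object_setattr(b, 'label', 'x')  # plain instance attribute
assert b.size == 3
assert b.__dict__ == {'_size': 3, 'label': 'x'}
```

Note that 'size' itself never lands in the instance `__dict__`: the descriptor decides where the value is stored.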

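However, the __get__ method we designed for functions is looked up after instance attributes, whereas __set__ has to be looked up before them. It would not look very elegant if two methods of the same descriptor followed different lookup orders. How can the conflicting lookup-order requirements of functions and properties be reconciled?

Python's solution is simple enough: if a descriptor has only a __get__ method (like FunctionType), it is treated as a function-like descriptor and follows the ordinary "instance, class, base classes" lookup order; if it has a __set__ method (like Property), it is a data-like descriptor and follows the special "class, base classes, instance" order. But how can we know a descriptor's kind before we have found it? So in every case the class and its base classes must be searched first; whether a descriptor was found, and of which kind, then decides whether the instance needs to be searched. The lookup algorithm now looks like this: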

def object_getattr(obj, name):
    'Look for attribute /name/ in object /obj/.'
    # First look in class and base classes.
    v, cls = class_lookup(obj.__class__, name)
    if (v is not None) and hasattr(v, '__get__') and hasattr(v, '__set__'):
        # Data descriptor. Overrides instance member.
        return v.__get__(obj, cls)
    w = obj.__dict__.get(name)
    if w is not None:
        # Found in object
        return w
    if v is not None:
        if hasattr(v, '__get__'):
            # Function-like descriptor.
            return v.__get__(obj, cls)
        else:
            # Normal data member in class
            return v
    raise AttributeError(obj.__class__, name)
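The algorithm above can be turned into a runnable approximation and compared against the built-in `getattr`. The sketch below makes a few assumptions of its own: it uses a `_MISSING` sentinel instead of None so that attributes whose value is None are handled, it walks `__mro__` instead of recursing through `__bases__`, and, like the pseudocode, it treats only `__set__` as the mark of a data descriptor (CPython also counts `__delete__`). The class `C` and its attributes are made up for the comparison:

```python
_MISSING = object()   # sentinel, so attributes whose value is None still work

def class_lookup(cls, name):
    for klass in cls.__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return _MISSING

def object_getattr(obj, name):
    v = class_lookup(type(obj), name)
    tv = type(v)
    if v is not _MISSING and hasattr(tv, '__get__') and hasattr(tv, '__set__'):
        return tv.__get__(v, obj, type(obj))      # data descriptor wins
    w = getattr(obj, '__dict__', {}).get(name, _MISSING)
    if w is not _MISSING:
        return w                                  # instance attribute
    if v is not _MISSING:
        if hasattr(tv, '__get__'):
            return tv.__get__(v, obj, type(obj))  # function-like descriptor
        return v                                  # plain class attribute
    raise AttributeError(name)

class C(object):
    plain = 'class value'

    def method(self):
        return 'method'

    @property
    def prop(self):
        return 'property'

c = C()
# Write straight into the instance __dict__ to shadow all three names.
c.__dict__.update(plain='instance value', method='shadowed', prop='ignored')

for name in ('plain', 'method', 'prop'):
    assert object_getattr(c, name) == getattr(c, name)

assert object_getattr(c, 'method') == 'shadowed'   # instance beats function
assert object_getattr(c, 'prop') == 'property'     # data descriptor beats instance
```

The last two assertions show the asymmetry described above: an instance entry can shadow a method, but not a property.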

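Now we can answer the question at the end of the first section. Before meeting descriptors, everyone's mental model of the lookup order is probably "instance, class, base classes", whereas the actual lookup is "class, base classes, instance". Conceptually an instance attribute should take a single lookup, yet in practice it takes the most (all base classes have to be searched); methods take the fewest (2: the class, where a function-like descriptor is found, then the instance, where nothing is found). Another surprising consequence is that the more base classes there are, the slower instance attribute lookup becomes, even though this lookup seems to have nothing to do with base classes. Fortunately Python is dynamically typed, and class hierarchies are usually shallow.

All of this is to support property. Is it worth it? Being able to intercept instance attribute access at the class level opens up many interesting uses, all the more so when combined with metaclasses. For Python, "performance" never seems to be a reason to sacrifice "functionality" (or various other virtues), and this time is no exception.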