You are not alone if you often get frustrated by “duck typing” in Python. By duck typing I mean that if something walks like a duck and quacks like a duck, then it is treated as a duck. Python does not check types before a program runs, and this laxity is a potential source of a large number of bugs that crop up in production. Teams routinely rely on thorough testing to catch them, but it is rarely enough. Without strict type checking at compile time, bugs are more often than not caught only at run time, which is undesirable. There are several ways to address this problem, and I am going to discuss a robust but lesser-known one: descriptors.
Let’s start with a very simple class named Point that abstracts a 3D coordinate.
class Point:
    """
    A point in 3D space
    """
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point({self.x}, {self.y}, {self.z})"
...
This class has no type checking as such. One can instantiate it with any valid Python object, which leads to problems that are not obvious at first.
Point("x", 1.0, 3)
Without strict type checking, one can instantiate the class as shown above, which is probably not what the end user wants. How can we address this problem without making substantial modifications? A simple, albeit not so elegant, approach is to use assertions.
assert isinstance(self.x, float) and isinstance(self.y, float) and isinstance(self.z, float)
This assertion forces the user to instantiate the class with float arguments only. However, as you might have guessed, this is not a very flexible approach, and we might have to write several assertions to get what we want. Assertions are also skipped entirely when Python runs with the -O flag. One can even envision writing a try/except block in our __init__ function to convert the user-supplied arguments. This again has several pitfalls, and the code becomes unnecessarily long and complex.
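To make the two stopgaps concrete, here is a minimal sketch of both. The assertion is applied to the constructor arguments rather than to self.x, and the make_point helper is a hypothetical name introduced only for this illustration.

```python
class Point:
    """A point in 3D space, validated with an assertion (sketch)."""

    def __init__(self, x, y, z):
        # Assumption: check the arguments before they are stored.
        assert all(isinstance(v, float) for v in (x, y, z)), "Expected floats"
        self.x = x
        self.y = y
        self.z = z


def make_point(x, y, z):
    """Hypothetical helper: convert the arguments inside try/except."""
    try:
        return Point(float(x), float(y), float(z))
    except (TypeError, ValueError) as exc:
        # float("x") raises ValueError, float(None) raises TypeError
        raise TypeError(f"Cannot convert arguments to float: {exc}") from exc


p = make_point(1, "2.5", 3)   # converted to 1.0, 2.5, 3.0
p.x = "oops"                  # accepted: nothing guards later assignments
```

The last line shows one of the pitfalls: both checks run only at construction time, so a later assignment can still store a value of the wrong type.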
A better way to accomplish type checking is creating managed attributes. We customise access to an attribute by defining it as a property.
class Point:
    """
    A point in 3D space
    """
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    # Getter
    @property
    def x(self):
        return self._x

    # Setter
    @x.setter
    def x(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError("Expected a float or int")
        else:
            self._x = float(value)

    # Getter
    @property
    def y(self):
        return self._y

    # Setter
    @y.setter
    def y(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError("Expected a float or int")
        else:
            self._y = float(value)

    ...

    def __repr__(self):
        return f"Point({self.x}, {self.y}, {self.z})"
The above code defines x, y and z as properties, and each setter type-checks its argument and then converts it to float. Note that the getter decorated with @property has to come first, because the setter decorator (@x.setter) is an attribute of the property that the getter creates. Now, reading x, y or z automatically triggers the getter, and assigning to it triggers the setter, including the assignments inside __init__. There is also an optional deleter method, which I have not defined above; leaving it out prevents users of our code from accidentally deleting crucial attributes. When a user accesses or modifies any of the properties, our code manipulates the self._x, self._y or self._z attribute, which is where the actual data lives.
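To see a managed attribute in action, here is a trimmed-down sketch with a single coordinate. The class name Point1D is hypothetical and exists only to keep the example short; the getter and setter are the same as the ones for x above.

```python
class Point1D:
    """Hypothetical one-coordinate version of Point."""

    def __init__(self, x):
        self.x = x  # goes through the setter below

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError("Expected a float or int")
        self._x = float(value)


p = Point1D(3)
print(p.x)          # 3.0 (the int was converted by the setter)

try:
    p.x = "three"
except TypeError as exc:
    print(exc)      # Expected a float or int

try:
    del p.x         # no deleter was defined, so deletion is refused
except AttributeError:
    print("x cannot be deleted")
```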
Using properties for type checking is a good approach when you have only a few attributes, but it quickly leads to bloated code when you try to define a property for every attribute.
A better and more robust way to do type checking is to use a descriptor class. Here, I will define a descriptor class for a float type-checked attribute.
class Float:
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, float):
            raise TypeError("Expected a float arg")
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]
Let’s dissect this class, piece by piece, so to speak:
Essentially, a descriptor class implements the three core attribute access operations (get, set and delete) as the methods __get__(), __set__() and __delete__() respectively. These are special methods, as indicated by the double underscores in their names. They all receive an instance of a class as input and manipulate the underlying dictionary of that instance appropriately (instance.__dict__ is the dictionary I am referring to).
How can we use it in our code? We need to place the instances of the descriptor class as class variables into the definition of Point class.
class Point:
    x = Float("x")
    y = Float("y")
    z = Float("z")

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point({self.x!r}, {self.y!r}, {self.z!r})"
Now, all access to descriptor attributes (x, y or z) is captured by __get__(), __set__() and __delete__() methods. They all “indirectly” manipulate the Point object dictionary.
>>> p1 = Point(2., 3., 0.)
>>> p1.z # it calls Point.z.__get__(p1, Point)
0.0
>>> p1.y = 5
(This raises TypeError("Expected a float arg"))
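The lookups in this session can also be spelled out by hand, which makes the “indirect” manipulation of the instance dictionary visible. The sketch below repeats the Float descriptor so that it runs on its own, and uses a hypothetical one-attribute class named Sample in place of Point.

```python
class Float:
    """The descriptor from the article, repeated to keep the sketch standalone."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, cls):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, float):
            raise TypeError("Expected a float arg")
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]


class Sample:
    """Hypothetical class with a single descriptor-managed attribute."""
    x = Float("x")

    def __init__(self, x):
        self.x = x


p = Sample(2.0)
descriptor = Sample.__dict__["x"]       # the Float instance stored on the class
print(Sample.x is descriptor)           # True: class access returns the descriptor
print(descriptor.__get__(p, Sample))    # 2.0, same result as p.x
descriptor.__set__(p, 7.5)              # same effect as p.x = 7.5
print(p.__dict__)                       # {'x': 7.5}
del p.x                                 # calls descriptor.__delete__(p)
print(p.__dict__)                       # {}
```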
We can further refactor our code by defining a decorator:
def FloatTypeOnly(*args):
    def decorate(cls):
        for name in args:
            setattr(cls, name, Float(name))
        return cls
    return decorate
This decorator can then be directly used when defining the Point class:
@FloatTypeOnly("x", "y", "z")
class Point:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point({self.x!r}, {self.y!r}, {self.z!r})"
To conclude, you might wonder about the obfuscation introduced by adding descriptors for type checking, and ask yourself whether it is really necessary to use them. For sure, there are alternative ways to do type checking, such as third-party tools like mypy, but mypy checks annotations statically and does not enforce them at run time, and it may not suit every custom data type you intend to use in your code. Furthermore, descriptors let you completely customise attribute access at a very low level, which is sometimes invaluable for debugging and implementing custom behaviors.
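As an illustration of that flexibility, the Float descriptor can be generalised so that the expected type becomes a parameter. The sketch below is one possible variation, not part of the code above: the names Typed, Celsius and Reading are hypothetical, and it relies on the __set_name__ hook (Python 3.6 and later) so that the attribute name does not have to be passed in by hand.

```python
class Typed:
    """Hypothetical generalisation of Float: the expected type is a parameter."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created.
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} expects {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        instance.__dict__[self.name] = value


class Celsius:
    """Hypothetical custom data type."""

    def __init__(self, degrees):
        self.degrees = degrees


class Reading:
    location = Typed(str)
    temperature = Typed(Celsius)

    def __init__(self, location, temperature):
        self.location = location
        self.temperature = temperature


r = Reading("roof", Celsius(21.5))
try:
    r.temperature = 21.5
except TypeError as exc:
    print(exc)  # temperature expects Celsius, got float
```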