Four Over-Engineered Examples of How Type Hints Can Improve Your Code
Type hint analysis, like unit tests and static code analysis, serves to give people an appropriate level of confidence that the code works as expected. Type hints can be a helpful part of a Python program and are one of many tools we use to establish the overall quality of our code.
In this post I want to explore different ways type annotations can help write better software. In order to do this, I’m going to need a problem to solve. You can probably guess from the title what I’m going to tackle: the fizz-buzz problem. Several times. Four times to be precise. With different approaches and therefore distinct kinds of type hints.
Why Fizz-Buzz?
In case you haven’t heard of the fizz-buzz problem, it’s based on a party game. People sit in a circle. They count off numbers, starting from “One.” Then the next person says, “Two.” So far, so good, right? Now comes the first exception to the rule. No one says a multiple of three, they say “Fizz” instead. So the next person says “Fizz,” and the person after them says “Four.” Now the next exception: no one says a multiple of five. Instead, they say “Buzz.” So the next person says “Buzz” instead of “Five.” Then “Fizz.” Then hilarity ensues because no one can remember which number comes after a Buzz and a Fizz. Eventually someone figures out it’s “Seven.” Then “Eight,” then “Fizz,” and “Buzz.”
Some backstory on why we’re going to over-engineer this problem. For a few years, I lived on a sailboat. There’s a lot that can go wrong at sea, and the consequences of a failure can be dire. Therefore, many sailors will agree that anything worth engineering is worth over-engineering.
So, in this article I want to over-engineer the fizz-buzz problem four times over. It’s relatively simple, and therefore, we can look at it from a number of perspectives. But before we launch into over-engineering this problem, let’s talk about tools.
On the boat, we use winch handles to give us leverage over lines with heavy sails or chains with heavy anchors. An anchor might weigh 25 kg, and the chain 2.3 kg per meter. Anchoring in 10 m of depth nearly doubles what has to be lifted, which is not safe to try by hand. In this situation, winches are essential, and long winch handles are very important.
Now back to coding. By design, Python type hints have little influence on the run-time behavior of our code. They’re mostly used by tools like mypy. The mypy tool does static analysis of the code and the annotations to confirm the code fits the hints.
If you haven’t got it already, you’ll want to add it.
python -m pip install mypy
I’m also going to suggest an arrangement for Python project files and folders. Not everyone loves this arrangement, but I think it works out well for most projects.
Project
+-- src
    +-- fizzbuzz.py
+-- tests
    +-- test_fizzbuzz.py
+-- tox.ini
I’m going to focus on using mypy to check the source, so I’m going to quietly ignore the tests. I’ll leave those as an exercise for the reader.
After installing mypy and setting up the two folders, here’s a first round implementation of fizzbuzz.py
print("1, 2, fizz, 4, buzz, fizz, 7, 8, fizz, buzz")
Yes, that script feels like cheating. Instead of stepping through the game algorithm, it has a hard-wired result of ten rounds of play. It does sort of work. It’s a huge pain to test because there’s no function or class a test case can import and exercise. No automated test means it may not be working.
When this is set up, we should be able to enter mypy src to check the type annotations on this little file. There are no explicit type annotations. The one line of code matches the definition of the built-in print() function. So, this line of code looks good to mypy.
This is a good time to write your own solution to fizz-buzz. Call it fizzbuzz1.py to distinguish it from my not very good initial example.
Type Hint Basics: The Low/No Engineered Solution
In order to have something testable, it helps to break the fizz-buzz problem into functions. Here’s a decomposition that seems to solve the problem. Spoiler alert: it has bugs.
def fizz_buzz(n):
    if n % 3 == 0: return "Fizz"
    elif n % 5 == 0: return "Buzz"
    else: return n

if __name__ == "__main__":
    for i in range(10):
        print(fizz_buzz(i))
This seems to work for numbers from 1 to 10. In spite of the logic problem, let’s add type hints. If you haven’t seen them before, they look like this:
def fizz_buzz(n: int) -> str:
    if n % 3 == 0: return "Fizz"
    elif n % 5 == 0: return "Buzz"
    else: return n
There are two changes to the code to annotate the expected types:
- After the parameter, n, there’s a : int annotation; the hint is that n should be an instance of the int type.
- After the function parameter list, there’s a -> str annotation; the hint is that the return value from this function should be an instance of the str type.
While some aspects of type hinting can be more complex, this is the essential model for the ways annotations are used. Provide a hint for each parameter to a function or method, and the result of each function or method. This seems to cover a multitude of cases elegantly.
As noted above, the algorithm has a bug. And the type hints also have a problem.
Let’s tackle the hints first. We can run mypy on the src directory and see the following:
% mypy src
src/fizzbuzz2.py:4: error: Incompatible return value type (got "int", expected "str")
Found 1 error in 1 file (checked 2 source files)
Some folks spotted this conflict right away. The type annotation said the fizz_buzz() function returned a str. But one of the return statements returned an integer value.
This leads to an interesting dilemma. We can ask “which one is right?” When the code and the annotations conflict, we have two paths forward, depending on our intent as designers of this software:
1. The annotation is right: fix the code to match the annotation.
2. The code is right: fix the annotation to match the code.
This dilemma happens a lot. A real lot. More-or-less constantly, in my experience.
Let’s look at option 1: the annotation is right, so fix the code to match the annotation. The annotation was right all along, but the code didn’t quite implement it correctly. Here’s what the code should be:
def fizz_buzz(n: int) -> str:
    if n % 3 == 0:
        return "Fizz"
    elif n % 5 == 0:
        return "Buzz"
    else:
        return f"{n}"
We’ve fixed the return statements to create strings, consistent with the annotation. I’m partial to this path but it’s not always the right thing to do.
Let’s look at option 2: the code is right, so fix the annotation to match the code. The annotation didn’t properly reflect the code. Here’s what the annotation should be:
from typing import Union

def fizz_buzz(n: int) -> Union[str, int]:
    if n % 3 == 0:
        return "Fizz"
    elif n % 5 == 0:
        return "Buzz"
    else:
        return n
This introduces a new type constructor, the Union. This builds a composite type where the objects can be either a string or an integer. This describes the results of this function’s implementation.
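To see what a Union return type means for the code that calls the function, consider the following sketch. It is illustrative only; the helper name shout() is hypothetical and not part of the original example.

```python
from typing import Union


def fizz_buzz(n: int) -> Union[str, int]:
    """The 'code is right' variant: plain numbers pass through unchanged."""
    if n % 3 == 0:
        return "Fizz"
    elif n % 5 == 0:
        return "Buzz"
    else:
        return n


def shout(n: int) -> str:
    """Hypothetical caller that wants an upper-case string."""
    result = fizz_buzz(n)
    # result may be an int, so mypy rejects result.upper() until
    # the type is narrowed with an explicit isinstance() check.
    if isinstance(result, str):
        return result.upper()
    return str(result)
```

Every caller that needs a string has to repeat some version of this check, which is the kind of complexity a narrower return type avoids.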
Python relies on Duck Typing: “If it looks like a duck, swims like a duck and quacks like a duck then it probably is a duck.” This means that most Python code is generic with respect to type, and many functions can be described as working with unions of a large number of types.
As a practical matter, our application code tends to be biased toward one or a few types. To clarify our intent, we often want to narrow the domain of possible types, and focus on one that really matters. To an extent, we use type annotations to intentionally set aside the way Python is capable of handling a large number of types so we can focus on just a few types that are relevant to our application.
Sailors make a lot of these nuanced distinctions when we lay hands on the various lines around the boat. We distinguish between sheets, halyards, reefing lines, dock lines, and ground tackle. Yes, they’re all more-or-less cordage of various sizes and colors. In the case of ground tackle, the anchor line may also have a patina of mud. Each line has a specific and often fixed application. When turning the boat, for example, we may need to ease the sheets. Easing the halyards while turning will create havoc. And easing the docklines is pointless.
In the same way, we often want to carefully distinguish the data types permitted by a function in our applications. For this reason, I suggest avoiding the complexities of Union definitions and fixing the function to work with a narrower definition, def fizz_buzz(n: int) -> str:.
This doesn’t uncover the algorithmic problem. It can’t, really, as we’ve failed to account for the number 15. It’s both “Fizz” and “Buzz.” I think this distinction between bugs that can only be found with unit tests, and potential bugs found by mypy, is essential. To solve this problem we need a number of tools, including mypy.
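The difference can be made concrete with a few run-time checks. The sketch below uses the string-returning version of fizz_buzz(); it type-checks cleanly, yet the check for 15 fails. The helper name check_fizz_buzz() is hypothetical, and the expected value "FizzBuzz" for 15 is an assumption about how the combined result should be written.

```python
from typing import List


def fizz_buzz(n: int) -> str:
    """Type-correct, but still wrong for multiples of both 3 and 5."""
    if n % 3 == 0:
        return "Fizz"
    elif n % 5 == 0:
        return "Buzz"
    else:
        return f"{n}"


def check_fizz_buzz() -> List[int]:
    """Hypothetical mini test: return the inputs whose results look wrong."""
    # Assumed expectations; "FizzBuzz" for 15 is an illustrative choice.
    expected = {1: "1", 3: "Fizz", 5: "Buzz", 15: "FizzBuzz"}
    return [n for n, want in expected.items() if fizz_buzz(n) != want]


# The annotations are consistent, but the run-time check exposes the logic bug.
failures = check_fizz_buzz()
```

Here failures contains 15: the annotations and the code agree with each other, and the algorithm is still wrong.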
Python’s Built-in Data Structures: An Over-Engineered Solution
Let’s do some over-engineering, shall we?
Instead of simply printing the number, Fizz, or Buzz, we want to accumulate a mapping between the number and a set of strings. We’re looking to create something like the following:
{1: set(), 2: set(), 3: {"Fizz"}, 4: set(), 5: {"Buzz"}, ...}
Ideally, we’ll also fix the bug in the algorithm and have 15: {"Fizz", "Buzz"} in the result.
This requires some additional type constructors. The typing module includes List, Set, and Dict definitions we can use. Overall, this is a dictionary that maps integers to sets.
We can start with Dict[int, Set] to describe this. Pragmatically, the value type is really Set[str], since the set will only contain strings (or be empty). This leads us to a function signature that looks like this:
def fizz_buzz(n: int) -> Set[str]: ...
We can then use this function to build the mapping, Dict[int, Set[str]].
Before sailing on, feel free to heave to and write your own.
Here’s my solution:
from typing import Set

def fizzy(n: int) -> Set[str]:
    if n % 3 == 0:
        return {"Fizz"}
    return set()

def buzzy(n: int) -> Set[str]:
    if n % 5 == 0:
        return {"Buzz"}
    return set()

def fizz_buzz(n: int) -> Set[str]:
    return fizzy(n) | buzzy(n)

if __name__ == "__main__":
    fb_map = {n: fizz_buzz(n) for n in range(10)}
    for n in fb_map:
        print(fb_map[n])
Each function is consistent. They all accept an integer parameter and create a result that is a proper Set[str]. The final mapping uses a dictionary comprehension to create a mapping from the integer to the results of the fizz_buzz(n) function.
When we run mypy on this file, we’ll find that mypy has a question about the fb_map assignment. While we, as authors of the code, are pretty sure the mapping can be described as Dict[int, Set[str]], mypy is not delighted with leaping to this conclusion.
We need another bit of type annotation machinery.
fb_map: Dict[int, Set[str]] = {
    n: fizz_buzz(n) for n in range(10)}
We’ve put a : Dict[int, Set[str]] into the assignment statement between the variable and the =.
This clarifies the intent of the dictionary comprehension. It gives mypy enough leverage to make a conclusion on whether or not all the functions are consistent.
Isn’t this kind of complicated?
The fb_map: Dict[int, Set[str]] assignment statement is kind of complex, with a type annotation buried in an already complex statement. Can we simplify this?
Spoiler alert: Yes.
One thing we can do is use type construction to break the complex definition down into simpler parts.
FBMap = Dict[int, Set[str]]
fb_map: FBMap = {n: fizz_buzz(n) for n in range(10)}
This shows how we can build a new type annotation and give it a name, FBMap. This name lets us simplify the assignment statement by using only the FBMap type name rather than the long type expression.
While this is simpler, there’s some repetition here that’s undesirable. We repeat Set[str] a lot of times. Do we need to?
Spoiler alert: No.
Consider this decomposition of the mapping type.
FzBzState = Set[str]
FBMap = Dict[int, FzBzState]
We’ve assigned the complex Set[str] to a single name, FzBzState. This isn’t a huge simplification in this case. But Python lets us build very complex structures that we might want to simplify. Think of a list of tuples of strings and tuples of integers, or something equally bewildering. Because Python permits a lot of complexity, it can help to decompose these complex, bewildering things into some separate definitions.
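As an illustration of the “list of tuples of strings and tuples of integers” case, here is a sketch of how such a type might be decomposed. All of the alias names (Label, Position, Waypoint, Route) and the labels() function are invented for this example.

```python
from typing import List, Tuple

# Without aliases, the full type is hard to read at a glance:
#     List[Tuple[str, Tuple[int, int]]]

Label = str
Position = Tuple[int, int]
Waypoint = Tuple[Label, Position]
Route = List[Waypoint]


def labels(route: Route) -> List[Label]:
    """Return just the label from each waypoint."""
    return [label for label, _position in route]


route: Route = [("start", (0, 0)), ("buoy", (3, 4))]
names = labels(route)
```

Each alias is an ordinary assignment, so the named pieces can be reused in other signatures without repeating the nested expression.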
This leads to a further rethinking. Rather than provide all the code, I’ll summarize:
FzBzState = Set[str]
def fizzy(n: int) -> FzBzState: ...
def buzzy(n: int) -> FzBzState: ...
def fizz_buzz(n: int) -> FzBzState: ...
FBMap = Dict[int, FzBzState]
The intent here is to make it clear that all of the functions create a FzBzState object, and the final mapping object will map an integer to a FzBzState object. We can see that, for this specific implementation, FzBzState is a Set[str]. Having this consistent name makes it possible to consider changing the underlying type, to improve performance or provide a more expressive definition of the objects.
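For instance, here is a sketch of swapping the underlying type to an immutable FrozenSet[str]. This is only one possible change, chosen for illustration; the function names and signatures stay the same, and only the alias and the function bodies move.

```python
from typing import Dict, FrozenSet

# Only this alias changes; every signature below keeps using the name.
FzBzState = FrozenSet[str]
FBMap = Dict[int, FzBzState]


def fizzy(n: int) -> FzBzState:
    return frozenset({"Fizz"}) if n % 3 == 0 else frozenset()


def buzzy(n: int) -> FzBzState:
    return frozenset({"Buzz"}) if n % 5 == 0 else frozenset()


def fizz_buzz(n: int) -> FzBzState:
    # The | operator on two frozensets produces another frozenset.
    return fizzy(n) | buzzy(n)


fb_map: FBMap = {n: fizz_buzz(n) for n in range(1, 16)}
```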
The idea of breaking complex types down into simpler types is very appealing. When we’re facing a complex problem in boat maintenance, it helps to decompose the complex problem into simpler problems we can solve in isolation.
For example, here’s a picture of a too complex bit of plumbing. It’s not clear, but five separate hoses converge through a complex nest of fittings. This needs to be simplified, because if there’s a failure somewhere, this could be very difficult to deal with.
Forward References and Circularity: The Really Over-Engineered Solution
Let’s really over-engineer the fizz-buzz problem by introducing class definitions. And not just any old class definitions. Let’s introduce mutually interdependent class definitions.
We’ll break the fizziness or buzziness of a collection of numbers into two parts:
- A class to define the properties of a given number.
- A collection of those individual number property definitions.
Spoiler Alert: There are problems in the following code.
The FBStatus class definition describes a single number and starts like this:
class FBStatus:
    def __init__(self, n: int, parent: FBMap) -> None:
        self.n = n
        self.parent = parent
        self.fb = set()
If you haven’t worked much with type annotations, note that the __init__() method must return None. We provide the overall map as part of each individual number’s status. I don’t have a brilliant reason why this relationship is essential, but this circularity is a common pattern in complex data structures where navigation of the graph can work “up” as well as “down” the structure.
More code is required to properly load the values into the self.fb set. We’ll come back to this later, once we get the essential class definitions squared away.
The FBMap class definition describes a collection of numbers and looks like this:
class FBMap:
    def __init__(self, limit: int) -> None:
        self.domain = {
            n: FBStatus(n, self)
            for n in range(limit)
        }
The initialization of the mapping creates a dictionary to map integers to FBStatus instances.
This has a nuanced problem. And a few other less subtle problems.
Inside a method’s body we can refer to any object that will be part of the local or global namespace. Method body evaluation happens after all the functions and classes have been defined. This means Python function and class definitions can, generally, be in any order. We often choose an order to help explain the code.
Outside a method’s body (i.e., in the definition line), we can only refer to names previously defined in the module. This constrains the order for definitions so that a function or method definition can only refer to previously defined classes or functions.
Mypy, however, gives us a way to break this definition order rule. We can provide a string instead of a class name. Mypy will resolve the strings, and this will let us include forward references. Here’s a small change that lets us define FBStatus first with a forward reference to FBMap.
class FBStatus:
    def __init__(self, n: int, parent: "FBMap") -> None:
        self.n = n
        self.parent = parent
        self.fb = set()
The change is minor. We replaced FBMap with the string "FBMap". This isn’t all, though. Once we have this solved we can move on to the two other problems here.
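A small experiment shows the string placeholder being resolved once both classes exist. The sketch below uses typing.get_type_hints() from the standard library; the stripped-down class bodies are simplifications made for this illustration.

```python
from typing import Dict, get_type_hints


class FBStatus:
    # "FBMap" is a string here because the FBMap class does not exist yet.
    def __init__(self, n: int, parent: "FBMap") -> None:
        self.n = n
        self.parent = parent


class FBMap:
    def __init__(self, limit: int) -> None:
        self.domain: Dict[int, FBStatus] = {
            n: FBStatus(n, self) for n in range(limit)
        }


# The stored annotation is still the string; get_type_hints() resolves it
# against the module's globals now that FBMap has been defined.
raw = FBStatus.__init__.__annotations__["parent"]
resolved = get_type_hints(FBStatus.__init__)["parent"]
```

After this runs, raw is the plain string and resolved is the FBMap class itself.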
First Problem: self.parent
The first problem in the __init__() method is that self.parent really needs to be a weakref. That’s outside the scope of the type hint topic, but it’s helpful to use weakref.ref(parent).
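A weak reference changes how the parent is reached: the stored reference object must be called to get the real parent back, and the call returns None once the parent has been collected. The sketch below is a simplified illustration; the Parent and Child class names are hypothetical stand-ins for FBMap and FBStatus.

```python
import weakref
from typing import Optional


class Parent:
    """Hypothetical stand-in for FBMap."""


class Child:
    """Hypothetical stand-in for FBStatus."""

    def __init__(self, parent: Parent) -> None:
        # Store a weak reference so the child does not keep the parent alive.
        self._parent_ref = weakref.ref(parent)

    @property
    def parent(self) -> Optional[Parent]:
        # Calling the reference returns the parent, or None if it is gone.
        return self._parent_ref()


p = Parent()
c = Child(p)
same_object = c.parent is p
```

The Optional[Parent] return type makes the "parent may be gone" case visible to mypy, so callers are pushed to handle None.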
Second Problem: setting the self.fb set elements
The second problem in the __init__() method is we never set the value of self.fb to anything useful. We want to create a set of fizz or buzz properties.
Let’s solve both problems and finish the initialization of self.fb:
import weakref

class FBStatus:
    def __init__(self, n: int, parent: "FBMap") -> None:
        self.n = n
        self.parent = weakref.ref(parent)
        self.fb = set()
        if n % 3 == 0: self.fb |= {"Fizz"}
        if n % 5 == 0: self.fb |= {"Buzz"}
This shows how we’d like to build the self.fb set as a union of several possible values. We can add “Fizz” to the set, or add “Buzz” to the set, or both, or leave it empty.
Third Not-Really-a-Problem: display the state
While our class is pretty simple, it’s common to have properties or methods to expose the current state of an object. Let’s add one more feature: a property to extract a useful summary from each instance of this class. Here’s the full definition:
import weakref
from typing import Set, Tuple

class FBStatus:
    def __init__(self, n: int, parent: "FBMap") -> None:
        self.n = n
        self.parent = weakref.ref(parent)
        self.fb = set()
        if n % 3 == 0: self.fb |= {"Fizz"}
        if n % 5 == 0: self.fb |= {"Buzz"}

    @property
    def fizz_buzz(self) -> Tuple[int, Set[str]]:
        return self.n, self.fb
This property will give mypy fits. Why? We have a conflict:
- The fizz_buzz property definition claims self.fb is Set[str].
- The __init__() method claims self.fb is Set[Any].
As we noted above, we’ve surfaced a conflict between the code and the hints. Often the code is wrong, but sometimes the hints are wrong. In this case, a little extra annotation will save the day.
One final look at the class definition for FBStatus.
class FBStatus:
    def __init__(self, n: int, parent: "FBMap") -> None:
        self.n = n
        self.parent = weakref.ref(parent)
        self.fb: Set[str] = set()
        if n % 3 == 0: self.fb |= {"Fizz"}
        if n % 5 == 0: self.fb |= {"Buzz"}

    @property
    def fizz_buzz(self) -> Tuple[int, Set[str]]:
        return self.n, self.fb
This definition of the FBStatus class provides an important clue to mypy: the set will only contain strings. This additional definition resolves the conflict mypy saw between how self.fb was created initially, and how it was used in the fizz_buzz property.
The "FBMap" string as a forward reference annotation to the FBMap type reminds me of the way “messenger lines” are used in boats. Threading a new halyard inside a 50-foot mast is tricky. There are other lines and electrical wires inside the mast. It’s a long aluminum tube, so we can’t see what we’re doing. However, sailors have a solution for this. We start by tying a light piece of line to the end of the old halyard. When we pull the halyard down, the light piece of line follows it around the various blocks through the dark recesses of the mast. We leave this messenger in place to mark the path. When it’s time to replace the halyard with a new, less chafed line, we bend the new halyard to the messenger, and use it to pull the heavy halyard through the dark recesses of the mast.
Using a string for a type hint is like the messenger line. The real type will be defined eventually. For now, there’s a lightweight placeholder.
How Did You Know That?
Sometimes, the errors from mypy can be confusing. For me, the most common cause for confusion is that the mypy errors conflict with one of my closely-held assumptions about the code I’m working on. I thought I knew what I meant. Why can’t mypy see it, too?
The primary tool for clarifying how wrong assumptions can be is the reveal_type() “function”. This has the syntax of a function, but it isn’t a real function. It’s used by mypy to display details.
We might use it like this:
class FBStatus:
    def __init__(self, n: int, parent: "FBMap") -> None:
        self.n = n
        self.parent = weakref.ref(parent)
        self.fb = set()
        reveal_type(self.fb)
        if n % 3 == 0: self.fb |= {"Fizz"}
        if n % 5 == 0: self.fb |= {"Buzz"}

    @property
    def fizz_buzz(self) -> Tuple[int, Set[str]]:
        return self.n, self.fb
I’ve included a reveal_type(self.fb) in this example to show how it would look. This is something you add before running mypy. It has to be removed afterward, because reveal_type() is not a built-in name at run time, so you can’t even run unit tests with this in place. It can help us see what mypy is seeing about our code.
It’s like spotting a buoy on a tricky river entrance. GPS is fun, and it’s nice to think you know where you are. Nothing beats seeing a big old green can buoy floating in the water, more-or-less where you hoped it would be. A physical landmark provides a lot of confidence that we’re sailing the safe, deep water.
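One way to leave a reveal_type() call in a file without breaking it at run time is to guard it with typing.TYPE_CHECKING, which is False when the program runs but is treated as true by type checkers. This is a sketch of that approach, not something the example above does; the module-level variable fb is illustrative.

```python
from typing import TYPE_CHECKING, Set

fb: Set[str] = set()

if TYPE_CHECKING:
    # Only type checkers such as mypy analyze this branch; at run time
    # TYPE_CHECKING is False, so the undefined name is never looked up.
    reveal_type(fb)

fb |= {"Fizz"}
```

Even with the guard, it is still tidier to delete the call once the question has been answered.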
Named Tuples: Another Really Over-Engineered Solution
I want to look at another possible over-engineered solution to the fizz-buzz problem. We’ll start by using typing.NamedTuple instances to track fizziness and buzziness of a number. These can have type hints built-in and can be very useful for creating applications with reasonably complete annotations.
First, the new typing.NamedTuple, which is much cooler than the older collections.namedtuple. Here’s an example:
from typing import NamedTuple, Optional, Set

class FB(NamedTuple):
    n: int
    fizz: Optional[str]
    buzz: Optional[str]
This is a Python 3-tuple, with attribute names, n, fizz, and buzz, for the items in the tuple and — bonus! — type annotations for each of the items. This is a little nicer than an unnamed tuple of Tuple[int, Optional[str], Optional[str]] because each item has a proper attribute name. The big bonus is providing hints so mypy can examine the code carefully.
The Optional[str] is a handy way to describe a union of two types. This is equivalent to Union[str, None], and reflects a very common Python programming practice. This articulates the way None is commonly used as a placeholder when there’s no useful value.
We can use code like fb6 = FB(6, "Fizz", None) to define the FizzBuzz properties of a given number. We can use fb6[0] or fb6.n to get the value, and fb6[1] or fb6.fizz to get the fizziness of the number. I’m a fan of named attributes instead of positional attributes.
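Here is a short sketch of both access styles side by side, along with a demonstration that the tuple is immutable. The variable names by_position, by_name, and mutated are made up for this illustration.

```python
from typing import NamedTuple, Optional


class FB(NamedTuple):
    n: int
    fizz: Optional[str]
    buzz: Optional[str]


fb6 = FB(6, "Fizz", None)

# Positional and named access return the same items.
by_position = (fb6[0], fb6[1], fb6[2])
by_name = (fb6.n, fb6.fizz, fb6.buzz)

# Named tuples are immutable, so assigning to an attribute fails.
try:
    fb6.fizz = "Buzz"  # type: ignore
    mutated = True
except AttributeError:
    mutated = False
```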
How do we create these objects? We’ll need a factory of some kind. Here’s a suitable function:
def fizz_buzz(n: int) -> FB:
    return FB(
        n,
        "Fizz" if n % 3 == 0 else None,
        "Buzz" if n % 5 == 0 else None
    )
This will build FB tuples with a number and the appropriate properties. I like these kinds of solutions where we can use functions applied to immutable objects.
This isn’t completely compatible with previous definitions, however. It doesn’t use a Set[str] anywhere. We can add that as a property:
class FB(NamedTuple):
    n: int
    fizz: Optional[str]
    buzz: Optional[str]

    @property
    def as_set(self) -> Set[str]:
        return {self.fizz, self.buzz} - {None}
I’ve added an as_set property that will transform the fizz and buzz values into a set of strings. Note that the values of self.fizz and self.buzz are Optional[str]. The union of str and None means they may have a None value. We don’t really want to see a None object in the result set, so we explicitly remove it with set subtraction.
This causes problems with mypy, however. The contents of the set appear to be Set[Optional[str]], which doesn’t match the return type of Set[str]. The problem here is that mypy can’t figure out our algorithm for removing None objects. A person can be convinced that there will be no None objects in the resulting set, but mypy isn’t as clever.
There are some places where mypy can (and does) detect the shift from Optional[str] to str. These places almost always involve an explicit if statement that’s easy to detect and reason about.
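For comparison, here is a sketch of an as_set property written with explicit if statements, the form mypy can reason about on its own. It is an alternative to the set-subtraction version, not the approach this article develops; the class name FB3 is made up for this illustration.

```python
from typing import NamedTuple, Optional, Set


class FB3(NamedTuple):
    n: int
    fizz: Optional[str]
    buzz: Optional[str]

    @property
    def as_set(self) -> Set[str]:
        result: Set[str] = set()
        # Inside each if block, mypy narrows Optional[str] to str.
        if self.fizz is not None:
            result.add(self.fizz)
        if self.buzz is not None:
            result.add(self.buzz)
        return result


fb15 = FB3(15, "Fizz", "Buzz")
fb7 = FB3(7, None, None)
```

The trade-off is more lines of code in exchange for a result type mypy can confirm without being told.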
Lacking an obvious if statement, we’re forced to label the result with a type that we’re sure our algorithm produces. For this, we’ll need the typing.cast() function:
from typing import NamedTuple, Optional, Set, cast

class FB2(NamedTuple):
    n: int
    fizz: Optional[str]
    buzz: Optional[str]

    @property
    def as_set(self) -> Set[str]:
        return cast(Set[str], {self.fizz, self.buzz} - {None})
The use of cast(Set[str], ...) tells mypy that the expression really does remove None values, building a Set[str] from what appeared to be Set[Optional[str]].
The cast() function is essentially an annotation with no run-time consequence. It can help to clarify expressions that aren’t perfectly obvious to mypy.
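A short sketch makes the "no run-time consequence" point concrete. The variable names here are illustrative only; the behavior shown is that typing.cast() hands back its second argument unchanged, without converting or checking it.

```python
from typing import List, cast

# cast() does not convert or validate: the string stays a string.
value = cast(int, "not an int")
print(type(value).__name__)   # str

# The object returned is the very same object that was passed in.
items = [1, 2, 3]
same = cast(List[str], items)
print(same is items)          # True
```

This is why a cast is only as trustworthy as the reasoning behind it: if the claim is wrong, nothing at run time will say so.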
For me, named tuples are like using color-coded sheets to control the sails. Rather than ask a guest to cast off the starboard staysail sheet, it’s a lot easier to ask them to pick up the thinner, green ropey thing. The fatter, green ropey thing is the starboard yankee sheet, and isn’t being used right now. And the two red ropey things are port side sheets; we’ll be using those after we tack.
The complexity of my boat means we’re not really sailing unless we have six different sheets all piled up in the cockpit. As with the attributes of a named tuple, it’s important to have clear names for each one.
Using Dataclasses: Also Really, Really Over-Engineered
The final example of over-engineering is creating instances of dataclasses. There’s a lot of flexibility here, and I’ll stick to one example that’s not too complicated.
from dataclasses import dataclass, field
from typing import Set

@dataclass
class FB:
    n: int
    fizz_buzz: Set[str] = field(init=False)

    def __post_init__(self) -> None:
        self.fizz_buzz = set()
        self.fizz_buzz |= {"Fizz"} if self.n % 3 == 0 else set()
        self.fizz_buzz |= {"Buzz"} if self.n % 5 == 0 else set()
This FB class lets us create objects using code like [FB(i) for i in range(15)]. We’ve set this up to show some of the many kinds of initialization alternatives available to dataclasses:
- The default case is shown by the self.n attribute. The core __init__() processing is built for us; this will set the value of the self.n attribute when we use FB(i).
- A common alternative case is to provide a default value. The self.fizz_buzz attribute uses field(init=False), a very special kind of default: it keeps the attribute out of the __init__() parameters and leaves it unset. A kind of no-default default.
- We’ve provided a __post_init__() method to set the value of the self.fizz_buzz attribute after __init__() has set the self.n attribute value.
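The three initialization cases listed above can be checked with a short sketch. The class is repeated here under the hypothetical name FBDemo so the example stands on its own; dataclasses.fields() is used to inspect which attributes the generated __init__() accepts.

```python
from dataclasses import dataclass, field, fields
from typing import Set


@dataclass
class FBDemo:
    """Copy of the FB dataclass under a hypothetical name, for illustration."""
    n: int
    fizz_buzz: Set[str] = field(init=False)

    def __post_init__(self) -> None:
        self.fizz_buzz = set()
        self.fizz_buzz |= {"Fizz"} if self.n % 3 == 0 else set()
        self.fizz_buzz |= {"Buzz"} if self.n % 5 == 0 else set()


# Only n is a parameter of the generated __init__().
print([f.name for f in fields(FBDemo) if f.init])   # ['n']

# __post_init__() fills in fizz_buzz after n has been set.
print(FBDemo(15).fizz_buzz == {"Fizz", "Buzz"})     # True

# Supplying fizz_buzz positionally is rejected, because init=False.
try:
    FBDemo(3, {"Fizz"})  # type: ignore[call-arg]
except TypeError as error:
    print("rejected:", type(error).__name__)        # rejected: TypeError
```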
The __post_init__() method serves to provide a tidy encapsulation of the Fizz Buzz rules inside the class definition. The dataclass lets us provide type hints for the attributes of the class. And it lets us initialize those attributes in a variety of ways.
The objects created by the FB class are mutable, unlike the version created with typing.NamedTuple, so a change to the n attribute can lead to an invalid object. Something pathological like:
>>> fzbz = FB(6)
>>> fzbz.n = 7
Is morally corrupt, but valid Python. If we want to be able to set the n attribute, and have the value of the fizz_buzz attribute change also, we will need to create settable properties. I’ll leave this as an exercise for the reader.
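For readers who want a starting point, here is one possible sketch of a settable property. It is an assumption about how the exercise might be approached, not the author’s solution: the class name FBSettable is hypothetical, and it uses a plain class rather than a dataclass to keep the property mechanics visible.

```python
from typing import Set


class FBSettable:
    """Hypothetical sketch: keeps fizz_buzz in step with n."""

    def __init__(self, n: int) -> None:
        self.n = n  # routed through the property setter below

    @property
    def n(self) -> int:
        return self._n

    @n.setter
    def n(self, value: int) -> None:
        self._n = value
        # Recompute the derived attribute every time n changes.
        self.fizz_buzz: Set[str] = set()
        if value % 3 == 0:
            self.fizz_buzz.add("Fizz")
        if value % 5 == 0:
            self.fizz_buzz.add("Buzz")


fzbz = FBSettable(6)
fzbz.n = 10
print(fzbz.fizz_buzz)  # {'Buzz'}
```

With this shape, assigning to n can no longer leave a stale fizz_buzz value behind, although fizz_buzz itself can still be overwritten directly.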
The flexible initialization alternatives make dataclasses very handy for creating stateful objects. We can provide type hints to reflect the domains of the attribute values. For me, the flexibility of data classes is like having two masts and five sails on a boat: no matter what conditions, there’s a combination of sails that will provide a safe, controlled ride. It takes some work to get the sails configured, but the number of choices available means the results will often be delightful.
Conclusion
Put more buzz in your fizz with type hints.
There are a lot of ways software development can go wrong. We can misunderstand the users and their problem. We can misunderstand the data or the appropriate algorithm. We may have code that doesn’t quite match our intent, or perhaps, our intent is a little vague.
I’m a fan of thinking through the type hints and using this to inform the code and the unit tests. I think it’s helpful to clearly articulate what the structures need to be before trying to write the code that builds or consumes those data structures.
I’ve had some conversations with people who wonder if type hints are effectively redundant: since unit tests confirm the software works, the argument goes, type hints aren’t telling us anything new. I reject the idea of redundancy as a problem.
I submit that type hints do tell us something new. Python’s duck typing flexibility means a number of mistakes can pass a suite of unit tests that fail to test all the obscure edge cases. An expression like a+b works for float, integer, string, list, tuple, and even bytes. While it’s common to test the expected types, we rarely test the unexpected types. The static analysis of mypy can help narrow the domain of types under consideration and provide assurance the tests cover all possible cases.
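A tiny sketch of the a+b point: the function name total is hypothetical. The hints say int, but at run time Python happily applies + to whatever arrives, so only a static check would flag the unexpected calls.

```python
def total(a: int, b: int) -> int:
    """Hypothetical example: hinted for integers only."""
    return a + b


print(total(1, 2))              # 3

# These calls run without error, yet they are outside the hinted domain.
# A type checker such as mypy is expected to report the argument types.
print(repr(total("1", "2")))    # '12'
print(total([1], [2]))          # [1, 2]
```

A unit test that only exercises total(1, 2) passes, and says nothing about the string or list cases.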
I can’t say anything bad about having multiple tools for the same job. My boat has four separate pumps to drain water from the bilge — two electric and two manual. When the consequences of a problem are dire, it seems sensible to have multiple tools, each with a distinct focus. If I’m going to write high-quality software, I want to use as many tools as possible to be sure things work the way I expect.
Originally published at https://www.capitalone.com.
DISCLOSURE STATEMENT: © 2020 Capital One. Opinions are those of the individual author. Unless noted otherwise in this post, Capital One is not affiliated with, nor endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are property of their respective owners.
Translated from: https://medium.com/capital-one-tech/putting-more-buzz-in-your-python-fizz-f93f5ca7584c