The Ultimate Guide to Data Classes in Python 3.7

One new and exciting feature coming in Python 3.7 is the data class. A data class is a class typically containing mainly data, although there aren’t really any restrictions. It is created using the new @dataclass decorator, as follows:

    from dataclasses import dataclass

    @dataclass
    class DataClassCard:
        rank: str
        suit: str

Note: This code, as well as all other examples in this tutorial, will only work in Python 3.7 and above.

A data class comes with basic functionality already implemented. For instance, you can instantiate, print, and compare data class instances straight out of the box:
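
As a minimal sketch of what that looks like (written as a script rather than an interactive session; the variable name queen_of_hearts is only illustrative):

```python
from dataclasses import dataclass

@dataclass
class DataClassCard:
    rank: str
    suit: str

# The generated __init__ accepts the fields in declaration order
queen_of_hearts = DataClassCard('Q', 'Hearts')

print(queen_of_hearts.rank)   # plain attribute access
print(queen_of_hearts)        # uses the generated __repr__
print(queen_of_hearts == DataClassCard('Q', 'Hearts'))  # generated __eq__ compares field values
```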

Compare that to a regular class. A minimal regular class would look something like this:

    class RegularCard:
        def __init__(self, rank, suit):
            self.rank = rank
            self.suit = suit

While this is not much more code to write, you can already see signs of the boilerplate pain: rank and suit are both repeated three times simply to initialize an object. Furthermore, if you try to use this plain class, you’ll notice that the representation of the objects is not very descriptive, and for some reason a queen of hearts is not the same as a queen of hearts:
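
A short sketch of that behavior, assuming the minimal RegularCard from above (the exact memory address in the printed representation will differ from run to run):

```python
class RegularCard:
    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

queen_of_hearts = RegularCard('Q', 'Hearts')

# Default repr: only the class name and a memory address
print(queen_of_hearts)

# Default equality compares object identity, not field values
print(queen_of_hearts == RegularCard('Q', 'Hearts'))
```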

Seems like data classes are helping us out behind the scenes. By default, data classes implement a .__repr__() method to provide a nice string representation and an .__eq__() method that can do basic object comparisons. For the RegularCard class to imitate the data class above, you need to add these methods as well:

    class RegularCard:
        def __init__(self, rank, suit):
            self.rank = rank
            self.suit = suit

        def __repr__(self):
            return (f'{self.__class__.__name__}'
                    f'(rank={self.rank!r}, suit={self.suit!r})')

        def __eq__(self, other):
            if other.__class__ is not self.__class__:
                return NotImplemented
            return (self.rank, self.suit) == (other.rank, other.suit)

In this tutorial, you will learn exactly which conveniences data classes provide. In addition to nice representations and comparisons, you’ll see:

  • How to add default values to data class fields
  • How data classes allow for ordering of objects
  • How to represent immutable data
  • How data classes handle inheritance

We will soon dive deeper into those features of data classes. However, you might be thinking that you have already seen something like this before.

Alternatives to Data Classes

For simple data structures, you have probably already used a tuple or a dict. You could represent the queen of hearts card in either of the following ways:
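
A sketch of the two representations; the variable names and the 'suit' key appear later in this section, while the 'rank' key is an assumption:

```python
# Positional: the meaning of each element lives only in your head
queen_of_hearts_tuple = ('Q', 'Hearts')

# Keyed: the names are explicit, but nothing checks them for consistency
queen_of_hearts_dict = {'rank': 'Q', 'suit': 'Hearts'}
```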

It works. However, it puts a lot of responsibility on you as a programmer:

  • You need to remember that the queen_of_hearts_... variable represents a card.
  • For the _tuple version, you need to remember the order of the attributes. Writing ('Spades', 'A') will mess up your program but probably not give you an easily understandable error message.
  • If you use the _dict, you must make sure the names of the attributes are consistent. For instance {'value': 'A', 'suit': 'Spades'} will not work as expected.

Furthermore, using these structures is not ideal:

    >>> queen_of_hearts_tuple[0]  # No named access
    'Q'
    >>> queen_of_hearts_dict['suit']  # Would be nicer with .suit
    'Hearts'

A better alternative is the namedtuple. It has long been used to create readable small data structures. We can in fact recreate the data class example above using a namedtuple like this:
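
One way to write this, assuming the same rank and suit field names as DataClassCard:

```python
from collections import namedtuple

# Type name first, then the field names in positional order
NamedTupleCard = namedtuple('NamedTupleCard', ['rank', 'suit'])
```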

This definition of NamedTupleCard will give the exact same output as our DataClassCard example did:

    >>> queen_of_hearts = NamedTupleCard('Q', 'Hearts')
    >>> queen_of_hearts.rank
    'Q'
    >>> queen_of_hearts
    NamedTupleCard(rank='Q', suit='Hearts')
    >>> queen_of_hearts == NamedTupleCard('Q', 'Hearts')
    True

So why even bother with data classes? First of all, data classes come with many more features than you have seen so far. At the same time, the namedtuple has some other features that are not necessarily desirable. By design, a namedtuple is a regular tuple. This can be seen in comparisons, for instance:
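
A small sketch of such a comparison, reusing a NamedTupleCard defined with rank and suit fields:

```python
from collections import namedtuple

NamedTupleCard = namedtuple('NamedTupleCard', ['rank', 'suit'])
queen_of_hearts = NamedTupleCard('Q', 'Hearts')

# A namedtuple is a tuple subclass, so it compares equal to a plain tuple
print(isinstance(queen_of_hearts, tuple))
print(queen_of_hearts == ('Q', 'Hearts'))
```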

While this might seem like a good thing, this lack of awareness about its own type can lead to subtle and hard-to-find bugs, especially since it will also happily compare two different namedtuple classes:

    >>> Person = namedtuple('Person', ['first_initial', 'last_name'])
    >>> ace_of_spades = NamedTupleCard('A', 'Spades')
    >>> ace_of_spades == Person('A', 'Spades')
    True

The namedtuple also comes with some restrictions. For instance, it is hard to add default values to some of the fields in a namedtuple. A namedtuple is also by nature immutable. That is, the value of a namedtuple can never change. In some applications, this is an awesome feature, but in other settings, it would be nice to have more flexibility:
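
The immutability can be sketched as follows (the card values are illustrative; the assignment is wrapped in try/except only so the script keeps running):

```python
from collections import namedtuple

NamedTupleCard = namedtuple('NamedTupleCard', ['rank', 'suit'])
card = NamedTupleCard('7', 'Diamonds')

assignment_failed = False
try:
    card.rank = '9'  # fields of a namedtuple are read-only
except AttributeError as err:
    assignment_failed = True
    print(f'AttributeError: {err}')

# The usual workaround is to build a modified copy
nine = card._replace(rank='9')
print(card, nine)
```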

Data classes will not replace all uses of namedtuple. For instance, if you need your data structure to behave like a tuple, then a named tuple is a great alternative!

Another alternative, and one of the inspirations for data classes, is the attrs project. With attrs installed (pip install attrs), you can write a card class as follows:

    import attr

    @attr.s
    class AttrsCard:
        rank = attr.ib()
        suit = attr.ib()

This can be used in exactly the same way as the DataClassCard and NamedTupleCard examples earlier. The attrs project is great and does support some features that data classes do not, including converters and validators. Furthermore, attrs has been around for a while and is supported in Python 2.7 as well as Python 3.4 and up. However, as attrs is not a part of the standard library, it does add an external dependency to your projects. Through data classes, similar functionality will be available everywhere.

In addition to tuple, dict, namedtuple, and attrs, there are many other similar projects, including typing.NamedTuple, namedlist, attrdict, plumber, and fields. While data classes are a great new alternative, there are still use cases where one of the older variants fits better, for instance if you need compatibility with a specific API expecting tuples or need functionality not supported in data classes.

Basic Data Classes

Let us get back to data classes. As an example, we will create a Position class that will represent geographic positions with a name as well as the latitude and longitude:
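
A sketch of such a class, with the field order (name, lon, lat) taken from the output shown further down:

```python
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float
    lat: float
```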

What makes this a data class is the @dataclass decorator just above the class definition. Beneath the class Position: line, you simply list the fields you want in your data class. The : notation used for the fields is using a new feature in Python 3.6 called variable annotations. We will soon talk more about this notation and why we specify data types like str and float.

Those few lines of code are all you need. The new class is ready for use:

    >>> pos = Position('Oslo', 10.8, 59.9)
    >>> print(pos)
    Position(name='Oslo', lon=10.8, lat=59.9)
    >>> pos.lat
    59.9
    >>> print(f'{pos.name} is at {pos.lat}°N, {pos.lon}°E')
    Oslo is at 59.9°N, 10.8°E

You can also create data classes similarly to how named tuples are created. The following is (almost) equivalent to the definition of Position above:
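
A sketch using dataclasses.make_dataclass(). Fields given as bare names are annotated with typing.Any instead of str and float, which is one reason the result is only almost equivalent:

```python
from dataclasses import make_dataclass

# Class name first, then the field names; bare names default to typing.Any
Position = make_dataclass('Position', ['name', 'lon', 'lat'])

pos = Position('Oslo', 10.8, 59.9)
print(pos)
```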

A data class is a regular Python class. The only thing that sets it apart is that it has basic data model methods like .__init__(), .__repr__(), and .__eq__() implemented for you.

Default Values

It is easy to add default values to the fields of your data class:

    from dataclasses import dataclass

    @dataclass
    class Position:
        name: str
        lon: float = 0.0
        lat: float = 0.0

This works exactly as if you had specified the default values in the definition of the .__init__() method of a regular class:
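
A brief sketch of how the defaults behave (the place names and coordinates are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

# Omitted fields fall back to their defaults
print(Position('Null Island'))

# Defaults can be overridden by keyword or by position
print(Position('Greenwich', lat=51.8))
print(Position('Vancouver', -123.1, 49.3))
```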

Later you will learn about default_factory, which gives a way to provide more complicated default values.

Type Hints

So far, we have not made a big fuss of the fact that data classes support typing out of the box. You have probably noticed that we defined the fields with a type hint: name: str says that name should be a text string (str type).

In fact, adding some kind of type hint is mandatory when defining the fields in your data class. Without a type hint, the field will not be a part of the data class. However, if you do not want to add explicit types to your data class, use typing.Any:

    from dataclasses import dataclass
    from typing import Any

    @dataclass
    class WithoutExplicitTypes:
        name: Any
        value: Any = 42

While you need to add type hints in some form when using data classes, these types are not enforced at runtime. The following code runs without any problems:
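
For example, a sketch that deliberately passes values of the wrong types to the Position class with defaults (the values themselves are arbitrary):

```python
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

# None of these values match the annotations, yet no error is raised
odd = Position(3.14, 'pi day', 2018)
print(odd)
```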

This is how typing in Python usually works: Python is and will always be a dynamically typed language. To actually catch type errors, type checkers like Mypy can be run on your source code.

Adding Methods

You already know that a data class is just a regular class. That means that you can freely add your own methods to a data class. As an example, let us calculate the distance between one position and another, along the Earth’s surface. One way to do this is by using the haversine formula:

The haversine formula
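
Written out with the symbols used in the implementation in this section (r for the Earth's radius, φ for latitude and λ for longitude, both in radians), the distance d between two points can be expressed as:

```latex
d = 2 r \arcsin\left(
      \sqrt{\sin^2\left(\frac{\phi_2 - \phi_1}{2}\right)
            + \cos(\phi_1)\cos(\phi_2)
              \sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}
    \right)
```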

You can add a .distance_to() method to your data class just like you can with normal classes:

    from dataclasses import dataclass
    from math import asin, cos, radians, sin, sqrt

    @dataclass
    class Position:
        name: str
        lon: float = 0.0
        lat: float = 0.0

        def distance_to(self, other):
            r = 6371  # Earth radius in kilometers
            lam_1, lam_2 = radians(self.lon), radians(other.lon)
            phi_1, phi_2 = radians(self.lat), radians(other.lat)
            h = (sin((phi_2 - phi_1) / 2)**2
                 + cos(phi_1) * cos(phi_2) * sin((lam_2 - lam_1) / 2)**2)
            return 2 * r * asin(sqrt(h))

It works as you would expect:
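
A usage sketch, repeating the class so the script runs on its own. The Vancouver coordinates are an illustrative assumption, and the result is a great-circle distance in kilometers of roughly 7,200 km:

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

    def distance_to(self, other):
        r = 6371  # Earth radius in kilometers
        lam_1, lam_2 = radians(self.lon), radians(other.lon)
        phi_1, phi_2 = radians(self.lat), radians(other.lat)
        h = (sin((phi_2 - phi_1) / 2)**2
             + cos(phi_1) * cos(phi_2) * sin((lam_2 - lam_1) / 2)**2)
        return 2 * r * asin(sqrt(h))

oslo = Position('Oslo', 10.8, 59.9)
vancouver = Position('Vancouver', -123.1, 49.3)

distance = oslo.distance_to(vancouver)
print(f'{oslo.name} to {vancouver.name}: {distance:.0f} km')
```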

More Flexible Data Classes

So far, you have seen some of the basic features of the data class: it gives you some convenience methods, and you can still add default values and other methods. Now you will learn about some more advanced features like parameters to the @dataclass decorator and the field() function. Together, they give you more control when creating a data class.

Let us return to the playing card example you saw at the beginning of the tutorial and add a class containing a deck of cards while we are at it:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class PlayingCard:
        rank: str
        suit: str

    @dataclass
    class Deck:
        cards: List[PlayingCard]

A simple deck containing only two cards can be created like this:
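
A sketch, with the two classes repeated so the script runs on its own (the choice of cards is illustrative):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PlayingCard:
    rank: str
    suit: str

@dataclass
class Deck:
    cards: List[PlayingCard]

queen_of_hearts = PlayingCard('Q', 'Hearts')
ace_of_spades = PlayingCard('A', 'Spades')

# Deck has no default for cards, so the list must be passed explicitly
two_cards = Deck([queen_of_hearts, ace_of_spades])
print(two_cards)
```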

Advanced Default Values

Say that you want to give a default value to the Deck. It would for example be convenient if Deck() created a regular (French) deck of 52 playing cards. First, specify the different ranks and suits. Then, add a function make_french_deck() that creates a list of instances of PlayingCard:

    RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
    SUITS = '♣ ♢ ♡ ♠'.split()

    def make_french_deck():
        return [PlayingCard(r, s) for s in SUITS for r in RANKS]

For fun, the four different suits are specified using their Unicode symbols.

Note: Above, we used Unicode glyphs like ♠ directly in the source code. We could do this because Python supports writing source code in UTF-8 by default. Refer to this page on Unicode input for how to enter these on your system. You could also enter the Unicode symbols for the suits using \N named character escapes (like \N{BLACK SPADE SUIT}) or \u Unicode escapes (like \u2660).

To simplify comparisons of cards later, the ranks and suits are also listed in their usual order.

In theory, you could now use this function to specify a default value for Deck.cards:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Deck:  # Will NOT work
        cards: List[PlayingCard] = make_french_deck()

Don’t do this! This introduces one of the most common anti-patterns in Python: using mutable default arguments. The problem is that all instances of Deck will use the same list object as the default value of the .cards property. This means that if, say, one card is removed from one Deck, then it disappears from all other instances of Deck as well. Actually, data classes try to prevent you from doing this, and the code above will raise a ValueError.

Instead, data classes use something called a default_factory to handle mutable default values. To use default_factory (and many other cool features of data classes), you need to use the field() specifier:
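
A sketch of a Deck that uses default_factory, with the helper definitions repeated so the script runs on its own:

```python
from dataclasses import dataclass, field
from typing import List

RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
SUITS = '♣ ♢ ♡ ♠'.split()

@dataclass
class PlayingCard:
    rank: str
    suit: str

def make_french_deck():
    return [PlayingCard(r, s) for s in SUITS for r in RANKS]

@dataclass
class Deck:
    # The factory is called once per instance, so decks never share a list
    cards: List[PlayingCard] = field(default_factory=make_french_deck)

first, second = Deck(), Deck()
first.cards.pop()  # removing a card from one deck leaves the other intact
print(len(first.cards), len(second.cards))
```

Note that the function itself is passed, without parentheses, so it is not called until an instance is created.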

The argument to default_factory can be any zero-parameter callable. Now it is easy to create a full deck of playing cards:

    >>> Deck()
    Deck(cards=[PlayingCard(rank='2', suit='♣'), PlayingCard(rank='3', suit='♣'), ...
                PlayingCard(rank='K', suit='♠'), PlayingCard(rank='A', suit='♠')])

The field() specifier is used to customize each field of a data class individually. You will see some other examples later. For reference, these are the parameters field() supports:

  • default: Default value of the field
  • default_factory: Function that returns the initial value of field
  • init: Use field in .__init__() method? (Default is True.)
  • repr: Use field in repr of object? (Default is True.)
  • compare: Include field in comparisons? (Default is True.)
  • hash: Include field when calculating hash()? (Default is to use the same as for compare.)
  • metadata: A mapping with information about the field

In the Position example, you saw how to add simple default values by writing lat: float = 0.0. However, if you also want to customize the field, for instance to hide it in the repr, you need to use the default parameter: lat: float = field(default=0.0, repr=False). You may not specify both default and default_factory.

The metadata parameter is not used by the data classes themselves but is available for you (or third party packages) to attach information to fields. In the Position example, you could for instance specify that latitude and longitude should be given in degrees:
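
A sketch of how that could look; the 'unit' key and the 'degrees' value match the output shown further down:

```python
from dataclasses import dataclass, field

@dataclass
class Position:
    name: str
    # metadata is an arbitrary mapping; data classes themselves ignore it
    lon: float = field(default=0.0, metadata={'unit': 'degrees'})
    lat: float = field(default=0.0, metadata={'unit': 'degrees'})
```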

The metadata (and other information about a field) can be retrieved using the fields() function (note the plural s):

    >>> from dataclasses import fields
    >>> fields(Position)
    (Field(name='name',type=<class 'str'>,...,metadata={}),
     Field(name='lon',type=<class 'float'>,...,metadata={'unit': 'degrees'}),
     Field(name='lat',type=<class 'float'>,...,metadata={'unit': 'degrees'}))
    >>> lat_unit = fields(Position)[2].metadata['unit']
    >>> lat_unit
    'degrees'

You Need Representation?

Recall that we can create decks of cards out of thin air:

While this representation of a Deck is explicit and readable, it is also very verbose. I have deleted 48 of the 52 cards in the deck in the output above. On an 80-column display, simply printing the full Deck takes up 22 lines! Let us add a more concise representation. In general, a Python object has two different string representations:

  • repr(obj) is defined by obj.__repr__() and should return a developer-friendly representation of obj. If possible, this should be code that can recreate obj. Data classes do this.

  • str(obj) is defined by obj.__str__() and should return a user-friendly representation of obj. Data classes do not implement a .__str__() method, so Python will fall back to the .__repr__() method.

Let us implement a user-friendly representation of a PlayingCard:

    from dataclasses import dataclass

    @dataclass
    class PlayingCard:
        rank: str
        suit: str

        def __str__(self):
            return f'{self.suit}{self.rank}'

The cards now look much nicer, but the deck is still as verbose as ever:
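
A sketch of the difference, using a one-card Deck to keep the output short:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PlayingCard:
    rank: str
    suit: str

    def __str__(self):
        return f'{self.suit}{self.rank}'

@dataclass
class Deck:
    cards: List[PlayingCard]

ace_of_spades = PlayingCard('A', '♠')

print(ace_of_spades)          # print() uses __str__
print(Deck([ace_of_spades]))  # the generated Deck repr still uses each card's repr
```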

To show that it is possible to add your own .__repr__() method as well, we will violate the principle that it should return code that can recreate an object. Practicality beats purity after all. The following code adds a more concise representation of the Deck:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Deck:
        cards: List[PlayingCard] = field(default_factory=make_french_deck)

        def __repr__(self):
            cards = ', '.join(f'{c!s}' for c in self.cards)
            return f'{self.__class__.__name__}({cards})'

Note the !s specifier in the {c!s} format string. It means that we explicitly want to use the str() representation of PlayingCards. With the new .__repr__(), the representation of Deck is easier on the eyes:
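
A self-contained sketch that combines the pieces from this section and prints the compact representation:

```python
from dataclasses import dataclass, field
from typing import List

RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
SUITS = '♣ ♢ ♡ ♠'.split()

@dataclass
class PlayingCard:
    rank: str
    suit: str

    def __str__(self):
        return f'{self.suit}{self.rank}'

def make_french_deck():
    return [PlayingCard(r, s) for s in SUITS for r in RANKS]

@dataclass
class Deck:
    cards: List[PlayingCard] = field(default_factory=make_french_deck)

    # A user-defined __repr__ takes precedence over the generated one
    def __repr__(self):
        cards = ', '.join(f'{c!s}' for c in self.cards)
        return f'{self.__class__.__name__}({cards})'

deck_repr = repr(Deck())
print(deck_repr)
```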

Comparing Cards

In many card games, cards are compared to each other. For instance, in a typical trick-taking game, the highest card takes the trick. As it is currently implemented, the PlayingCard class does not support this kind of comparison:

    >>> queen_of_hearts = PlayingCard('Q', '♡')
    >>> ace_of_spades = PlayingCard('A', '♠')
    >>> ace_of_spades > queen_of_hearts
    TypeError: '>' not supported between instances of 'PlayingCard' and 'PlayingCard'

This is, however, (seemingly) easy to rectify:
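
A sketch of the change, which passes order=True to the decorator and otherwise leaves the class as before:

```python
from dataclasses import dataclass

@dataclass(order=True)  # also generate __lt__, __le__, __gt__ and __ge__
class PlayingCard:
    rank: str
    suit: str

    def __str__(self):
        return f'{self.suit}{self.rank}'
```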

The @dataclass decorator has two forms. So far you have seen the simple form where @dataclass is specified without any parentheses and parameters. However, you can also give parameters to the @dataclass() decorator within parentheses. The following parameters are supported:

  • init: Add .__init__() method? (Default is True.)
  • repr: Add .__repr__() method? (Default is True.)
  • eq: Add .__eq__() method? (Default is True.)
  • order: Add ordering methods? (Default is False.)
  • unsafe_hash: Force the addition of a .__hash__() method? (Default is False.)
  • frozen: If True, assigning to fields raises an exception. (Default is False.)

See the original PEP for more information about each parameter. After setting order=True, instances of PlayingCard can be compared:

    >>> queen_of_hearts = PlayingCard('Q', '♡')
    >>> ace_of_spades = PlayingCard('A', '♠')
    >>> ace_of_spades > queen_of_hearts
    False

How are the two cards compared though? You have not specified how the ordering should be done, and for some reason Python seems to believe that a Queen is higher than an Ace…

It turns out that data classes compare objects as if they were tuples of their fields. In other words, a Queen is higher than an Ace because 'Q' comes after 'A' in the alphabet:
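
A sketch that makes the tuple comparison explicit. It uses dataclasses.astuple() to obtain the field tuples, which is a choice made for this illustration:

```python
from dataclasses import astuple, dataclass

@dataclass(order=True)
class PlayingCard:
    rank: str
    suit: str

ace_of_spades = PlayingCard('A', '♠')
queen_of_hearts = PlayingCard('Q', '♡')

# Comparing the cards gives the same result as comparing their field tuples
print(astuple(ace_of_spades), astuple(queen_of_hearts))
print(astuple(ace_of_spades) > astuple(queen_of_hearts))
print(ace_of_spades > queen_of_hearts)
```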

That does not really work for us. Instead, we need to define some kind of sort index that uses the order of RANKS and SUITS. Something like this:

    >>> RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
    >>> SUITS = '♣ ♢ ♡ ♠'.split()
    >>> card = PlayingCard('Q', '♡')
    >>> RANKS.index(card.rank) * len(SUITS) + SUITS.index(card.suit)
    42

For PlayingCard to use this sort index for comparisons, we need to add a field .sort_index to the class. However, this field should be calculated from the other fields .rank and .suit automatically. This is exactly what the special method .__post_init__() is for. It allows for special processing after the regular .__init__() method is called:
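
A sketch of a PlayingCard that computes .sort_index in .__post_init__(); the field() arguments keep it out of the constructor and the repr:

```python
from dataclasses import dataclass, field

RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
SUITS = '♣ ♢ ♡ ♠'.split()

@dataclass(order=True)
class PlayingCard:
    # Declared first so that ordering is decided by sort_index
    sort_index: int = field(init=False, repr=False)
    rank: str
    suit: str

    def __post_init__(self):
        # Runs right after the generated __init__ has set rank and suit
        self.sort_index = (RANKS.index(self.rank) * len(SUITS)
                           + SUITS.index(self.suit))

    def __str__(self):
        return f'{self.suit}{self.rank}'
```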

Note that .sort_index is added as the first field of the class. That way, the comparison is first done using .sort_index and only if there are ties are the other fields used. Using field(), you must also specify that .sort_index should not be included as a parameter in the .__init__() method (because it is calculated from the .rank and .suit fields). To avoid confusing the user about this implementation detail, it is probably also a good idea to remove .sort_index from the repr of the class.

Finally, aces are high:

    >>> queen_of_hearts = PlayingCard('Q', '♡')
    >>> ace_of_spades = PlayingCard('A', '♠')
    >>> ace_of_spades > queen_of_hearts
    True

You can now easily create a sorted deck:
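
A self-contained sketch; the definitions from earlier sections are repeated so the script runs on its own, and sorted() relies on the ordering methods generated by order=True:

```python
from dataclasses import dataclass, field
from typing import List

RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
SUITS = '♣ ♢ ♡ ♠'.split()

@dataclass(order=True)
class PlayingCard:
    sort_index: int = field(init=False, repr=False)
    rank: str
    suit: str

    def __post_init__(self):
        self.sort_index = (RANKS.index(self.rank) * len(SUITS)
                           + SUITS.index(self.suit))

    def __str__(self):
        return f'{self.suit}{self.rank}'

def make_french_deck():
    return [PlayingCard(r, s) for s in SUITS for r in RANKS]

@dataclass
class Deck:
    cards: List[PlayingCard] = field(default_factory=make_french_deck)

    def __repr__(self):
        cards = ', '.join(f'{c!s}' for c in self.cards)
        return f'{self.__class__.__name__}({cards})'

# Cards are ordered by rank first, then by suit within a rank
sorted_deck = Deck(sorted(make_french_deck()))
print(sorted_deck)
```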

Or, if you don’t care about sorting, this is how you draw a random hand of 10 cards:

    >>> from random import sample
    >>> Deck(sample(make_french_deck(), k=10))
    Deck(♢2, ♡A, ♢10, ♣2, ♢3, ♠3, ♢A, ♠8, ♠9, ♠2)

Of course, you don’t need order=True for that…

Immutable Data Classes

One of the defining features of the namedtuple you saw earlier is that it is immutable. That is, the value of its fields may never change. For many types of data classes, this is a great idea! To make a data class immutable, set frozen=True when you create it. For example, the following is an immutable version of the Position class you saw earlier:
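
A sketch of such a class, identical to the Position with defaults from earlier apart from the decorator argument:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # assigning to a field after creation raises an error
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0
```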

In a frozen data class, you cannot assign values to the fields after creation:

    >>> pos = Position('Oslo', 10.8, 59.9)
    >>> pos.name
    'Oslo'
    >>> pos.name = 'Stockholm'
    dataclasses.FrozenInstanceError: cannot assign to field 'name'

Be aware though that if your data class contains mutable fields, those might still change. This is true for all nested data structures in Python (see this video for further info):
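
A sketch of two frozen classes where one holds a mutable list; the class names ImmutableCard and ImmutableDeck are the ones used in the rest of this section:

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class ImmutableCard:
    rank: str
    suit: str

@dataclass(frozen=True)
class ImmutableDeck:
    # frozen only blocks rebinding the attribute, not mutating the list it holds
    cards: List[ImmutableCard]
```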

Even though both ImmutableCard and ImmutableDeck are immutable, the list holding cards is not. You can therefore still change the cards in the deck:

    >>> queen_of_hearts = ImmutableCard('Q', '♡')
    >>> ace_of_spades = ImmutableCard('A', '♠')
    >>> deck = ImmutableDeck([queen_of_hearts, ace_of_spades])
    >>> deck
    ImmutableDeck(cards=[ImmutableCard(rank='Q', suit='♡'), ImmutableCard(rank='A', suit='♠')])
    >>> deck.cards[0] = ImmutableCard('7', '♢')
    >>> deck
    ImmutableDeck(cards=[ImmutableCard(rank='7', suit='♢'), ImmutableCard(rank='A', suit='♠')])

To avoid this, make sure all fields of an immutable data class use immutable types (but remember that types are not enforced at runtime). The ImmutableDeck should be implemented using a tuple instead of a list.

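One way this could look is sketched below. It is an illustration rather than the tutorial's own listing, and it relies on the caller actually passing a tuple, since the annotation is not checked at runtime:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ImmutableCard:
    rank: str
    suit: str


@dataclass(frozen=True)
class ImmutableDeck:
    # A tuple does not support item assignment, so the cards cannot be swapped.
    cards: Tuple[ImmutableCard, ...]


deck = ImmutableDeck((ImmutableCard('Q', '♡'), ImmutableCard('A', '♠')))
```

Trying deck.cards[0] = ImmutableCard('7', '♢') on this version raises a TypeError, because tuples do not support item assignment.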

Inheritance

You can subclass data classes quite freely. As an example, we will extend our Position example with a country field and use it to record capitals:

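The listing is missing here. A minimal sketch of the two classes, assuming Position has no default values in this version:

```python
from dataclasses import dataclass


@dataclass
class Position:
    name: str
    lon: float
    lat: float


@dataclass
class Capital(Position):
    # New fields are appended after the inherited ones.
    country: str
```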

In this simple example, everything works without a hitch:

>>> Capital('Oslo', 10.8, 59.9, 'Norway')
Capital(name='Oslo', lon=10.8, lat=59.9, country='Norway')

The country field of Capital is added after the three original fields in Position. Things get a little more complicated if any fields in the base class have default values:

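The failing listing is missing here. The sketch below reproduces the situation, assuming lon and lat default to 0.0. The try/except wrapper is added only so the snippet runs to completion; the class definitions alone are enough to trigger the error:

```python
from dataclasses import dataclass


@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0


try:
    @dataclass
    class Capital(Position):
        country: str  # no default, but it follows fields that have defaults
except TypeError as exc:
    # The error is raised while the class is being created, not on instantiation.
    error_message = str(exc)
    print(error_message)
```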

This code will immediately crash with a TypeError complaining that “non-default argument ‘country’ follows default argument.” The problem is that our new country field has no default value, while the lon and lat fields have default values. The data class will try to write an .__init__() method with the following signature:

def __init__(name: str, lon: float = 0.0, lat: float = 0.0, country: str):
    ...

However, this is not valid Python. If a parameter has a default value, all following parameters must also have a default value. In other words, if a field in a base class has a default value, then all new fields added in a subclass must have default values as well.

Another thing to be aware of is how fields are ordered in a subclass. Starting with the base class, fields are ordered in the order in which they are first defined. If a field is redefined in a subclass, its order does not change. For example, if you define Position and Capital as follows:

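The definitions are missing here. A sketch consistent with the result described next, where the 'Unknown' default for country is an assumption:

```python
from dataclasses import dataclass, fields


@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0


@dataclass
class Capital(Position):
    country: str = 'Unknown'  # assumed default; some default is required here
    lat: float = 40.0         # redefines lat but keeps its original position


print([f.name for f in fields(Capital)])
```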

Then the order of the fields in Capital will still be name, lon, lat, country. However, the default value of lat will be 40.0.

>>> Capital('Madrid', country='Spain')
Capital(name='Madrid', lon=0.0, lat=40.0, country='Spain')

Optimizing Data Classes

I’m going to end this tutorial with a few words about slots. Slots can be used to make classes faster and use less memory. Data classes have no explicit syntax for working with slots, but the normal way of creating slots works for data classes as well. (They really are just regular classes!)

Essentially, slots are defined using .__slots__ to list the variables on a class. Variables or attributes not present in .__slots__ may not be defined. Furthermore, a slots class may not have default values.

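As a minimal sketch, assuming the SimplePosition and SlotPosition classes used in the measurements below carry the usual name, lon, and lat fields, a regular data class and its slots counterpart could look like this:

```python
from dataclasses import dataclass


@dataclass
class SimplePosition:
    name: str
    lon: float
    lat: float


@dataclass
class SlotPosition:
    # Only these attribute names may exist on an instance.
    __slots__ = ['name', 'lon', 'lat']
    name: str
    lon: float
    lat: float
```

Setting an attribute that is not listed, such as slot.country = 'Spain', raises an AttributeError on a SlotPosition instance, while the same assignment succeeds on a SimplePosition instance.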

The benefit of adding such restrictions is that certain optimizations may be done. For instance, slots classes take up less memory, as can be measured using Pympler:

>>> from pympler import asizeof
>>> simple = SimplePosition('London', -0.1, 51.5)
>>> slot = SlotPosition('Madrid', -3.7, 40.4)
>>> asizeof.asizesof(simple, slot)
(440, 248)

Similarly, slots classes are typically faster to work with. The following example measures the speed of attribute access on a slots data class and a regular data class using timeit from the standard library.

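The timing listing is missing here. The sketch below shows one way to run such a comparison, assuming the same two classes as above. The namespace dictionary and the number of repetitions are choices made for this illustration, and the timings depend on the machine and the Python version:

```python
from dataclasses import dataclass
from timeit import timeit


@dataclass
class SimplePosition:
    name: str
    lon: float
    lat: float


@dataclass
class SlotPosition:
    __slots__ = ['name', 'lon', 'lat']
    name: str
    lon: float
    lat: float


simple = SimplePosition('Oslo', 10.8, 59.9)
slot = SlotPosition('Oslo', 10.8, 59.9)

# Give timeit access to the two instances through an explicit namespace.
namespace = {'simple': simple, 'slot': slot}
simple_time = timeit('simple.name', globals=namespace, number=1_000_000)
slot_time = timeit('slot.name', globals=namespace, number=1_000_000)

print(f'regular: {simple_time:.4f}s, slots: {slot_time:.4f}s')
```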

In this particular example, the slot class is about 35% faster.

Conclusion

Data classes are one of the new features of Python 3.7. With data classes, you do not have to write boilerplate code to get proper initialization, representation, and comparisons for your objects.

You have seen how to define your own data classes, as well as:

  • How to add default values to the fields in your data class
  • How to customize the ordering of data class objects
  • How to work with immutable data classes
  • How inheritance works for data classes

If you do not yet have Python 3.7, there is also a backport for Python 3.6. Go forth and write less code!

Translated from: https://www.pybloggers.com/2018/05/the-ultimate-guide-to-data-classes-in-python-3-7/
