Python创建不可变的类实例

看了几天的Python可变、不可变、哈希相关的东西,有些地方还是有点懵。

比如:

Python中类的实例是可变的,但是它却能作为字典的key。而字典的key因该是不可变的,不然键的哈希值就不固定了。

我的理解感觉字典的key更准确的说应该是可以哈希的就行,未必一定不可变,只要key有固定的的哈希值,就能找到唯一对应的value,所以类虽然是可变的,但是它是可哈希的,所以可以作为字典的键。

再比如:

Python中元组不可变的,所以可以作为字典的键,但是某些特殊元组就不能作为字典的键,例如(1,[1])这样的元组。

因为样的元组是不可哈希的,所以不能作为字典的key。

官方文档有说明:不可变容器(例如元组和 frozenset)仅当它们的元素均为可哈希时才是可哈希的。

跑题了:

这里简单记录下如何实现一个不可变的类:

class It(object):
    __slots__ = ['vals']

    def __init__(self, vals):
        super().__setattr__('vals',vals)

    def __setattr__(self, key, value):
        raise AttributeError('不可变类')

    def __eq__(self, other):
        # return id(self) == id(other)
        return self.vals == other.vals
    def __hash__(self):
        # return 1
        # return random.randint(0,100)
        return hash(';'.join(self.vals))
        # return hash(id(self))

核心就是__init__方法和__setattr__方法,

__init__方法中调用父类的__setattr__方法来实现实例的初始化,同时重写自己的__setattr__方法,其实就是禁用了__setattr__,让实例创建后不能修改属性,也不能添加新属性。

至于__slots__可有可无,他的作用就是声明一下实例只能有vals这一个属性。

 

参考:

https://stackoverflow.com/questions/42452953/python-3-user-defined-immutable-class-objects

http://www.blog.pythonlibrary.org/2014/01/17/how-to-create-immutable-classes-in-python/

https://www.liaoxuefeng.com/wiki/1016959663602400/1017501655757856

 

关于哈希的官方文档: 

hashable

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

Most of Python’s immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not; immutable containers (such as tuples and frozensets) are only hashable if their elements are hashable. Objects which are instances of user-defined classes are hashable by default. They all compare unequal (except with themselves), and their hash value is derived from their id().

 

hashable -- 可哈希

一个对象的哈希值如果在其生命周期内绝不改变,就被称为 可哈希 (它需要具有 __hash__() 方法),并可以同其他对象进行比较(它需要具有 __eq__() 方法)。可哈希对象必须具有相同的哈希值比较结果才会相同。

可哈希性使得对象能够作为字典键或集合成员使用,因为这些数据结构要在内部使用哈希值。

大多数 Python 中的不可变内置对象都是可哈希的;可变容器(例如列表或字典)都不可哈希;不可变容器(例如元组和 frozenset)仅当它们的元素均为可哈希时才是可哈希的。 用户定义类的实例对象默认是可哈希的。 它们在比较时一定不相同(除非是与自己比较),它们的哈希值的生成是基于它们的 id()

object.__hash__(self)

Called by built-in function hash() and for operations on members of hashed collections including setfrozenset, and dict__hash__() should return an integer. The only required property is that objects which compare equal have the same hash value; it is advised to mix together the hash values of the components of the object that also play a part in comparison of objects by packing them into a tuple and hashing the tuple. Example:

def __hash__(self):
    return hash((self.name, self.nick, self.color))

Note

 

hash() truncates the value returned from an object’s custom __hash__() method to the size of a Py_ssize_t. This is typically 8 bytes on 64-bit builds and 4 bytes on 32-bit builds. If an object’s __hash__() must interoperate on builds of different bit sizes, be sure to check the width on all supported builds. An easy way to do this is with python -c "import sys; print(sys.hash_info.width)".

If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections. If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).

User-defined classes have __eq__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns an appropriate value such that x == y implies both that x is y and hash(x) == hash(y).

A class that overrides __eq__() and does not define __hash__() will have its __hash__() implicitly set to None. When the __hash__() method of a class is None, instances of the class will raise an appropriate TypeError when a program attempts to retrieve their hash value, and will also be correctly identified as unhashable when checking isinstance(obj, collections.abc.Hashable).

If a class that overrides __eq__() needs to retain the implementation of __hash__() from a parent class, the interpreter must be told this explicitly by setting __hash__ = <ParentClass>.__hash__.

If a class that does not override __eq__() wishes to suppress hash support, it should include __hash__ = None in the class definition. A class which defines its own __hash__() that explicitly raises a TypeError would be incorrectly identified as hashable by an isinstance(obj, collections.abc.Hashable) call.

Note

 

By default, the __hash__() values of str and bytes objects are “salted” with an unpredictable random value. Although they remain constant within an individual Python process, they are not predictable between repeated invocations of Python.

This is intended to provide protection against a denial-of-service caused by carefully-chosen inputs that exploit the worst case performance of a dict insertion, O(n^2) complexity. See http://www.ocert.org/advisories/ocert-2011-003.html for details.

Changing hash values affects the iteration order of sets. Python has never made guarantees about this ordering (and it typically varies between 32-bit and 64-bit builds).

See also PYTHONHASHSEED.

Changed in version 3.3: Hash randomization is enabled by default.

hash(object)

Return the hash value of the object (if it has one). Hash values are integers. They are used to quickly compare dictionary keys during a dictionary lookup. Numeric values that compare equal have the same hash value (even if they are of different types, as is the case for 1 and 1.0).

Note

 

For objects with custom __hash__() methods, note that hash() truncates the return value based on the bit width of the host machine. See __hash__() for details.

object.__hash__(self)

通过内置函数 hash() 调用以对哈希集的成员进行操作,属于哈希集的类型包括 setfrozenset 以及 dict__hash__() 应该返回一个整数。对象比较结果相同所需的唯一特征属性是其具有相同的哈希值;建议的做法是把参与比较的对象全部组件的哈希值混在一起,即将它们打包为一个元组并对该元组做哈希运算。例如:

def __hash__(self):
    return hash((self.name, self.nick, self.color))

注解

 

hash() 会从一个对象自定义的 __hash__() 方法返回值中截断为 Py_ssize_t 的大小。通常对 64 位构建为 8 字节,对 32 位构建为 4 字节。如果一个对象的 __hash__() 必须在不同位大小的构建上进行互操作,请确保检查全部所支持构建的宽度。做到这一点的简单方法是使用 python -c "import sys; print(sys.hash_info.width)"

如果一个类没有定义 __eq__() 方法,那么也不应该定义 __hash__() 操作;如果它定义了 __eq__() 但没有定义 __hash__(),则其实例将不可被用作可哈希集的项。如果一个类定义了可变对象并实现了 __eq__() 方法,则不应该实现 __hash__(),因为可哈希集的实现要求键的哈希集是不可变的(如果对象的哈希值发生改变,它将处于错误的哈希桶中)。

用户定义的类默认带有 __eq__() 和 __hash__() 方法;使用它们与任何对象(自己除外)比较必定不相等,并且 x.__hash__() 会返回一个恰当的值以确保 x == y 同时意味着 x is y 且 hash(x) == hash(y)

一个类如果重载了 __eq__() 且没有定义 __hash__() 则会将其 __hash__() 隐式地设为 None。当一个类的 __hash__() 方法为 None 时,该类的实例将在一个程序尝试获取其哈希值时正确地引发 TypeError,并会在检测 isinstance(obj, collections.abc.Hashable) 时被正确地识别为不可哈希对象。

如果一个重载了 __eq__() 的类需要保留来自父类的 __hash__() 实现,则必须通过设置 __hash__ = <ParentClass>.__hash__ 来显式地告知解释器。

如果一个没有重载 __eq__() 的类需要去掉哈希支持,则应该在类定义中包含 __hash__ = None。一个自定义了 __hash__() 以显式地引发 TypeError 的类会被 isinstance(obj, collections.abc.Hashable) 调用错误地识别为可哈希对象。

注解

 

在默认情况下,str 和 bytes 对象的 __hash__() 值会使用一个不可预知的随机值“加盐”。 虽然它们在一个单独 Python 进程中会保持不变,但它们的值在重复运行的 Python 间是不可预测的。

这种做法是为了防止以下形式的拒绝服务攻击:通过仔细选择输入来利用字典插入操作在最坏情况下的执行效率即 O(n^2) 复杂度。详情见 http://www.ocert.org/advisories/ocert-2011-003.html

改变哈希值会影响集合的迭代次序。Python 也从不保证这个次序不会被改变(通常它在 32 位和 64 位构建上是不一致的)。

另见 PYTHONHASHSEED.

在 3.3 版更改: 默认启用哈希随机化。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值