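Dictionaries and Sets
Dictionary keys must be of a hashable type; values do not need to be hashable.
What is a hashable data type? An object is hashable if its hash value never changes during its lifetime, which requires it to implement __hash__(). A hashable object also needs an __eq__() method so that it can be compared with other keys. If two hashable objects compare equal, their hash values must also be equal.
The immutable built-in types (str, bytes, and the numeric types) are all hashable, and so is frozenset. A tuple is hashable only if every element it contains is hashable.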
>>> t = (1, 2, (1, 2))
>>> hash(t)
-43632441908829798
>>> t1 = (1, 2, [1, 2])
>>> hash(t1)
Traceback (most recent call last):
  File "", line 1, in
TypeError: unhashable type: 'list'
>>> t2 = (1, 2, frozenset([30, 40]))
>>> hash(t2)
985328935373711578
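The same rules apply to user-defined classes. The sketch below is only an illustration: Point is a hypothetical class, not something from the original notes. It defines __eq__ and __hash__ over the same fields, so equal objects share a hash value and act as the same dictionary key.

```python
class Point:
    """A hypothetical value object, treated as immutable by convention."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __eq__(self, other):
        # Two points are equal when their coordinates are equal.
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # Hash the same fields used by __eq__, so equal points hash alike.
        return hash((self._x, self._y))


p1 = Point(1, 2)
p2 = Point(1, 2)
visits = {p1: "first"}
visits[p2] = "second"   # p2 equals p1, so this overwrites the same entry
print(len(visits), visits[p1])
```

Note that a class defining __eq__ without __hash__ becomes unhashable, because Python then sets its __hash__ to None.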
Ways to create a dictionary
>>> a = dict(one=1,two=2,three=3)
>>> b = {"one": 1, "two": 2, "three": 3}
>>> c = dict(zip(["one", "two", "three"], [1, 2, 3]))
>>> d = dict([("two", 2), ("one", 1), ("three", 3)])
>>> e = dict({"three": 3, "one": 1, "two": 2})
>>> a == b == c == d == e
True
Dictionary comprehensions
>>> DIAL_CODES = [
... (86, 'China'),
... (91, 'India'),
... (1, 'United States'),
... (62, 'Indonesia'),
... (55, 'Brazil'),
... (92, 'Pakistan'),
... (880, 'Bangladesh'),
... (234, 'Nigeria'),
... (7, 'Russia'),
... (81, 'Japan')]
>>> country_code = {country: code for code, country in DIAL_CODES}
>>> country_code
{'China': 86, 'India': 91, 'United States': 1, 'Indonesia': 62, 'Brazil': 55, 'Pakistan': 92, 'Bangladesh': 880, 'Nigeria': 234, 'Russia': 7, 'Japan': 81}
>>> {code:country.upper() for country, code in country_code.items() if code < 66}
{1: 'UNITED STATES', 62: 'INDONESIA', 55: 'BRAZIL', 7: 'RUSSIA'}
Common mapping methods
A comparison of the methods offered by dict, collections.defaultdict, and collections.OrderedDict
Using setdefault
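One rough way to see how the three types differ is to compare their attribute names. This is only a sketch: it lists names, not behavior, and the exact output may vary between Python versions.

```python
from collections import OrderedDict, defaultdict

# Attribute names that each subclass adds on top of plain dict.
only_in_defaultdict = set(dir(defaultdict)) - set(dir(dict))
only_in_ordereddict = set(dir(OrderedDict)) - set(dir(dict))

print(sorted(only_in_defaultdict))   # includes default_factory and __missing__
print(sorted(only_in_ordereddict))   # includes move_to_end
```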
>>> lookup = {}
>>> lookup.setdefault(1, {}).setdefault(2, []).append(3)
>>> lookup
{1: {2: [3]}}
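A typical use of setdefault is building an index in a single step. The sketch below uses a made-up word list and compares it with the longhand version, which needs a separate membership test.

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

index = {}
for word in words:
    # Fetch the list for this letter, creating it first if it is missing.
    index.setdefault(word[0], []).append(word)

# The equivalent longhand version.
longhand = {}
for word in words:
    if word[0] not in longhand:
        longhand[word[0]] = []
    longhand[word[0]].append(word)

print(index)
```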
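Mappings with flexible key lookup
If a key is not in the mapping, we may still want a default value when reading it. There are two ways to do this. The first is to use defaultdict.
The second is to define a subclass of dict that implements __missing__, which d[key] lookups call when the key cannot be found.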
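A minimal sketch of both approaches follows. ZeroDict is a hypothetical name chosen for illustration.

```python
from collections import defaultdict

# Approach 1: defaultdict calls its default_factory for missing keys.
counts = defaultdict(int)
for ch in "banana":
    counts[ch] += 1          # a missing key starts from int() == 0


# Approach 2: a dict subclass with __missing__ (ZeroDict is a made-up name).
class ZeroDict(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ only, that is, for d[key] lookups.
        return 0


z = ZeroDict(a=1)
print(counts["a"], z["a"], z["missing"])
```

In both cases the default applies only to d[key]. Methods such as get and the in operator do not trigger it, and this __missing__ does not store the key, whereas defaultdict does insert the default it creates.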
Dictionary variants
collections.OrderedDict
collections.ChainMap
collections.Counter
Subclassing UserDict
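A short sketch of what each of these three variants is typically used for. The data and variable names are invented for illustration.

```python
from collections import ChainMap, Counter, OrderedDict

# OrderedDict: keeps insertion order and can reorder keys explicitly.
od = OrderedDict(a=1, b=2, c=3)
od.move_to_end("a")                 # "a" becomes the last key

# ChainMap: searches several mappings in order; the first hit wins.
defaults = {"color": "red", "size": "M"}
overrides = {"color": "blue"}
settings = ChainMap(overrides, defaults)

# Counter: counts hashable items.
letters = Counter("abracadabra")
top_two = letters.most_common(2)

print(list(od), settings["color"], settings["size"], top_two)
```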
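import collections


class StrKeyDict(collections.UserDict):

    def __missing__(self, key):
        # A str key that is missing really is missing.
        if isinstance(key, str):
            raise KeyError(key)
        # Otherwise retry the lookup with the key converted to str.
        return self[str(key)]

    def __contains__(self, key):
        # Keys are always stored as str, so check self.data directly.
        return str(key) in self.data

    def __setitem__(self, key, value):
        # Convert every key to str on the way in.
        self.data[str(key)] = value
Testing it: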
>>> d = StrKeyDict([["2", "two"], ["4", "four"]])
>>> d
{'2': 'two', '4': 'four'}
>>> d["2"]
'two'
>>> d[4]
'four'
>>> d[1]
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python37\lib\collections\__init__.py", line 1024, in __getitem__
    return self.__class__.__missing__(self, key)
  File "D:\learning\blogs\python\流畅python\第三天\learning3.py", line 9, in __missing__
    return self[str(key)]
  File "C:\Python37\lib\collections\__init__.py", line 1024, in __getitem__
    return self.__class__.__missing__(self, key)
  File "D:\learning\blogs\python\流畅python\第三天\learning3.py", line 8, in __missing__
    raise KeyError(key)
KeyError: '1'
>>> 2 in d
True
>>> 1 in d
False
>>> "2" in d
True
>>> d.get(1)
>>> d.get("2")
'two'
>>> d.get("1")
>>> d.get("1", "N/A")
'N/A'
Immutable mapping types
>>> from types import MappingProxyType
>>> d = {1: "A"}
>>> d_proxy = MappingProxyType(d)
>>> d_proxy
mappingproxy({1: 'A'})
>>> d_proxy[1]
'A'
>>> d_proxy[2]
Traceback (most recent call last):
  File "", line 1, in
KeyError: 2
>>> d[2] = "B"
>>> d_proxy
mappingproxy({1: 'A', 2: 'B'})
>>> d_proxy[2]
'B'
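The types module provides a wrapper class called MappingProxyType. Given a mapping, it returns a read-only view of that mapping. The view is read-only but dynamic: any change made to the original mapping is visible through the view, yet the original mapping cannot be modified through it.
Sets
Mathematical set operations
Set comparison operators, which return a Boolean
Other set methods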
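The REPL session above only reads through the proxy. As a small complementary sketch, the code below attempts a write through the proxy, which raises TypeError, while a write to the original dict still shows through.

```python
from types import MappingProxyType

d = {1: "A"}
d_proxy = MappingProxyType(d)

try:
    d_proxy[2] = "x"            # item assignment is not supported
except TypeError as exc:
    write_error = exc
else:
    write_error = None

d[2] = "B"                      # changes to the original show through
print(type(write_error).__name__, d_proxy[2])
```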
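A brief sketch of the infix operators for the mathematical set operations, using two small example sets:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b                 # elements in a or b
intersection = a & b          # elements in both
difference = a - b            # elements in a but not in b
symmetric = a ^ b             # elements in exactly one of them

# The in-place versions update the left operand.
c = set(a)
c |= b

print(union, intersection, difference, symmetric)
```

Each operator has a method counterpart (union, intersection, difference, symmetric_difference) that also accepts any iterable, not only sets.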
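For illustration, a few of the comparison operators applied to two example sets. Every result is a bool.

```python
small = {1, 2}
big = {1, 2, 3}

is_subset = small <= big          # every element of small is in big
is_proper_subset = small < big    # subset and not equal
is_superset = big >= small        # big contains every element of small
is_member = 2 in small            # membership test
are_disjoint = small.isdisjoint({8, 9})   # no elements in common

print(is_subset, is_proper_subset, is_superset, is_member, are_disjoint)
```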
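A small sketch of some of the remaining set methods. The selection here is illustrative rather than complete.

```python
s = {1, 2}
s.add(3)            # add one element
s.discard(99)       # remove if present; no error otherwise
s.remove(1)         # remove; raises KeyError if absent

try:
    s.remove(1)     # 1 is already gone
except KeyError:
    removed_twice = False
else:
    removed_twice = True

snapshot = s.copy()             # shallow copy
s.clear()                       # remove every element
frozen = frozenset(snapshot)    # immutable, hashable version

print(snapshot, s, frozen)
```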