namedtuple in the collections module

june_francis

Preface

Today I want to introduce another very practical tool: namedtuple, from the collections module in the Python standard library.
The official documentation defines it clearly:

Named tuples assign meaning to each position in a tuple and allow for more readable, self-documenting code. They can be used wherever regular tuples are used, and they add the ability to access fields by name instead of position index.

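In short, a namedtuple can be used anywhere a regular tuple is used, while being more readable. Instances have no per-instance dictionary, so they need no more memory than regular tuples.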
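
As a quick check of the claim above, here is a minimal sketch that compares a namedtuple instance with a plain tuple. The `Point` type and the values are only illustrative, and the exact byte sizes depend on the interpreter, so the code prints them without assuming particular numbers.

```python
import sys
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])   # illustrative type
p = Point(11, 22)
plain = (11, 22)

# A namedtuple instance is a real tuple, so tuple-based code accepts it.
print(isinstance(p, tuple))
print(p == plain)

# Instances carry no per-instance __dict__, which keeps them lightweight.
print(hasattr(p, '__dict__'))

# Sizes are implementation-specific; compare them instead of relying on exact values.
print(sys.getsizeof(p), sys.getsizeof(plain))
```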

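Typically, data read from a file or another data source (a database, for example) is converted into namedtuples, so that the meaning of each raw value is preserved in the data structure. This makes the data more readable and easier to work with.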

Main content

Let's learn how to use it through a few examples.

Syntax:
collections.namedtuple(typename, field_names, *, verbose=False, rename=False, module=None)

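Where:

typename: the name of the tuple subclass that namedtuple creates. The returned class is then used to create instances.

field_names: similar to dictionary keys. Each name gives access to the value at the corresponding index position. field_names can be a sequence of strings such as a list, or a single string in which the names are separated by whitespace and/or commas.

rename: if rename is True, invalid field names are automatically replaced with positional names of the form '_index', where index is the position of the name in field_names. Invalid names are those that are not valid Python identifiers, that are Python keywords, or that duplicate an earlier name. For example, ['abc', 'def', 'ghi', 'abc'] is converted to ['abc', '_1', 'ghi', '_3'], which eliminates the keyword def and the duplicate abc. With the default rename=False, such names raise ValueError.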
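
The accepted field_names formats and the effect of rename can be sketched as follows. The class names `A`, `B`, `C`, `Bad` and `Fixed` are made up for illustration.

```python
from collections import namedtuple

# Three equivalent ways to spell the same field names.
A = namedtuple('A', ['x', 'y'])   # list of strings
B = namedtuple('B', 'x y')        # whitespace-separated string
C = namedtuple('C', 'x, y')       # comma-separated string
print(A._fields == B._fields == C._fields)  # True

# Without rename, the keyword 'def' (and the duplicate 'abc') is rejected.
try:
    namedtuple('Bad', ['abc', 'def', 'ghi', 'abc'])
except ValueError as exc:
    print('rejected:', exc)

# With rename=True, the invalid names are replaced by positional names.
Fixed = namedtuple('Fixed', ['abc', 'def', 'ghi', 'abc'], rename=True)
print(Fixed._fields)  # ('abc', '_1', 'ghi', '_3')
```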

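The other two parameters, verbose and module, are rarely used and are not covered here (verbose was removed in Python 3.7). If you need them, see the official documentation: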
https://docs.python.org/3.6/library/collections.html

Let's start with a simple example

>>> from collections import namedtuple

>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(11, y=22)     # instantiate with positional or keyword arguments

>>> p[0] + p[1]             # elements are accessible by index, like a plain tuple
33

>>> x, y = p                # a namedtuple can be unpacked like a regular tuple
>>> x, y
(11, 22)

>>> p.x + p.y               # elements are also accessible by field name
33

>>> p                       # the name=value repr makes the data more readable
Point(x=11, y=22)

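Additional methods and attributes

Besides the methods and behavior inherited from tuple, a namedtuple supports three additional methods and two attributes: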

classmethod somenamedtuple._make(iterable)

 # Create a new instance from an existing sequence or iterable
 >>> t = [11, 22]
 >>> Point._make(t)
 Point(x=11, y=22)
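
To make the difference between `_make` and the normal constructor concrete, here is a small sketch. The text line `'3,4'` is hypothetical input chosen for illustration.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# _make takes ONE iterable; the constructor takes one argument per field.
from_iterable = Point._make([11, 22])
from_args = Point(*[11, 22])          # equivalent, using argument unpacking
print(from_iterable == from_args)

# Any iterable works, e.g. a generator converting the pieces of a text line.
line = '3,4'                          # hypothetical input
parsed = Point._make(int(part) for part in line.split(','))
print(parsed)

# Supplying the wrong number of items raises TypeError.
try:
    Point._make([1, 2, 3])
except TypeError as exc:
    print('error:', exc)
```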

somenamedtuple._asdict()

 # Convert a namedtuple instance to an OrderedDict (Python 3.8+ returns a regular dict instead)
 >>> p = Point(x=11, y=22)
 >>> p._asdict()
 OrderedDict([('x', 11), ('y', 22)])
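
One place where `_asdict()` is handy is serialization. The sketch below uses the standard `json` module as an illustration; it is not part of the original example. It behaves the same whether `_asdict()` returns an OrderedDict or a regular dict.

```python
import json
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(x=11, y=22)

d = p._asdict()
print(list(d.items()))    # [('x', 11), ('y', 22)]

# Serializing the namedtuple directly gives a JSON array, because it is a tuple.
print(json.dumps(p))      # [11, 22]

# Going through _asdict() keeps the field names.
print(json.dumps(d))      # {"x": 11, "y": 22}
```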

somenamedtuple._replace(**kwargs)

 # Return a new instance with the specified fields replaced by new values
 >>> p = Point(x=11, y=22)
 >>> p._replace(x=33)
 Point(x=33, y=22)
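
Because a namedtuple is immutable like any tuple, `_replace` does not modify the instance in place. A minimal sketch of that behavior:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(x=11, y=22)

# Direct assignment to a field is not allowed.
try:
    p.x = 33
except AttributeError as exc:
    print('cannot assign:', exc)

# _replace builds a new instance and leaves the original untouched.
q = p._replace(x=33)
print(p)        # Point(x=11, y=22)
print(q)        # Point(x=33, y=22)
print(p is q)   # False
```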

somenamedtuple._source

 # The _source attribute holds the Python source code of the generated class as a string (Python 3.6 and earlier; removed in Python 3.7)
 >>> p = Point(x=11, y=22)
 >>> p._source
 "from builtins import property as _property, tuple as _tuple\nfrom operator import itemgetter as _itemgetter\nfrom collections import OrderedDict\n\nclass Point(tuple):\n    'Point(x, y)'\n\n    __slots__ = ()\n\n    _fields = ('x', 'y')\n\n    def __new__(_cls, x, y):\n        'Create new instance of Point(x, y)'\n        return _tuple.__new__(_cls, (x, y))\n\n    @classmethod\n    def _make(cls, iterable, new=tuple.__new__, len=len):\n        'Make a new Point object from a sequence or iterable'\n        result = new(cls, iterable)\n        if len(result) != 2:\n            raise TypeError('Expected 2 arguments, got %d' % len(result))\n        return result\n\n    def _replace(_self, **kwds):\n        'Return a new Point object replacing specified fields with new values'\n        result = _self._make(map(kwds.pop, ('x', 'y'), _self))\n        if kwds:\n            raise ValueError('Got unexpected field names: %r' % list(kwds))\n        return result\n\n    def __repr__(self):\n        'Return a nicely formatted representation string'\n        return self.__class__.__name__ + '(x=%r, y=%r)' % self\n\n    def _asdict(self):\n        'Return a new OrderedDict which maps field names to their values.'\n        return OrderedDict(zip(self._fields, self))\n\n    def __getnewargs__(self):\n        'Return self as a plain tuple.  Used by copy and pickle.'\n        return tuple(self)\n\n    x = _property(_itemgetter(0), doc='Alias for field number 0')\n\n    y = _property(_itemgetter(1), doc='Alias for field number 1')\n\n"
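
Since `_source` only exists on older interpreters, code that has to run on several Python versions may want to look it up defensively. The following is one possible sketch, not an officially recommended pattern:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# getattr with a default avoids AttributeError where _source no longer exists.
source = getattr(Point, '_source', None)
if source is None:
    print('_source is not available on this Python version')
else:
    # Show only the first line of the generated source.
    print(source.splitlines()[0])
```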

somenamedtuple._fields

 # This attribute lists all field names of the namedtuple
 >>> p._fields            # note that the field names come back as a tuple
 ('x', 'y')
 
 >>> Color = namedtuple('Color', 'red green blue')
 >>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
 >>> Pixel(11, 22, 128, 255, 0)
 Pixel(x=11, y=22, red=128, green=255, blue=0)  
 
 # Use getattr() to read a field whose name is stored in a string
 >>> getattr(p, 'x')
 11

 # Convert a dictionary to a namedtuple with ** unpacking
 >>> d = {'x': 11, 'y': 22}
 >>> Point(**d)
 Point(x=11, y=22)
 
 # Subclass a namedtuple to add computed properties and a custom print format
 >>> class Point(namedtuple('Point', ['x', 'y'])):
 ...     __slots__ = ()
 ...     @property
 ...     def hypot(self):
 ...         return (self.x ** 2 + self.y ** 2) ** 0.5
 ...     def __str__(self):
 ...         return 'Point: x=%6.3f  y=%6.3f  hypot=%6.3f' % (self.x, self.y, self.hypot)
 
 >>> for p in Point(3, 4), Point(14, 5/7):
 ...     print(p)
 Point: x= 3.000  y= 4.000  hypot= 5.000
 Point: x=14.000  y= 0.714  hypot=14.018
 
 # Subclassing does not help with adding new stored fields; instead, create a new namedtuple type from _fields
 >>> Point3D = namedtuple('Point3D', Point._fields + ('z',))
 
 # Docstrings can be changed or extended by assigning to the __doc__ attributes
 >>> Book = namedtuple('Book', ['id', 'title', 'authors'])
 >>> Book.__doc__ += ': Hardcover book in active collection'
 >>> Book.id.__doc__ = '13-digit ISBN'
 >>> Book.title.__doc__ = 'Title of first printing'
 >>> Book.authors.__doc__ = 'List of authors sorted by last name'
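
Building on the `Point3D` definition above, the sketch below shows one way to turn an existing 2D point into the extended type. Using `z=0` as the new coordinate is an arbitrary choice for illustration.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
Point3D = namedtuple('Point3D', Point._fields + ('z',))
print(Point3D._fields)   # ('x', 'y', 'z')

# Unpack the old instance positionally and supply the new field by keyword.
p = Point(11, 22)
p3 = Point3D(*p, z=0)
print(p3)                # Point3D(x=11, y=22, z=0)

# Slicing returns a plain tuple, which still compares equal to the original.
print(p3[:2] == p)       # True
```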

Common use cases

Tuples returned by the csv and sqlite3 modules are often wrapped in a namedtuple, which makes the data more readable and more convenient to work with:

from collections import namedtuple

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

import csv
# In Python 3, csv.reader needs a text-mode file opened with newline=''
with open("employees.csv", newline="") as f:
    for emp in map(EmployeeRecord._make, csv.reader(f)):
        print(emp.name, emp.title)

import sqlite3
conn = sqlite3.connect('/companydata')
cursor = conn.cursor()
cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
for emp in map(EmployeeRecord._make, cursor.fetchall()):
    print(emp.name, emp.title)
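
The snippet above depends on an employees.csv file and an existing database. For experimenting without them, here is a self-contained sketch of the same pattern that uses an in-memory CSV string and an in-memory SQLite database. The sample rows (Ann, Bob) and the table schema are invented for illustration.

```python
import csv
import io
import sqlite3
from collections import namedtuple

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

# Hypothetical sample data standing in for employees.csv.
csv_text = 'Ann,34,Engineer,R&D,5\nBob,45,Manager,Sales,7\n'
csv_employees = [EmployeeRecord._make(row) for row in csv.reader(io.StringIO(csv_text))]
for emp in csv_employees:
    print(emp.name, emp.title)

# In-memory database standing in for the real one (schema is an assumption).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE employees '
             '(name TEXT, age INTEGER, title TEXT, department TEXT, paygrade INTEGER)')
conn.executemany('INSERT INTO employees VALUES (?, ?, ?, ?, ?)',
                 [('Ann', 34, 'Engineer', 'R&D', 5),
                  ('Bob', 45, 'Manager', 'Sales', 7)])
cursor = conn.execute('SELECT name, age, title, department, paygrade '
                      'FROM employees ORDER BY name')
db_employees = [EmployeeRecord._make(row) for row in cursor]
conn.close()
for emp in db_employees:
    print(emp.name, emp.age)
```

Note that csv.reader yields every value as a string, so `age` is `'34'` in the CSV records but the integer `34` in the SQLite records. Convert the fields explicitly if the types matter.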

For more details, see the official documentation:
https://docs.python.org/3.6/library/collections.html