【转】collections模块defaultdict()和namedtuple()

最新推荐文章于 2024-07-22 00:12:13 发布

bro_two

最新推荐文章于 2024-07-22 00:12:13 发布

阅读量190

点赞数

参考资料
廖雪峰python教程
 再谈collections模块defaultdict()和namedtuple()
defaultdict()和namedtuple()是collections模块里面2个实用的扩展类型。一个继承自dict系统内置类型，一个继承自tuple系统内置类型。在扩展的同时都添加了额外的特性，在特定的场合都很实用。

defaultdict()

使用dict时，如果引用的Key不存在，就会抛出KeyError。如果希望key不存在时，返回一个默认值，就可以用defaultdict：

>>> from collections import defaultdict
>>> dd = defaultdict(lambda: 'N/A')
>>> dd['key1'] = 'abc'
>>> dd['key1'] # key1存在
'abc'
>>> dd['key2'] # key2不存在，返回默认值
'N/A'

注意默认值是调用函数返回的，而函数在创建defaultdict对象时传入。

除了在Key不存在时返回默认值，defaultdict的其他行为跟dict是完全一样的。

namedtuple

namedtuple是继承自ttuple的子类。namedtuple和tuple比，有更多特性。namedtuple创建一个和tuple类似的对象，并且规定了tuple元素的个数，并可以用属性而不是索引来引用tuple的某个元素。用namedtuple可以很方便地定义一种数据类型，它具备tuple的不变性，又可以根据属性来引用。

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.x
1
>>> p.y
2
>>> p[0]
1

class TPoint(tuple):
    'TPoint(x, y)'

    __slots__ = ()

    _fields = ('x', 'y')

    def __new__(_cls, x, y):
        'Create new instance of TPoint(x, y)'
        return _tuple.__new__(_cls, (x, y))

    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new TPoint object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != 2:
            raise TypeError('Expected 2 arguments, got %d' % len(result))
        return result

    def __repr__(self):
        'Return a nicely formatted representation string'
        return self.__class__.__name__ + '(x=%r, y=%r)' % self

    def _asdict(self):
        'Return a new OrderedDict which maps field names to their values'
        return OrderedDict(zip(self._fields, self))

    __dict__ = property(_asdict)

    def _replace(_self, **kwds):
        'Return a new TPoint object replacing specified fields with new values'
        result = _self._make(map(kwds.pop, ('x', 'y'), _self))
        if kwds:
            raise ValueError('Got unexpected field names: %r' % list(kwds))
        return result

    def __getnewargs__(self):
        'Return self as a plain tuple.  Used by copy and pickle.'
        return tuple(self)

    x = _property(_itemgetter(0), doc='Alias for field number 0')

    y = _property(_itemgetter(1), doc='Alias for field number 1')

这里就显示出了namedtuple的一些方法。很明显的看到namedtuple是直接继承自tuple的。

几个重要的方法：

1.把数据变成namedtuple类：

>>> TPoint = namedtuple('TPoint', ['x', 'y'])
>>> t = [11, 22]
>>> p = TPoint._make(t)
>>> p
TPoint(x=11, y=22)

2. 根据namedtuple创建的类生成的类示例，其数据是只读的，如果要进行更新需要调用方法_replace.

>>> p
TPoint(x=11, y=22)
>>> p.y
22
>>> p.y = 33
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    p.y = 33
AttributeError: can't set attribute
>>> p._replace(y=33)
TPoint(x=11, y=33)

3.将字典数据转换成namedtuple类型

>>> d = {'x': 44, 'y': 55}
>>> dp = TPoint(**d)
>>> dp
TPoint(x=44, y=55)

4.namedtuple最常用还是出现在处理来csv或者数据库返回的数据上。利用map()函数和namedtuple建立类型的_make（）方法。

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

import csv
for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))):
    print(emp.name, emp.title)

# sqlite数据库
import sqlite3
conn = sqlite3.connect('/companydata')
cursor = conn.cursor()
cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
for emp in map(EmployeeRecord._make, cursor.fetchall()):
    print(emp.name, emp.title)

# MySQL 数据库
import mysql
from mysql import connector
from collections import namedtuple
user = 'herbert'
pwd = '######'
host = '127.0.0.1'
db = 'world'
cnx = mysql.connector.connect(user=user, password=pwd, host=host,database=db)
cur.execute("SELECT Name, CountryCode, District, Population FROM CITY where CountryCode = 'CHN' AND Population > 500000")
CityRecord = namedtuple('City', 'Name, Country, Dsitrict, Population')
for city in map(CityRecord._make, cur.fetchall()):
    print(city.Name, city.Population)