Python Study Notes (9): Dict

Code and content are taken from Fluent Python by Luciano Ramalho.

A dict can be built in any of the following ways:

>>> a = dict(one=1,two=2,three=3)
>>> b = {'one':1,'two':2,'three':3}
>>> c = dict(zip(['one','two','three'],[1,2,3]))
>>> d = dict([('two',2),('one',1),('three',3)])
>>> e = dict({'three':3,'one':1,'two':2})
>>> a == b == c == d == e
True
>>> DIAL_CODES = [
    (86,'China'),
    (91,'India'),
    (1,'United States'),
    (62,'Indonesia'),
    (55,'Brazil'),
    (92,'Pakistan'),
    (880,'Bangladesh'),
    (234,'Nigeria'),
    (7,'Russia'),
    (81,'Japan')
]

Since Python 2.7, dict comprehensions can be used to build dict instances. The example below builds a dict from a list of tuples with a comprehension:

>>> country_code = {country:code for code,country in DIAL_CODES}
>>> country_code
{'Bangladesh': 880,
 'Brazil': 55,
 'China': 86,
 'India': 91,
 'Indonesia': 62,
 'Japan': 81,
 'Nigeria': 234,
 'Pakistan': 92,
 'Russia': 7,
 'United States': 1}
>>> {code:country.upper() for country,code in country_code.items() if code<66}
{1: 'UNITED STATES', 7: 'RUSSIA', 55: 'BRAZIL', 62: 'INDONESIA'}

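Handling missing keys with setdefault

Accessing a dict with d[k] raises KeyError when k is not an existing key. In that case d.get(k,default) is an alternative. However, when the value found also has to be updated, this approach is awkward and inefficient, as the following example shows: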

"""index0.py"""

import sys
import re

WORD_RE = re.compile(r'\w+')  # raw string avoids an invalid-escape warning

index = {}
with open(sys.argv[1],encoding='utf-8') as fp:
    for line_no, line in enumerate(fp,1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            column_no = match.start()+1
            location = (line_no, column_no)
            #this is ugly; coded like this to make a point
            occurrences = index.get(word,[]) 
            occurrences.append(location)
            index[word] = occurrences

#print in alphabetical order

for word in sorted(index, key=str.upper):
    print(word,index[word])
>>> import this  # prints the text that was saved as zen.txt
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>> %run index0.py zen.txt
a [(19, 48), (20, 53)]
Although [(11, 1), (16, 1), (18, 1)]
ambiguity [(14, 16)]
and [(15, 23)]
are [(21, 12)]
aren [(10, 15)]
at [(16, 38)]
bad [(19, 50)]
be [(15, 14), (16, 27), (20, 50)]
beats [(11, 23)]
Beautiful [(3, 1)]
better [(3, 14), (4, 13), (5, 11), (6, 12), (7, 9), (8, 11), (17, 8), (18, 25)]
......

In the index0.py listing above, the three lines that handle occurrences can be replaced by a single line using dict.setdefault. The improved index0.py is:

"""index0.py"""

import sys
import re

WORD_RE = re.compile(r'\w+')  # raw string avoids an invalid-escape warning

index = {}
with open(sys.argv[1],encoding='utf-8') as fp:
    for line_no, line in enumerate(fp,1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            column_no = match.start()+1
            location = (line_no, column_no)
            #this is where the code is improved
            index.setdefault(word,[]).append(location)

#print in alphabetical order

for word in sorted(index, key=str.upper):
    print(word,index[word])

setdefault returns the value stored under the key, so the value can be updated in place without a second key lookup.
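To make the lookup count concrete, here is a minimal sketch that builds the same index three ways. The `words` sample data and the variable names are made up for illustration and are not part of the original script.

```python
# Hypothetical sample data: (word, location) pairs, not taken from zen.txt.
words = [('is', (1, 11)), ('better', (1, 14)), ('is', (2, 10))]

# 1. get + reassignment: one lookup to read, another to store.
index_get = {}
for word, location in words:
    occurrences = index_get.get(word, [])
    occurrences.append(location)
    index_get[word] = occurrences

# 2. Explicit membership test: two or three lookups per word.
index_test = {}
for word, location in words:
    if word not in index_test:
        index_test[word] = []
    index_test[word].append(location)

# 3. setdefault: a single lookup. It returns the list stored under the key,
#    inserting the empty list first if the key was missing.
index_setdefault = {}
for word, location in words:
    index_setdefault.setdefault(word, []).append(location)

print(index_get == index_test == index_setdefault)
print(index_setdefault)
```

All three loops produce the same dict; only the number of key lookups per word differs.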


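defaultdict: another way to handle missing keys

The principle is similar to setdefault. When the key being looked up does not exist, defaultdict calls its default_factory (list in this example) to create an empty list, assigns it to the new key, and returns it. This avoids the KeyError and lets the value be updated right away. For example: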

"""index_default.py"""

import sys
import re
import collections

WORD_RE = re.compile(r'\w+')  # raw string avoids an invalid-escape warning

index = collections.defaultdict(list)
with open(sys.argv[1],encoding='utf-8') as fp:
    for line_no, line in enumerate(fp,1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            column_no = match.start()+1
            location = (line_no, column_no)
            index[word].append(location)

#print in alphabetical order

for word in sorted(index, key=str.upper):
    print(word,index[word])
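One detail is easy to miss: the factory runs only for subscript access such as index[word]. The short sketch below, using made-up keys, suggests how get and the in operator behave on a defaultdict.

```python
import collections

index = collections.defaultdict(list)

# Subscript access on a missing key calls list(), stores the result
# under the key, and returns it, so append works immediately.
index['zen'].append((1, 5))

# get() and the in operator do not call the factory,
# so they leave the mapping unchanged.
missing_via_get = index.get('python')
has_python = 'python' in index

print(dict(index))
print(missing_via_get)
print(has_python)
```

After this runs, only 'zen' is a key. Neither the get call nor the membership test created an entry for 'python'.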

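The __missing__ method

What actually makes defaultdict work is a special method named __missing__, which all standard mapping types support. If you define a subclass of dict and provide a __missing__ method, the standard dict.__getitem__ calls it whenever a key is not found, instead of raising KeyError. __missing__ is called only by __getitem__ (the d[k] operator). It has no effect on get or __contains__ (the in operator), which is why the StrKeyDict0 class below overrides both.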

class StrKeyDict0(dict):

    def __missing__(self,key):
        if isinstance(key,str):
            raise KeyError(key)
        return self[str(key)]

    def get(self,key,default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def __contains__(self,key):
        return key in self.keys() or str(key) in self.keys()
>>> d = StrKeyDict0([('2','two'),('4','four')])
>>> d['2']
'two'
>>> d[4]
'four'
>>> d[1]
---------------------------------------------------------------------------
KeyError                  Traceback (most recent call last)
...
KeyError: '1'
>>> d.get('2')
'two'
>>> d.get(4)
'four'
>>> d.get(1,'N/A')
'N/A'
>>> 2 in d
True
>>> 1 in d
False
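To connect __missing__ back to defaultdict, the sketch below shows how a dict subclass could imitate defaultdict(list) using only __missing__. The class name ListDict is hypothetical, and this is not a claim about how collections.defaultdict is implemented internally.

```python
class ListDict(dict):
    """Hypothetical dict subclass that imitates defaultdict(list)."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when key is absent.
        # Store a new empty list under the key and return that same list.
        value = self[key] = []
        return value


d = ListDict()
d['a'].append(1)       # key missing: __missing__ creates the list first
d['a'].append(2)       # key present: __missing__ is not called
fallback = d.get('b')  # dict.get does not consult __missing__

print(d)
print(fallback)
```

Because get bypasses __missing__, the lookup of 'b' returns None and adds no key, which matches the reason StrKeyDict0 needs its own get.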