All of the code below was learned from the '菜鸟学python' WeChat official account.
1. OrderedDict
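from collections import OrderedDict
from string import ascii_lowercase

print(ascii_lowercase) # abcdefghijklmnopqrstuvwxyz
print(dict(zip(ascii_lowercase, range(1, 5)))) # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
# If we want the output to follow insertion order (a plain dict only guarantees this from Python 3.7 on)
d1 = OrderedDict(zip(ascii_lowercase, range(1, 5)))
print(d1) # OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])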
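Beyond keeping order, OrderedDict has a few order-aware features that a plain dict lacks. The following is a minimal sketch, reusing the first four letters as illustrative data, of `move_to_end` and of order-sensitive equality.

```python
from collections import OrderedDict

d1 = OrderedDict(zip("abcd", range(1, 5)))

# move_to_end reorders an existing key without changing its value
d1.move_to_end("a")
order_after_move = list(d1)          # ['b', 'c', 'd', 'a']

# last=False moves the key to the front instead
d1.move_to_end("d", last=False)
order_after_front = list(d1)         # ['d', 'b', 'c', 'a']

# Plain dicts ignore order when comparing; OrderedDicts do not
same_items_plain = {"x": 1, "y": 2} == {"y": 2, "x": 1}              # True
same_items_ordered = OrderedDict(x=1, y=2) == OrderedDict(y=2, x=1)  # False
print(order_after_move, order_after_front, same_items_plain, same_items_ordered)
```

If order only matters for printing, a plain dict is usually enough on modern Python; OrderedDict is worth it when reordering or order-sensitive comparison is needed.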
2. defaultdict
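Solves the problem that dictionaries have no default values.
An ordinary dictionary has no default value. For example, when writing a crawler, we want each keyword we scrape to hold the scraped value if there is one, and a default value otherwise.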
# Approach 1
keys = ['movie_name', 'movie_link', 'rating_num', 'rating_people_num']
d = {}
for k in keys:
    d[k] = ''
print(d) # {'movie_name': '', 'movie_link': '', 'rating_num': '', 'rating_people_num': ''}
# Approach 2
d = {k: '' for k in keys}
print(d) # {'movie_name': '', 'movie_link': '', 'rating_num': '', 'rating_people_num': ''}
# Approach 3
d = {}
for k in keys:
    d.setdefault(k, '')
print(d) # {'movie_name': '', 'movie_link': '', 'rating_num': '', 'rating_people_num': ''}
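The three approaches above share a drawback: they only pre-fill keys we already know about, so accessing a key that does not exist still raises a KeyError instead of returning a default value.
And what if the data structure is more complex, for example the dictionary holds lists rather than simple strings?
This is where defaultdict comes in.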
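To make the drawback concrete, here is a small sketch of what happens with a pre-filled plain dict. The key 'director' is a hypothetical one that was never pre-filled.

```python
keys = ['movie_name', 'movie_link', 'rating_num', 'rating_people_num']
d = {k: '' for k in keys}

# Reading a key that was never pre-filled raises KeyError
try:
    d['director']  # hypothetical key, not in `keys`
    missing_raises = False
except KeyError:
    missing_raises = True

# dict.get returns a fallback, but it has to be repeated at every call site
fallback = d.get('director', '')
print(missing_raises, repr(fallback))
```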
from collections import defaultdict

# For example, build a student grade book whose default score is 60
students = defaultdict(lambda: 60)
students['jim'] = 100
students['jack'] = 80
students['rose'] = 70
print(students) # defaultdict(<function <lambda> at 0x02E06270>, {'jim': 100, 'jack': 80, 'rose': 70})
Now suppose we look up the score of a student who is not in the grade book:
print(students['sam']) # 60
print(students) # defaultdict(<function <lambda> at 0x02CE6270>, {'jim': 100, 'jack': 80, 'rose': 70, 'sam': 60})
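Note that the lookup itself inserted 'sam' into the dictionary. As a rough sketch of when that happens and when it does not, the snippet below compares `[]` indexing with `.get()` and `in` on the same kind of grade book.

```python
from collections import defaultdict

students = defaultdict(lambda: 60)
students['jim'] = 100

# .get() and `in` do not call the default factory, so nothing is inserted
via_get = students.get('sam')        # None
has_sam = 'sam' in students          # False
size_before = len(students)          # 1

# Indexing with [] does call the factory and stores the result
via_index = students['sam']          # 60
size_after = len(students)           # 2
print(via_get, has_sam, size_before, via_index, size_after)
```

So when a read must not grow the dictionary, `.get()` or an `in` check is the safer choice.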
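If our data is a dictionary whose values are sets, we only need to set the dictionary's default value to set.
After that we only have to care about adding data, not about initializing each entry.
d = defaultdict(set)
names = ['jim', 'jack', 'rose']
for index, n in enumerate(names):
    d[n].add(('180cm', index))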
print(d) # defaultdict(<class 'set'>, {'jim': {('180cm', 0)}, 'jack': {('180cm', 1)}, 'rose': {('180cm', 2)}})
How do we group the items of a list of pairs by category, in other words merge the entries that share a key?
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list)
for k, v in s:
    d[k].append(v)
print(d)
# defaultdict(<class 'list'>, {'yellow': [1, 3], 'blue': [2, 4], 'red': [1]})
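The same pattern works for summing instead of collecting. A minimal sketch, assuming we want the total per color rather than the list of values, uses `int` as the default factory.

```python
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

# int() returns 0, so every new key starts from zero
totals = defaultdict(int)
for k, v in s:
    totals[k] += v

print(dict(totals))  # {'yellow': 4, 'blue': 6, 'red': 1}
```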
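3. namedtuple
Most of the time we like to wrap data in a class, and classes do make a program easier to extend.
But is there something more lightweight and simpler than a class that still lets us access data the way a class does?
Suppose we have a data structure like this:
students = (name, age, weight)
To read this tuple we have to go through indexes, which is inconvenient. If the tuple is long it is easy to pick the wrong index, and the code is hard to extend once the field order changes.
Building a class for it feels like overkill.
This is exactly the problem namedtuple solves.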
from collections import namedtuple

Students = namedtuple('Students', ('name', 'age', 'weight'))
s1 = Students(name='jack', age=18, weight='70kg')
print(s1) # Students(name='jack', age=18, weight='70kg')
# Now we can read each field the same way we would read an attribute of a class instance
print(s1.name) # jack
print(s1.age) # 18
print(s1.weight) # 70kg
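A namedtuple is still a tuple underneath. The sketch below, reusing the same Students definition, illustrates that indexing and unpacking keep working, that fields are read-only, and that `_replace` and `_asdict` give a modified copy and a dict view.

```python
from collections import namedtuple

Students = namedtuple('Students', ('name', 'age', 'weight'))
s1 = Students(name='jack', age=18, weight='70kg')

# Still a tuple: indexing and unpacking keep working
name, age, weight = s1
first = s1[0]                        # 'jack'

# Fields are read-only; assignment raises AttributeError
try:
    s1.age = 19
    is_immutable = False
except AttributeError:
    is_immutable = True

# _replace returns a modified copy, _asdict converts to a mapping
s2 = s1._replace(age=19)
as_dict = s1._asdict()
print(first, is_immutable, s2, dict(as_dict))
```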
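Print a deck of cards. The ranks run from 2 to 10 plus J, Q, K, A, and the suits are diamonds, clubs, spades and hearts.
First, look at the code:
Card = namedtuple('Card', ['rank', 'suit'])
# With the line above, Card has two fields: rank and suit
beer_card = Card('7', 'diamonds')
print(beer_card.rank) # 7
print(beer_card.suit) # diamonds
The full code: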
ranks = [str(n) for n in range(2, 11)] + list('JQKA')
suits = 'spades diamonds clubs hearts'.split()
cards = [Card(rank, suit) for rank in ranks for suit in suits]
print(cards)
'''
[
Card(rank='2', suit='spades'),Card(rank='2', suit='diamonds'), Card(rank='2', suit='clubs'), Card(rank='2', suit='hearts'),
Card(rank='3', suit='spades'), Card(rank='3', suit='diamonds'), Card(rank='3', suit='clubs'), Card(rank='3', suit='hearts'),
Card(rank='4', suit='spades'), Card(rank='4', suit='diamonds'), Card(rank='4', suit='clubs'), Card(rank='4', suit='hearts'),
Card(rank='5', suit='spades'), Card(rank='5', suit='diamonds'), Card(rank='5', suit='clubs'), Card(rank='5', suit='hearts'),
Card(rank='6', suit='spades'), Card(rank='6', suit='diamonds'), Card(rank='6', suit='clubs'), Card(rank='6', suit='hearts'),
Card(rank='7', suit='spades'), Card(rank='7', suit='diamonds'), Card(rank='7', suit='clubs'), Card(rank='7', suit='hearts'),
Card(rank='8', suit='spades'), Card(rank='8', suit='diamonds'), Card(rank='8', suit='clubs'), Card(rank='8', suit='hearts'),
Card(rank='9', suit='spades'), Card(rank='9', suit='diamonds'), Card(rank='9', suit='clubs'), Card(rank='9', suit='hearts'),
Card(rank='10', suit='spades'), Card(rank='10', suit='diamonds'), Card(rank='10', suit='clubs'), Card(rank='10', suit='hearts'),
Card(rank='J', suit='spades'), Card(rank='J', suit='diamonds'), Card(rank='J', suit='clubs'), Card(rank='J', suit='hearts'),
Card(rank='Q', suit='spades'), Card(rank='Q', suit='diamonds'), Card(rank='Q', suit='clubs'), Card(rank='Q', suit='hearts'),
Card(rank='K', suit='spades'), Card(rank='K', suit='diamonds'), Card(rank='K', suit='clubs'), Card(rank='K', suit='hearts'),
Card(rank='A', suit='spades'), Card(rank='A', suit='diamonds'), Card(rank='A', suit='clubs'), Card(rank='A', suit='hearts')
]
'''
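Once the deck is a list of Card tuples, ordinary list tools apply to it. As a small sketch (the seeded `random.Random(0)` and the 5-card hand are assumptions for illustration, not part of the original notes), we can check the deck size, filter by suit, and deal a hand.

```python
import random
from collections import namedtuple

Card = namedtuple('Card', ['rank', 'suit'])
ranks = [str(n) for n in range(2, 11)] + list('JQKA')
suits = 'spades diamonds clubs hearts'.split()
cards = [Card(rank, suit) for rank in ranks for suit in suits]

deck_size = len(cards)                               # 13 ranks * 4 suits = 52
hearts = [c for c in cards if c.suit == 'hearts']    # 13 cards

# Deal a hypothetical 5-card hand without modifying the deck
rng = random.Random(0)                               # fixed seed, illustration only
hand = rng.sample(cards, 5)
print(deck_size, len(hearts), hand)
```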
4. Counter
from collections import Counter

dest = [
('新疆', 138),
('拉萨', 180),
('稻城', 137),
('青海湖', 133),
('色达', 124),
('亚丁', 123),
('敦煌', 116),
('伊犁', 113),
('张掖', 99),
('西藏', 87),
('呼伦贝尔', 71),
('西宁', 69)]
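# Counter(dest) would count each (name, number) tuple exactly once, so every count would be 1.
# Building the Counter from a dict makes the numbers act as the counts.
originnates = Counter(dict(dest)).most_common(3)
print(originnates)
# [('拉萨', 180), ('新疆', 138), ('稻城', 137)]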
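Counter is most often fed an iterable of individual items, which it tallies itself. A minimal sketch, using a hypothetical list of visits with one entry per visit:

```python
from collections import Counter

# Hypothetical list of visited destinations, one entry per visit
visits = ['拉萨', '新疆', '拉萨', '稻城', '新疆', '拉萨']

c = Counter(visits)
top2 = c.most_common(2)              # [('拉萨', 3), ('新疆', 2)]

# Missing elements count as 0 instead of raising KeyError
unseen = c['西宁']                    # 0

# update() adds further observations to the existing counts
c.update(['稻城', '稻城'])
print(top2, unseen, c['稻城'])        # 稻城 is now 3
```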
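5. deque
A double-ended queue. With a list, append and pop are usually only efficient at the tail; if we want to add and remove items at both ends of a sequence, deque is very useful.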
from collections import deque
d = deque()
d.append(3)
d.append(9)
d.append(1)
print(d) # deque([3, 9, 1])
# Regular pops
d.pop() # pop from the tail
d.popleft() # pop from the head
d.appendleft(100)
print(d) # deque([100, 9])
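Two more deque features come up often: a bounded length and rotation. The sketch below is only an illustration with made-up values.

```python
from collections import deque

# With maxlen set, appending to a full deque drops items from the opposite end
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(recent)  # deque([2, 3, 4], maxlen=3)

# rotate(n) shifts items n steps to the right (negative n shifts left)
d = deque([1, 2, 3, 4, 5])
d.rotate(2)
print(d)  # deque([4, 5, 1, 2, 3])

# extendleft adds items one by one at the head, so they end up reversed
d.extendleft(['a', 'b'])
print(d)  # deque(['b', 'a', 4, 5, 1, 2, 3])
```

A bounded deque like `recent` is a handy way to keep only the last few items of a stream.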