List operations

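1. To control the order of elements in a dictionary, use the OrderedDict class from the collections module. When you iterate over it, it preserves the order in which the elements were inserted.
Example: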
from collections import OrderedDict
d = OrderedDict()
d['foo'] = 1
d['bar'] = 2
# Iterate over the keys in insertion order
for key in d:
    print(key, d[key])
# Outputs "foo 1", "bar 2"
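Note: an OrderedDict is roughly twice the size of a regular dictionary, because it internally maintains an additional linked list to track the key order. Since Python 3.7 the built-in dict also preserves insertion order.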
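One place where explicit ordering tends to matter is serialization. The sketch below is only an illustration: it uses the standard-library json module and an extra key 'spam' that is not part of the example above, and shows the insertion order carrying through to the encoded output. move_to_end() is an OrderedDict method for reordering an existing key.

```python
import json
from collections import OrderedDict

d = OrderedDict()
d['foo'] = 1
d['bar'] = 2
d['spam'] = 3  # extra key, added only for this illustration

# Keys are serialized in insertion order
encoded = json.dumps(d)
print(encoded)  # {"foo": 1, "bar": 2, "spam": 3}

# move_to_end() relocates an existing key to the end
# (or to the front with last=False)
d.move_to_end('foo')
print(list(d))  # ['bar', 'spam', 'foo']
```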
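2. To find the minimum (min), the maximum (max), or a sorted order (sorted) over the values of a dictionary, you usually first invert the keys and values with zip(), then compute the minimum, maximum, or sorted result on the resulting (value, key) pairs.
Example: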
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}
min_price = min(zip(prices.values(), prices.keys()))
# min_price is (10.75, 'FB')
max_price = max(zip(prices.values(), prices.keys()))
# max_price is (612.78, 'AAPL')
prices_sorted = sorted(zip(prices.values(), prices.keys()))
# prices_sorted is [(10.75, 'FB'), (37.2, 'HPQ'), (45.23, 'ACME'), (205.55, 'IBM'), (612.78, 'AAPL')]
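If only the key is needed, a key function passed to min() or max() may be simpler than zip(). The following sketch also shows two details of the zip() approach: equal values fall back to comparing the keys, and a zip() iterator can be consumed only once. The dictionary tied is a made-up example.

```python
prices = {'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55, 'HPQ': 37.20, 'FB': 10.75}

# key= makes min()/max() compare by value but return the key
cheapest = min(prices, key=lambda k: prices[k])
dearest = max(prices, key=lambda k: prices[k])
print(cheapest, prices[cheapest])  # FB 10.75
print(dearest, prices[dearest])    # AAPL 612.78

# With zip(), pairs are compared as tuples, so equal values
# are ordered by their keys (hypothetical data)
tied = {'AAA': 45.23, 'ZZZ': 45.23}
lowest_tied = min(zip(tied.values(), tied.keys()))
highest_tied = max(zip(tied.values(), tied.keys()))
print(lowest_tied)   # (45.23, 'AAA')
print(highest_tied)  # (45.23, 'ZZZ')

# A zip() iterator is exhausted after one pass
pairs = zip(prices.values(), prices.keys())
print(min(pairs))  # (10.75, 'FB')
leftover = list(pairs)
print(leftover)    # []
```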
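3. To find what two dictionaries have in common (shared keys or shared key-value pairs), perform set operations on the results of their keys() or items() methods.
Example: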
a = {
    'x' : 1,
    'y' : 2,
    'z' : 3
}

b = {
    'w' : 10,
    'x' : 11,
    'y' : 2
}
# Find keys in common
a.keys() & b.keys() # { 'x', 'y' }
# Find keys in a that are not in b
a.keys() - b.keys() # { 'z' }
# Find (key,value) pairs in common
a.items() & b.items() # { ('y', 2) }
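The same set operations can be used to filter a dictionary. The sketch below builds a new dictionary without selected keys; the names c and shared_values are introduced here for illustration. Note that values() does not support set operations directly, so the values are converted to sets first.

```python
a = {'x': 1, 'y': 2, 'z': 3}
b = {'w': 10, 'x': 11, 'y': 2}

# Build a new dict from a, leaving out the keys 'z' and 'w'
c = {key: a[key] for key in a.keys() - {'z', 'w'}}
print(c == {'x': 1, 'y': 2})  # True

# values() views are not set-like; convert them explicitly
shared_values = set(a.values()) & set(b.values())
print(shared_values)  # {2}
```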

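4. To find the most frequently occurring elements in a sequence, use the collections.Counter class, which is designed for exactly this kind of problem; its most_common() method gives you the answer directly.
Example: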
words = [
    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
    'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
    'my', 'eyes', "you're", 'under'
]
from collections import Counter
word_counts = Counter(words)
# The three most frequent words
top_three = word_counts.most_common(3)
print(top_three)
# Outputs [('eyes', 8), ('the', 5), ('look', 4)]
print(word_counts['not'])  # 1
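To make clear what Counter does, the following sketch does the same tally by hand with a plain dict and compares the result. The list sample is a shortened, made-up input, not the words list above.

```python
from collections import Counter

sample = ['look', 'into', 'my', 'eyes', 'look', 'eyes']  # hypothetical input

# Manual tally with a plain dict
tally = {}
for word in sample:
    tally[word] = tally.get(word, 0) + 1

counts = Counter(sample)
print(dict(counts) == tally)  # True

# Unlike a plain dict, a Counter returns 0 for a missing key
missing = counts['missing']
print(missing)  # 0
```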
A lesser-known feature of Counter instances is that they can easily be combined using mathematical operations.
Example:
morewords = ['why', 'are', 'you', 'not', 'looking', 'in', 'my', 'eyes']
a = Counter(words)
b = Counter(morewords)
# Combine counts
c = a + b
print(c)
# Counter({'eyes': 9, 'the': 5, 'look': 4, 'my': 4, 'into': 3, 'not': 2,
#          'around': 2, "you're": 1, "don't": 1, 'in': 1, 'why': 1,
#          'looking': 1, 'are': 1, 'under': 1, 'you': 1})
# Subtract counts
d = a - b
print(d)
# Counter({'eyes': 7, 'the': 5, 'look': 4, 'into': 3, 'my': 2, 'around': 2,
#          "you're": 1, "don't": 1, 'under': 1})
# (entries with equal counts may print in a different order)
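Besides the + and - operators, a Counter can also be changed in place. The following is a small sketch with made-up word lists; it contrasts the update() and subtract() methods with the - operator, which drops counts that are zero or negative.

```python
from collections import Counter

counts = Counter(['look', 'into', 'my', 'eyes'])  # hypothetical input

# update() adds counts in place
counts.update(['my', 'eyes', 'eyes'])
print(counts['eyes'])  # 3

# subtract() also works in place and keeps negative counts
counts.subtract(['look', 'look'])
print(counts['look'])  # -1

# The - operator returns a new Counter with only positive counts
positive = counts - Counter()
print('look' in positive)  # False
```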
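5. To sort a list of dictionaries by one or more dictionary fields, use the itemgetter function from the operator module, which makes sorting this kind of data structure very easy.
Example: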
rows = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}
]
from operator import itemgetter
rows_by_fname = sorted(rows, key=itemgetter('fname'))
rows_by_uid = sorted(rows, key=itemgetter('uid'))
print(rows_by_fname)
print(rows_by_uid)
The code prints the following (keys appear in insertion order on Python 3.7+):
[{'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
 {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}]
[{'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
 {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
 {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}]
itemgetter() also accepts multiple keys, as in the following code:
rows_by_lfname = sorted(rows, key=itemgetter('lname', 'fname'))
print(rows_by_lfname)
This produces the following output:
[{'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
 {'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}]
itemgetter() can sometimes be replaced by a lambda expression, for example:
rows_by_fname = sorted(rows, key=lambda r: r['fname'])
rows_by_lfname = sorted(rows, key=lambda r: (r['lname'], r['fname']))
itemgetter and lambda work just as well with functions such as min() and max().
Example:
min(rows, key=itemgetter('uid'))  # {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}
min(rows, key=lambda r: r['uid'])  # {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}
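It may help to look at what itemgetter() actually returns. The sketch below calls the resulting functions on a single row; the names get_uid and get_name are introduced here for illustration. With several keys the result is a tuple, and tuples compare element by element, which is why sorting by ('lname', 'fname') uses fname to break ties.

```python
from operator import itemgetter

row = {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}

# itemgetter() returns a callable that looks up the given key(s)
get_uid = itemgetter('uid')
get_name = itemgetter('lname', 'fname')
print(get_uid(row))   # 1003
print(get_name(row))  # ('Jones', 'Brian')

# Tuples compare element by element, so the second field
# decides the order when the first fields are equal
big_first = ('Jones', 'Big') < ('Jones', 'Brian')
print(big_first)  # True
```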
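6. To iterate over records in groups based on a particular field such as date, use the itertools.groupby() function, which is very practical for this kind of grouping.
Example: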
rows = [
    {'address': '5412 N CLARK', 'date': '07/01/2012'},
    {'address': '5148 N CLARK', 'date': '07/04/2012'},
    {'address': '5800 E 58TH', 'date': '07/02/2012'},
    {'address': '2122 N CLARK', 'date': '07/03/2012'},
    {'address': '5645 N RAVENSWOOD', 'date': '07/02/2012'},
    {'address': '1060 W ADDISON', 'date': '07/02/2012'},
    {'address': '4801 N BROADWAY', 'date': '07/01/2012'},
    {'address': '1039 W GRANVILLE', 'date': '07/04/2012'},
]
from operator import itemgetter
from itertools import groupby
# Sort by the desired field first
rows.sort(key=itemgetter('date'))
# Iterate in groups
for date, items in groupby(rows, key=itemgetter('date')):
    print(date)
    for i in items:
        print(' ', i)
Output:
07/01/2012
  {'address': '5412 N CLARK', 'date': '07/01/2012'}
  {'address': '4801 N BROADWAY', 'date': '07/01/2012'}
07/02/2012
  {'address': '5800 E 58TH', 'date': '07/02/2012'}
  {'address': '5645 N RAVENSWOOD', 'date': '07/02/2012'}
  {'address': '1060 W ADDISON', 'date': '07/02/2012'}
07/03/2012
  {'address': '2122 N CLARK', 'date': '07/03/2012'}
07/04/2012
  {'address': '5148 N CLARK', 'date': '07/04/2012'}
  {'address': '1039 W GRANVILLE', 'date': '07/04/2012'}
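Note: a very important preparatory step is to sort the data by the field you want to group on. groupby() only examines consecutive elements, so if the data has not been sorted first, the grouping will not produce the result you want.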
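The effect of skipping the sort can be seen in the sketch below, which uses a shortened, made-up subset of the rows. It also shows one possible alternative, assuming you only need the records collected per date and do not need them sorted: a collections.defaultdict of lists.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# Shortened, unsorted sample data for illustration
rows = [
    {'address': '5412 N CLARK', 'date': '07/01/2012'},
    {'address': '5148 N CLARK', 'date': '07/04/2012'},
    {'address': '4801 N BROADWAY', 'date': '07/01/2012'},
]

# Without sorting, the two 07/01/2012 rows are not adjacent,
# so groupby() reports that date as two separate groups
unsorted_keys = [date for date, _ in groupby(rows, key=itemgetter('date'))]
print(unsorted_keys)  # ['07/01/2012', '07/04/2012', '07/01/2012']

# A defaultdict(list) collects rows per date without any sorting
rows_by_date = defaultdict(list)
for row in rows:
    rows_by_date[row['date']].append(row)

print(len(rows_by_date['07/01/2012']))  # 2
```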
