groupby(iterable [,key]):
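Creates an iterator that groups consecutive items produced by iterable, detecting repeated items as it goes.
A group is formed whenever iterable yields the same item on several consecutive iterations. Only adjacent items are compared, so applying this function to a sorted list produces one group for each unique item in the list.
key, if provided, is a function applied to each item; its return value, rather than the item itself, is then used for the comparison with subsequent items.
The iterator returned by this function yields pairs (key, group), where key is the key value of the group and group is an iterator that yields all the items making up that group.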
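To make the point about consecutive items concrete, here is a minimal sketch. The list letters is hypothetical sample data, not part of the original examples; it only illustrates that unsorted input can produce the same key more than once, while sorted input produces one group per unique item.

```python
from itertools import groupby

# Hypothetical sample data: equal values are not all adjacent.
letters = ['a', 'a', 'b', 'a', 'c', 'c']

# Unsorted input: 'a' shows up as two separate groups.
unsorted_groups = [(key, list(group)) for key, group in groupby(letters)]
print(unsorted_groups)
# [('a', ['a', 'a']), ('b', ['b']), ('a', ['a']), ('c', ['c', 'c'])]

# Sorted input: each unique value forms exactly one group.
sorted_groups = [(key, list(group)) for key, group in groupby(sorted(letters))]
print(sorted_groups)
# [('a', ['a', 'a', 'a']), ('b', ['b']), ('c', ['c', 'c'])]
```

When a key function is used, the input should normally be sorted with that same key function before it is grouped.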
An example:
from itertools import groupby
from operator import itemgetter

# The records are already ordered by their first field (the date).
things = [('2009-09-02', 'mars', 1, 11),
          ('2009-09-02', 'mars', 1, 3),
          ('2009-09-03', 'ma', 2, 10),
          ('2009-09-03', 'ma', 2, 4),
          ('2009-09-03', 'ma', 2, 22),
          ('2009-09-06', 'm', 3, 33)]

# Group consecutive records that share the same date (field 0).
for key, items in groupby(things, itemgetter(0)):
    print(key)
    for subitem in items:
        print(subitem)
    print('-' * 20)
The output looks like:
2009-09-02
('2009-09-02', 'mars', 1, 11)
('2009-09-02', 'mars', 1, 3)
--------------------
2009-09-03
('2009-09-03', 'ma', 2, 10)
('2009-09-03', 'ma', 2, 4)
('2009-09-03', 'ma', 2, 22)
--------------------
2009-09-06
('2009-09-06', 'm', 3, 33)
--------------------
Another example:
# Group on the first two fields together (date and name).
key_func = lambda t: (t[0], t[1])
for key, items in groupby(things, key_func):
    print(key)
    for subitem in items:
        print(subitem)
    print('-' * 20)
Output:
('2009-09-02', 'mars')
('2009-09-02', 'mars', 1, 11)
('2009-09-02', 'mars', 1, 3)
--------------------
('2009-09-03', 'ma')
('2009-09-03', 'ma', 2, 10)
('2009-09-03', 'ma', 2, 4)
('2009-09-03', 'ma', 2, 22)
--------------------
('2009-09-06', 'm')
('2009-09-06', 'm', 3, 33)
--------------------
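Because each group is an iterator rather than a list, its items need to be copied if they are to be kept after the loop moves on. The following is a minimal sketch of one way to do that, collecting the groups into a dict. The names records and by_date, and the sample values, are hypothetical and chosen only for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical records shaped like (date, name), already sorted by date.
records = [('2009-09-02', 'mars'),
           ('2009-09-02', 'venus'),
           ('2009-09-03', 'ma')]

# Copy each group into a list while it is current; the group iterator
# is no longer usable once groupby moves on to the next key.
by_date = {}
for key, items in groupby(records, itemgetter(0)):
    by_date[key] = list(items)

print(by_date['2009-09-02'])
# [('2009-09-02', 'mars'), ('2009-09-02', 'venus')]
print(by_date['2009-09-03'])
# [('2009-09-03', 'ma')]
```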