itertools用于高效循环的迭代函数集合, 以后有什么复杂的迭代应该就考虑它.
itertools.groupby(iterable[, key])
返回一个产生按照key进行分组后的值集合的迭代器.
如果iterable在多次连续迭代中生成了同一项,则会定义一个组,如果将此函数应用一个分类列表,那么分组将定义该列表中的所有唯一项,key(如果已提供)是一个函数,应用于每一项,如果此函数存在返回值,该值将用于后续项而不是该项本身进行比较,此函数返回的迭代器生成元素(key, group),其中key是分组的键值,group是迭代器,生成组成该组的所有项。
即:按照keyfunc函数对序列每个元素执行后的结果分组(每个分组是一个迭代器), 返回这些分组的迭代器
源码
class groupby(object):
# [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
# [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
def __init__(self, iterable, key=None):
if key is None:
key = lambda x: x
self.keyfunc = key
self.it = iter(iterable)
self.tgtkey = self.currkey = self.currvalue = object()
def __iter__(self):
return self
def next(self):
while self.currkey == self.tgtkey:
self.currvalue = next(self.it) # Exit on StopIteration
self.currkey = self.keyfunc(self.currvalue)
self.tgtkey = self.currkey
return (self.currkey, self._grouper(self.tgtkey))
def _grouper(self, tgtkey):
while self.currkey == tgtkey:
yield self.currvalue
self.currvalue = next(self.it) # Exit on StopIteration
self.currkey = self.keyfunc(self.currvalue)
应用1
from itertools import groupby
qs = [{'date' : 1},{'date' : 2}]
[(name, list(group)) for name, group in groupby(qs, lambda p:p['date'])]
Out[77]: [(1, [{'date': 1}]), (2, [{'date': 2}])]
>>> from itertools import *
>>> a = ['aa', 'ab', 'abc', 'bcd', 'abcde']
>>> for i, k in groupby(a, len):
... print i, list(k)
...
2 ['aa', 'ab']
3 ['abc', 'bcd']
5 ['abcde']
应用2
from itertools import *
from operator import itemgetter
d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
di = sorted(d.iteritems(), key=itemgetter(1))
for k, g in groupby(di, key=itemgetter(1)):
print k, map(itemgetter(0), g)
1 ['a', 'c', 'e']
2 ['b', 'd', 'f']
3 ['g']