itertools: Analysis and Usage Examples

ash062

Article link: https://blog.csdn.net/ash062/article/details/123889213

Version, Functions, and Overview

Python version: 3.7.6

Names defined in itertools, and its help text:

import itertools

dir(itertools)

'''
['__doc__', '__loader__', '__name__', '__package__', '__spec__', '_grouper', '_tee',
'_tee_dataobject', 'accumulate', 'chain', 'combinations', 
'combinations_with_replacement', 'compress', 'count', 'cycle', 'dropwhile', 
'filterfalse', 'groupby', 'islice', 'permutations', 'product', 'repeat', 
'starmap', 'takewhile', 'tee', 'zip_longest']
'''
print(itertools.__doc__)

'''
Functional tools for creating and using iterators.

Infinite iterators:
count(start=0, step=1) --> start, start+step, start+2*step, ...
cycle(p) --> p0, p1, ... plast, p0, p1, ...
repeat(elem [,n]) --> elem, elem, elem, ... endlessly or up to n times

Iterators terminating on the shortest input sequence:
accumulate(p[, func]) --> p0, p0+p1, p0+p1+p2
chain(p, q, ...) --> p0, p1, ... plast, q0, q1, ...
chain.from_iterable([p, q, ...]) --> p0, p1, ... plast, q0, q1, ...
compress(data, selectors) --> (d[0] if s[0]), (d[1] if s[1]), ...
dropwhile(pred, seq) --> seq[n], seq[n+1], starting when pred fails
groupby(iterable[, keyfunc]) --> sub-iterators grouped by value of keyfunc(v)
filterfalse(pred, seq) --> elements of seq where pred(elem) is False
islice(seq, [start,] stop [, step]) --> elements from
       seq[start:stop:step]
starmap(fun, seq) --> fun(*seq[0]), fun(*seq[1]), ...
tee(it, n=2) --> (it1, it2 , ... itn) splits one iterator into n
takewhile(pred, seq) --> seq[0], seq[1], until pred fails
zip_longest(p, q, ...) --> (p[0], q[0]), (p[1], q[1]), ...

Combinatoric generators:
product(p, q, ... [repeat=1]) --> cartesian product
permutations(p[, r])
combinations(p, r)
combinations_with_replacement(p, r)
'''

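As shown above, itertools is a module for creating and using iterators, and it ships with Python's standard library. It provides the infinite iterators count, cycle, and repeat; iterators that terminate on the shortest input sequence, such as chain, chain.from_iterable, and compress; and the combinatoric generators product, permutations, combinations, and combinations_with_replacement.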

Functions

Infinite iterators

count

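Takes a start value (start) and a step (step) and returns a count object, an iterator that yields the terms of the arithmetic sequence start + 0 * step, start + 1 * step, start + 2 * step, ... in order. It looks like a possible replacement for a while loop that adds a fixed step on each pass.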

print(itertools.count.__doc__)

'''
count(start=0, step=1) --> count object

Return a count object whose .__next__() method returns consecutive values.
Equivalent to:

    def count(firstval=0, step=1):
        x = firstval
        while 1:
            yield x
            x += step
'''

it = itertools.count(start = 0, step = 2)

next(it)    # 0

next(it)    # 2

next(it)    # 4

# while-loop version (f is a placeholder for whatever defines the stop condition)

s, step = 0, 2

while True:
    if s > f(s):
        break
    ...
    s += step

# itertools.count version

s, step = 0, 2

for i in itertools.count(start=s, step=step):
    if i > f(i):
        break
    ...
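
Because a count object never ends on its own, it is usually paired with something that does end. The sketch below shows two such pairings: zip, which stops at its shortest input, and next over a generator expression, which stops at the first match. The names list and the threshold of 50 are made up for illustration.

```python
import itertools

# Hypothetical data, for illustration only.
names = ['alice', 'bob', 'carol']

# zip stops at the shortest input, so the infinite count is cut off safely.
numbered = list(zip(itertools.count(start=10, step=5), names))
print(numbered)  # [(10, 'alice'), (15, 'bob'), (20, 'carol')]

# Find the first even number whose square exceeds 50.
first = next(i for i in itertools.count(start=0, step=2) if i * i > 50)
print(first)  # 8
```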

cycle

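Takes an iterable and returns a cycle object, an iterator that yields its values over and over; passing 'abc' yields 'a', 'b', 'c', 'a', 'b', 'c', ... It seems usable for things like round-robin polling or timers, though it is also easy to implement by hand.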

print(itertools.cycle.__doc__)

'''
cycle(iterable) --> cycle object

Return elements from the iterable until it is exhausted.
Then repeat the sequence indefinitely.
'''

it = itertools.cycle([1, 2, 3])

while True:
    print(next(it))    # 1, 2, 3, 1, 2, 3...
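
As one possible take on the round-robin idea mentioned above, the sketch below hands out tasks to workers in rotation. The worker and task names are hypothetical; zip and islice are what keep the endless cycle bounded.

```python
import itertools

# Hypothetical worker and task names, used only for illustration.
workers = ['w1', 'w2', 'w3']
tasks = ['a', 'b', 'c', 'd', 'e']

# zip ends when tasks run out, so the endless cycle never runs away.
assignment = list(zip(tasks, itertools.cycle(workers)))
print(assignment)
# [('a', 'w1'), ('b', 'w2'), ('c', 'w3'), ('d', 'w1'), ('e', 'w2')]

# islice takes a fixed number of items from the endless iterator.
first_seven = list(itertools.islice(itertools.cycle('abc'), 7))
print(first_seven)  # ['a', 'b', 'c', 'a', 'b', 'c', 'a']
```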

repeat

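Takes an object. If the times argument is also given, the returned iterator yields the object that many times; otherwise it yields the object endlessly. Testing shows that when a mutable object such as a list is passed, every item yielded is a reference to the same object rather than a copy, so a change made through one item is visible through all the others.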

print(itertools.repeat.__doc__)

'''
repeat(object [,times]) -> create an iterator which returns the object
for the specified number of times.  If not specified, returns the object
endlessly.
'''

it1 = itertools.repeat(1)

while True:
    print(next(it1))    # 1, 1, 1, 1, 1...

it2 = itertools.repeat(1, 5)

while True:
    print(next(it2))    # 1, 1, 1, 1, 1, StopIteration

it3 = itertools.repeat([1, 2, 3])

a = next(it3)           # [1, 2, 3]

a[0] = 10

b = next(it3)           # [10, 2, 3]
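
The shared-reference behaviour can be checked directly with the is operator. The sketch below also shows one way to get independent copies, and a common use of repeat as a constant argument stream for map; the variable names are illustrative only.

```python
import itertools

# Every item yielded is the same object, not a copy.
it = itertools.repeat([1, 2, 3], 2)
a = next(it)
b = next(it)
print(a is b)  # True

# To get independent lists, build a new one per item.
copies = [list(x) for x in itertools.repeat([1, 2, 3], 2)]
print(copies[0] is copies[1])  # False

# repeat as a constant second argument for map: squares of 0..4.
# map stops when range(5) is exhausted, so the endless repeat is safe here.
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```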

Finite iterators

accumulate

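Accumulates over the iterable passed in and returns an accumulate object (an iterator); for example, [1, 2, 3, 4] yields 1, 3, 6, 10 in turn. If the func argument is also given, each new item is func(previous result, next element of the iterable). I have not come up with a concrete use case for it yet.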

print(itertools.accumulate.__doc__)

'''
accumulate(iterable[, func]) --> accumulate object

Return series of accumulated sums (or other binary function results).
'''

it = itertools.accumulate([1, 2, 3, 4])

while True:
    next(it)    # 1, 3, 6, 10, StopIteration

it = itertools.accumulate('abc')

while True:
    next(it)    # 'a', 'ab', 'abc', StopIteration

it = itertools.accumulate([1, 2, 3, 4], lambda x, y: x**2 + y**2)

while True:
    next(it)    # 1, 5, 34, 1172, StopIteration
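
A few situations where accumulate might be a natural fit are running totals, running maxima, and running products. The sketch below uses made-up numbers for each; it is only meant to suggest possible uses, not to be an exhaustive list.

```python
import itertools
import operator

# Hypothetical deposits and withdrawals: running account balance.
changes = [100, -30, 50, -80]
balances = list(itertools.accumulate(changes))
print(balances)  # [100, 70, 120, 40]

# Running maximum, using the built-in max as the binary function.
readings = [3, 1, 4, 1, 5, 9, 2, 6]
running_max = list(itertools.accumulate(readings, max))
print(running_max)  # [3, 3, 4, 4, 5, 9, 9, 9]

# Running product: the factorials of 1..5.
factorials = list(itertools.accumulate(range(1, 6), operator.mul))
print(factorials)  # [1, 2, 6, 24, 120]
```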

chain

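Takes any number of iterables and returns a chain object (an iterator) that yields the elements of each one in turn, linking several iterables into one. Not much more to say: it concatenates iterables.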

print(itertools.chain.__doc__)

'''
chain(*iterables) --> chain object

Return a chain object whose .__next__() method returns elements from the
first iterable until it is exhausted, then elements from the next
iterable, until all of the iterables are exhausted.
'''

it = itertools.chain([1, 2, 3], 'abc', (4, 43))

while True:
    next(it)    # 1, 2, 3, 'a', 'b', 'c', 4, 43, StopIteration

chain.from_iterable

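Essentially the same as chain above, but it takes a single argument, so the iterables to be joined can be put in a list (or any other iterable) and passed in together.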

print(itertools.chain.from_iterable.__doc__)

'''
chain.from_iterable(iterable) --> chain object

Alternate chain() constructor taking a single iterable argument
that evaluates lazily.
'''

it = itertools.chain.from_iterable([[1, 2, 3], 'abc'])

while True:
    next(it)    # 1, 2, 3, 'a', 'b', 'c'
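
A typical use of chain.from_iterable is flattening one level of nesting. Because its single argument is consumed lazily, that argument can itself be a generator, which the sketch below illustrates with a hypothetical rows() generator.

```python
import itertools

# Flatten one level of nesting; empty sub-lists simply contribute nothing.
nested = [[1, 2], [3], [], [4, 5]]
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]

def rows():
    """Hypothetical generator that produces iterables one at a time."""
    for i in range(3):
        yield range(i)  # range(0), range(1), range(2)

# The outer generator is consumed lazily, one inner iterable at a time.
flat_rows = list(itertools.chain.from_iterable(rows()))
print(flat_rows)  # [0, 0, 1]
```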

compress

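Takes an iterable data as the data and an iterable selectors as the selector, and returns an iterator over the data filtered by the selectors; that is, it yields an element of data only where the corresponding element of selectors is true. It saves a few keystrokes over the equivalent generator expression, though personally I would still pick the generator expression, since it is more familiar.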

print(itertools.compress.__doc__)

'''
compress(data, selectors) --> iterator over selected data

Return data elements corresponding to true selector elements.
Forms a shorter iterator from selected data elements using the
selectors to choose the data elements.
'''

it = itertools.compress([1, 2, 3, 4], [1, 0, 1, 0])

while True:
    next(it)    # 1, 3, StopIteration

# generator-expression version
data, selectors = [1, 2, 3, 4], [1, 0, 1, 0]

it = (data[i] for i in range(len(data)) if selectors[i])

while True:
    next(it)    # 1, 3, StopIteration
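
The index-based generator expression above only works on sequences that support len() and indexing. A zip-based variant, sketched below, avoids that restriction and behaves like compress on arbitrary iterables. The second half shows selectors that are themselves generated, here by cycle.

```python
import itertools

data, selectors = 'ABCDEF', [1, 0, 1, 0, 1, 1]

via_compress = list(itertools.compress(data, selectors))
# zip-based generator expression: no indexing, so it also works on plain iterators.
via_genexp = list(d for d, s in zip(data, selectors) if s)
print(via_compress)  # ['A', 'C', 'E', 'F']
print(via_genexp)    # ['A', 'C', 'E', 'F']

# Selectors can be generated too, e.g. keep the values at even positions.
# compress stops when data is exhausted, so the endless cycle is safe.
values = [10, 11, 12, 13, 14]
even_pos = list(itertools.compress(values, itertools.cycle([True, False])))
print(even_pos)  # [10, 12, 14]
```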

dropwhile

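Somewhat similar to the built-in filter: it takes a function and an iterable and returns a dropwhile object (an iterator) that yields the elements starting from the first one for which the function returns a false value. The difference from filter is that filter applies its function to every element, whereas the predicate passed to dropwhile acts more like a one-way switch: once it has returned false for the first time it is never called again, so only elements at the start can be dropped. A common example is printing a .py file line by line while skipping the comment block at the top, which lambda line: line.startswith('#') handles.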

print(itertools.dropwhile.__doc__)

'''
dropwhile(predicate, iterable) --> dropwhile object

Drop items from the iterable while predicate(item) is true.
Afterwards, return every element until the iterable is exhausted.
'''

it = itertools.dropwhile(lambda x: x % 2 == 1, [1, 2, 3, 4])

while True:
    next(it)    # 2, 3, 4, StopIteration

# comparison with filter
it = filter(lambda x: x % 2 == 1, [1, 2, 3, 4])

while True:
    next(it)    # 1, 3, StopIteration

# skip the leading comment lines of a source file
with open('.py', 'r') as f:
    for line in itertools.dropwhile(lambda line: line.startswith('#'), f):
        print(line, end='')    # each line already ends with a newline
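
To try the file example without a real file, an in-memory io.StringIO can stand in for it. The content below is hypothetical; the point is that a comment appearing after the first non-comment line is kept, because the predicate is no longer consulted by then.

```python
import io
import itertools

# In-memory stand-in for a source file (hypothetical content).
source = io.StringIO(
    "# header comment\n"
    "# another comment\n"
    "import os\n"
    "# a later comment is kept\n"
    "print(os.getcwd())\n"
)

body = list(itertools.dropwhile(lambda line: line.startswith('#'), source))
print(''.join(body), end='')
# import os
# # a later comment is kept
# print(os.getcwd())
```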

groupby

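Takes an iterable and returns a groupby object (an iterator) that yields pairs consisting of an element and a _grouper iterator over the run of adjacent equal elements. If the key argument is also given, grouping is done on the key function's return value instead. Because only adjacent elements are grouped, it is usually combined with sorting. I have not yet run into a real-world use for it myself.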

print(itertools.groupby.__doc__)

'''
groupby(iterable, key=None) -> make an iterator that returns consecutive
keys and groups from the iterable.  If the key function is not specified or
is None, the element itself is used for grouping.
'''

it = itertools.groupby([1, 1, 1, 2, 2, 3])

while True:
    key, itr = next(it)
    print(f'{key}: {list(itr)}')    # 1: [1, 1, 1]  2: [2, 2]  3: [3]  StopIteration

it = itertools.groupby([1, 1, 1, 2, 2, 3], lambda x: x <= 2)

while True:
    key, itr = next(it)
    print(f'{key}: {list(itr)}')    # True: [1, 1, 1, 2, 2]  False: [3]  StopIteration
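
The reason groupby is usually paired with sorting can be seen by running it on unsorted input: equal keys that are not adjacent end up in separate groups. The sketch below uses a made-up word list and a hypothetical first_letter helper as the key.

```python
import itertools

def first_letter(word):
    """Key function used for both sorting and grouping."""
    return word[0]

words = ['apple', 'bob', 'avocado', 'banana', 'cherry']

# Without sorting, equal keys that are not adjacent form separate groups.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(words, key=first_letter)]
print(unsorted_groups)
# [('a', ['apple']), ('b', ['bob']), ('a', ['avocado']), ('b', ['banana']), ('c', ['cherry'])]

# Sorting by the same key first yields exactly one group per key.
ordered = sorted(words, key=first_letter)
grouped = {k: list(g) for k, g in itertools.groupby(ordered, key=first_letter)}
print(grouped)
# {'a': ['apple', 'avocado'], 'b': ['bob', 'banana'], 'c': ['cherry']}
```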

filterfalse

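Takes a function (which may be None) and a sequence, and returns a filterfalse object (an iterator) that yields the elements for which the function returns a false value; if None is passed instead of a function, it yields the elements that are themselves false. It is the counterpart of the built-in filter: filterfalse drops the elements for which the function is true, while filter keeps them.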

print(itertools.filterfalse.__doc__)

'''
filterfalse(function or None, sequence) --> filterfalse object

Return those items of sequence for which function(item) is false.
If function is None, return the items that are false.
'''

it1 = itertools.filterfalse(lambda x: x > 2, [0, 1, 2, 3, 4, 5])
it2 = filter(lambda x: x > 2, [0, 1, 2, 3, 4, 5])

while True:
    next(it1)    # 0, 1, 2, StopIteration
    next(it2)    # 3, 4, 5

islice

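islice is essentially slicing (slice) for iterators: it takes an iterable together with stop, start, and step arguments and returns an islice object (an iterator) that yields the elements the slice selects. I find it quite handy, since it avoids the extra cost of converting to a list before slicing.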

print(itertools.islice.__doc__)

'''
islice(iterable, stop) --> islice object
islice(iterable, start, stop[, step]) --> islice object

Return an iterator whose next() method returns selected values from an
iterable.  If start is specified, will skip all preceding elements;
otherwise, start defaults to zero.  Step defaults to one.  If
specified as another value, step determines how many values are
skipped between successive calls.  Works like a slice() on a list
but returns an iterator.
'''

it1 = itertools.islice((i for i in range(10)), 5)

while True:
    next(it1)    # 0, 1, 2, 3, 4, StopIteration

it2 = itertools.islice((i for i in range(10)), 2, 8, 2)

while True:
    next(it2)    # 2, 4, 6, StopIteration
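
Three properties of islice are worth checking in practice: it works on infinite iterators, it consumes the underlying iterator as it goes, and, unlike list slicing, it rejects negative indices. The following is a small sketch of each; the variable names are illustrative.

```python
import itertools

# Take items 5..9 of an infinite iterator without building a list first.
window = list(itertools.islice(itertools.count(), 5, 10))
print(window)  # [5, 6, 7, 8, 9]

# islice consumes the underlying iterator; what it took is gone afterwards.
gen = (i * i for i in range(10))
head = list(itertools.islice(gen, 3))
rest = list(gen)
print(head)  # [0, 1, 4]
print(rest)  # [9, 16, 25, 36, 49, 64, 81]

# Negative indices are not supported and raise ValueError.
rejected = False
try:
    itertools.islice(range(10), -1)
except ValueError:
    rejected = True
print(rejected)  # True
```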

starmap

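Similar to map: it takes a function and a sequence and returns a starmap iterator that yields the function's return value for each element of the sequence, with that element unpacked into the function's arguments. To some extent this avoids having to unpack inside the function when map is given elements that bundle several arguments.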

print(itertools.starmap.__doc__)

'''
starmap(function, sequence) --> starmap object

Return an iterator whose values are returned from the function evaluated
with an argument tuple taken from the given sequence.
'''

def f1(x, y): return x**2 + y**2

def f2(p): return p[0]**2 + p[1]**2

it1 = itertools.starmap(f1, [(0, 1), (1, 2), (2, 3)])

it2 = map(f2, [(0, 1), (1, 2), (2, 3)])

while True:
    next(it1)    # 1, 5, 13, StopIteration
    next(it2)    # 1, 5, 13

takewhile

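Like dropwhile above, this takes a function and an iterable and returns a takewhile object (an iterator), which yields elements while the function returns true and stops the first time it returns false. To print only the leading comments and imports of a .py file, pass it lambda line: line.startswith('#') or line.startswith('import') or line.startswith('from') together with the open file.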

print(itertools.takewhile.__doc__)

'''
takewhile(predicate, iterable) --> takewhile object

Return successive entries from an iterable as long as the
predicate evaluates to true for each entry.
'''

it = itertools.takewhile(lambda x: x <= 3, [1, 2, 3, 4])

while True:
    next(it)    # 1, 2, 3, StopIteration

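# print the leading comments and imports of a .py file
with open('.py', 'r') as f:
    # takewhile needs the iterable (the open file) as its second argument
    for i in itertools.takewhile(lambda line: line.startswith('#') or line.startswith('import') or line.startswith('from'), f):
        print(i, end='')    # each line already ends with a newline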
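
One detail that is easy to miss: to find out that the predicate has failed, takewhile has to read the first failing element, and that element is then lost from the underlying iterator. The sketch below shows this with an in-memory io.StringIO standing in for a file; the content and the is_header helper are hypothetical.

```python
import io
import itertools

# In-memory stand-in for a source file (hypothetical content).
source = io.StringIO(
    "# header\n"
    "import os\n"
    "from sys import argv\n"
    "x = 1\n"
    "import re\n"
)

def is_header(line):
    # str.startswith accepts a tuple of prefixes.
    return line.startswith(('#', 'import', 'from'))

header = list(itertools.takewhile(is_header, source))
print(header)  # ['# header\n', 'import os\n', 'from sys import argv\n']

# 'x = 1\n' was read by takewhile to detect the end, so it is not in `remaining`.
remaining = list(source)
print(remaining)  # ['import re\n']
```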

zip_longest

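Compared with the commonly used zip: this function takes any number of iterables and returns a zip_longest object (an iterator) that yields tuples of the elements at the same position in each iterable. Unlike zip, the number of tuples is set by the longest iterable, and the missing positions of the shorter ones are filled with the fillvalue argument; zip stops at the shortest iterable. It may come in handy in some special cases.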

print(itertools.zip_longest.__doc__)

'''
zip_longest(iter1 [,iter2 [...]], [fillvalue=None]) --> zip_longest object

Return a zip_longest object whose .__next__() method returns a tuple where
the i-th element comes from the i-th iterable argument.  The .__next__()
method continues until the longest iterable in the argument sequence
is exhausted and then it raises StopIteration.  When the shorter iterables
are exhausted, the fillvalue is substituted in their place.  The fillvalue
defaults to None or can be specified by a keyword argument.
'''

a, b = [1, 2, 3, 4], 'abc'

it1 = itertools.zip_longest(a, b)

while True:
    next(it1)    # (1, 'a'), (2, 'b'), (3, 'c'), (4, None), StopIteration

it2 = zip(a, b)

while True:
    next(it2)    # (1, 'a'), (2, 'b'), (3, 'c'), StopIteration
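
One such special case is splitting data into fixed-size chunks and padding the last one. The sketch below first shows a custom fillvalue, then a chunks helper (a hypothetical name) that passes the same iterator n times to zip_longest, in the style of the "grouper" recipe from the itertools documentation.

```python
import itertools

a, b = [1, 2, 3, 4], 'abc'

# A custom fillvalue replaces the default None.
padded = list(itertools.zip_longest(a, b, fillvalue='-'))
print(padded)  # [(1, 'a'), (2, 'b'), (3, 'c'), (4, '-')]

def chunks(iterable, n, fillvalue=None):
    """Split iterable into n-sized tuples, padding the last one."""
    # The same iterator object repeated n times: each tuple pulls n items from it.
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

chunked = list(chunks('ABCDEFG', 3, 'x'))
print(chunked)  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```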

Combinatoric generators

product

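Takes any number of iterables and returns a product object (an iterator) that yields their Cartesian product. With generator expressions available, it feels somewhat dispensable.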

print(itertools.product.__doc__)

'''
product(*iterables, repeat=1) --> product object

Cartesian product of input iterables.  Equivalent to nested for-loops.

For example, product(A, B) returns the same as:  ((x,y) for x in A for y in B).
The leftmost iterators are in the outermost for-loop, so the output tuples
cycle in a manner similar to an odometer (with the rightmost element changing
on every iteration).

To compute the product of an iterable with itself, specify the number
of repetitions with the optional repeat keyword argument. For example,
product(A, repeat=4) means the same as product(A, A, A, A).

product('ab', range(3)) --> ('a',0) ('a',1) ('a',2) ('b',0) ('b',1) ('b',2)
product((0,1), (0,1), (0,1)) --> (0,0,0) (0,0,1) (0,1,0) (0,1,1) (1,0,0) ...
'''

it = itertools.product([1, 2, 3], 'ab')

while True:
    next(it)    # (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b'), StopIteration
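
For a fixed number of inputs, product and a nested comprehension do give the same result. Where product may still earn its place is when the number of inputs is only known at run time, or when one input is repeated via the repeat argument. The sketch below illustrates both; the grid helper is a hypothetical name.

```python
import itertools

A, B = [1, 2], 'ab'

# Same result as a nested comprehension for a fixed number of inputs.
pairs = list(itertools.product(A, B))
nested = [(x, y) for x in A for y in B]
print(pairs == nested)  # True

# repeat=3 is the same as product((0, 1), (0, 1), (0, 1)).
bits = list(itertools.product((0, 1), repeat=3))
print(len(bits))  # 8
print(bits[:3])   # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]

def grid(*axes):
    """Cartesian product over however many axes the caller supplies."""
    return list(itertools.product(*axes))

print(len(grid(range(2), range(3), range(4))))  # 24
```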

permutations

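This one is a classic: it takes an iterable and an optional length r and returns a permutations object (an iterator) that yields the r-element permutations of the iterable. It is useful for exhaustive search and similar tasks.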

print(itertools.permutations.__doc__)

'''
permutations(iterable[, r]) --> permutations object

Return successive r-length permutations of elements in the iterable.

permutations(range(3), 2) --> (0,1), (0,2), (1,0), (1,2), (2,0), (2,1)
'''

# being lazy: just run the example from the docstring
it = itertools.permutations(range(3), 2)

while True:
    next(it)    # (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), StopIteration

combinations

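Similar to permutations: it takes an iterable and a length r and returns a combinations object (an iterator) that yields the r-element combinations of the iterable. The use cases are similar as well.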

print(itertools.combinations.__doc__)

'''
combinations(iterable, r) --> combinations object

Return successive r-length combinations of elements in the iterable.

combinations(range(4), 3) --> (0,1,2), (0,1,3), (0,2,3), (1,2,3)
'''

# being lazy again...
it = itertools.combinations(range(4), 3)

while True:
    next(it)    # (0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3), StopIteration
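
The relationship between permutations and combinations can be checked numerically: for n items taken r at a time there are n!/(n-r)! permutations and n!/(r!(n-r)!) combinations. The sketch below verifies this on a small made-up input and shows that, for sorted input, the combinations are exactly the permutations whose elements appear in sorted order.

```python
import itertools
from math import factorial

items = 'ABCD'
n, r = len(items), 2

perms = list(itertools.permutations(items, r))
combs = list(itertools.combinations(items, r))

print(len(perms))  # 12, i.e. n! / (n - r)!
print(len(combs))  # 6,  i.e. n! / (r! * (n - r)!)

# For sorted input, keeping only the permutations that are in sorted order
# reproduces the combinations.
in_order = [p for p in perms if p == tuple(sorted(p))]
print(in_order == combs)  # True
```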

combinations_with_replacement

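As the name suggests, it is similar to combinations: it takes an iterable and a length r and returns a combinations_with_replacement object (an iterator) that yields the r-element combinations of the iterable, where an element may be chosen more than once. It resembles sampling with replacement.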

print(itertools.combinations_with_replacement.__doc__)

'''
combinations_with_replacement(iterable, r) --> combinations_with_replacement object

Return successive r-length combinations of elements in the iterable
allowing individual elements to have successive repeats.
combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
'''

# lesson 5, johnny
it = itertools.combinations_with_replacement('ABC', 2)

while True:
    next(it)    # ('A', 'A'), ('A', 'B'), ('A', 'C'), ('B', 'B'), ('B', 'C'), ('C', 'C'), StopIteration