The Python Standard Library: itertools

Latest recommended article published on 2024-05-02 10:32:35

人望山丶鱼窥荷

Tags: python itertools

itertools provides a set of very useful functions for working with iterables.

Infinite iterators

count

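count(start=0, step=1) returns an infinite iterator of evenly spaced numbers, beginning at start and increasing by step on each iteration. Both arguments are optional: start defaults to 0 and step defaults to 1.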

>>> from itertools import count
 
>>> for i in zip(count(1), ['a', 'b', 'c']):
...     print(i, end=' ')
...
(1, 'a') (2, 'b') (3, 'c')
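
The step argument in the signature above can be illustrated with a small sketch. Because count never ends on its own, islice is used here to take only the first few values; the variable names are illustrative only.

```python
from itertools import count, islice

# Take the first five values of an endless arithmetic sequence.
evens_from_10 = list(islice(count(10, 2), 5))
print(evens_from_10)  # [10, 12, 14, 16, 18]

# A negative step counts downwards.
countdown = list(islice(count(3, -1), 4))
print(countdown)  # [3, 2, 1, 0]
```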

cycle

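cycle(iterable) repeats the elements of the given iterable endlessly. It takes no second argument; to limit the number of repetitions, pair it with something finite, such as zip with a range (as below) or islice.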

>>> from itertools import cycle
 
>>> for i in zip(range(6), cycle(['a', 'b', 'c'])):
...     print(i, end=' ')
...
(0, 'a') (1, 'b') (2, 'c') (3, 'a') (4, 'b') (5, 'c')
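
Since cycle takes just the one iterable, a bounded number of repetitions has to come from another tool. The following is a minimal sketch of two possible approaches, not the only ones; the names first_seven and twice are illustrative.

```python
from itertools import chain, cycle, islice, repeat

letters = ['a', 'b', 'c']

# Option 1: cut the endless cycle off after a fixed number of items.
first_seven = list(islice(cycle(letters), 7))
print(first_seven)  # ['a', 'b', 'c', 'a', 'b', 'c', 'a']

# Option 2: repeat the whole sequence exactly twice, then flatten it.
twice = list(chain.from_iterable(repeat(letters, 2)))
print(twice)  # ['a', 'b', 'c', 'a', 'b', 'c']
```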

repeat

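repeat(object[, times]) returns an iterator that yields the same object over and over. By default it never ends; pass the optional second argument, times, to limit the number of repetitions.

>>> from itertools import count, repeat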
 
>>> for i, s in zip(count(1), repeat('over-and-over', 5)):
...     print(i, s)
...
1 over-and-over
2 over-and-over
3 over-and-over
4 over-and-over
5 over-and-over
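
A common use of repeat without times is to supply a constant argument to map. A small sketch, assuming we want the squares of a few integers; map stops when the shortest input (the range) runs out, so the endless repeat is harmless here.

```python
from itertools import repeat

# pow(0, 2), pow(1, 2), ... : repeat(2) supplies the exponent every time.
squares = list(map(pow, range(5), repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```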

Iterators terminating on the shortest input sequence

accumulate

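accumulate(iterable[, func]) returns an iterator of accumulated results: each output is func applied to the previous result and the next element. If func is omitted, addition is used, which gives running totals.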

>>> from itertools import accumulate
>>> import operator
 
>>> list(accumulate([1, 2, 3, 4, 5], operator.add))
[1, 3, 6, 10, 15]
 
>>> list(accumulate([1, 2, 3, 4, 5], operator.mul))
[1, 2, 6, 24, 120]
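
Any two-argument function can serve as func. As a sketch, the built-in max yields a running maximum, and leaving func out yields running totals; the sample data is made up for illustration.

```python
from itertools import accumulate

data = [3, 1, 4, 1, 5, 9, 2, 6]

# With no func, accumulate adds, giving running totals.
running_total = list(accumulate(data))
print(running_total)  # [3, 4, 8, 9, 14, 23, 25, 31]

# With max, each output is the largest value seen so far.
running_max = list(accumulate(data, max))
print(running_max)  # [3, 3, 4, 4, 5, 9, 9, 9]
```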

chain

itertools.chain(*iterables) combines several iterables into a single iterator, yielding the elements of each one in turn.

>>> from itertools import chain
 
>>> list(chain([1, 2, 3], ['a', 'b', 'c']))
[1, 2, 3, 'a', 'b', 'c']

chain is roughly equivalent to:

def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element

chain.from_iterable

chain.from_iterable(iterable) is similar to chain, but takes a single iterable whose elements are themselves iterables, and chains those together into one iterator.

>>> from itertools import chain
 
>>> list(chain.from_iterable([[1, 2, 3], ['a', 'b', 'c']]))
[1, 2, 3, 'a', 'b', 'c']

Its implementation is also similar to that of chain:

def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
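
One place where chain.from_iterable is handy is flattening a sequence of sequences whose length is not known in advance, including one produced lazily by a generator. A minimal sketch; words_per_line is a hypothetical helper written only for this example.

```python
from itertools import chain

rows = [[1, 2], [3], [], [4, 5, 6]]
flat = list(chain.from_iterable(rows))
print(flat)  # [1, 2, 3, 4, 5, 6]

def words_per_line(lines):
    # Hypothetical helper: yields one list of words per input line.
    for line in lines:
        yield line.split()

# The outer iterable may itself be a generator; it is consumed lazily.
words = list(chain.from_iterable(words_per_line(["a b", "c"])))
print(words)  # ['a', 'b', 'c']
```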

compress

compress(data, selectors) takes two iterables and yields only those elements of data whose corresponding element in selectors is true. It stops as soon as either data or selectors is exhausted.

>>> from itertools import compress
>>> list(compress([1, 2, 3, 4, 5], [True, True, False, False, True]))
[1, 2, 5]
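
The selectors do not have to be a literal list of booleans. In this sketch they are computed from a second, parallel list, and a final line shows the early stop when selectors is shorter than data. The names and scores are invented for illustration.

```python
from itertools import compress

names = ['ann', 'bob', 'cid', 'dee']
scores = [82, 47, 91, 60]

# Keep only the names whose matching score is at least 60.
passed = list(compress(names, (s >= 60 for s in scores)))
print(passed)  # ['ann', 'cid', 'dee']

# selectors runs out after three items, so 'D', 'E', 'F' are never considered.
truncated = list(compress('ABCDEF', [1, 0, 1]))
print(truncated)  # ['A', 'C']
```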

zip_longest

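zip_longest(*iterables, fillvalue=None) is similar to zip. The difference is that zip stops as soon as the shortest iterable is exhausted, whereas zip_longest continues until the longest one is exhausted, filling the missing positions with fillvalue. The following example shows the difference: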

from itertools import zip_longest
r1 = range(3)
r2 = range(2)
print('zip stops early:')
print(list(zip(r1, r2)))
r1 = range(3)
r2 = range(2)
print('\nzip_longest processes all of the values:')
print(list(zip_longest(r1, r2)))

Output:

zip stops early:
[(0, 0), (1, 1)]

zip_longest processes all of the values:
[(0, 0), (1, 1), (2, None)]
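
The fillvalue argument replaces the default None padding. A short sketch with made-up inputs:

```python
from itertools import zip_longest

# The second iterable has only one item; the gaps are padded with 0.
pairs = list(zip_longest('abc', [1], fillvalue=0))
print(pairs)  # [('a', 1), ('b', 0), ('c', 0)]
```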

islice

islice(iterable, stop) or islice(iterable, start, stop[, step]) works much like slicing a Python string or list, except that start, stop, and step cannot be negative.

>>> from itertools import islice
>>> for i in islice(range(100), 0, 100, 10):
...     print(i, end=' ')
...
0 10 20 30 40 50 60 70 80 90
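
Unlike ordinary slicing, islice also works on iterators that cannot be indexed, such as generators, including endless ones. A minimal sketch; the squares generator is defined here only for illustration.

```python
from itertools import count, islice

def squares():
    # An endless generator: 0, 1, 4, 9, 16, ...
    for n in count():
        yield n * n

# One-argument form: just a stop value.
first_five = list(islice(squares(), 5))
print(first_five)  # [0, 1, 4, 9, 16]

# start=2, stop=10, step=3 selects positions 2, 5 and 8.
every_third = list(islice(squares(), 2, 10, 3))
print(every_third)  # [4, 25, 64]
```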

tee

tee(iterable, n=2) returns n independent iterators over the same iterable; n defaults to 2.

from itertools import count, islice, tee
r = islice(count(), 5)
i1, i2 = tee(r)
print('i1:', list(i1))
print('i2:', list(i2))
for i in r:
    print(i, end=' ')
    if i > 1:
        break

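The output is shown below. Note that after tee(r), the original iterator r should no longer be used directly: here i1 and i2 have already consumed it, so the for loop prints nothing: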

i1: [0, 1, 2, 3, 4]
i2: [0, 1, 2, 3, 4]
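
A typical use of tee is to walk the same stream at two different offsets. The sketch below builds overlapping pairs with a local helper named pairwise; Python 3.10 and later ship an equivalent itertools.pairwise, so this is shown only to illustrate tee.

```python
from itertools import tee

def pairwise(iterable):
    # Local helper for illustration: s -> (s0, s1), (s1, s2), (s2, s3), ...
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one element
    return zip(a, b)

pairs = list(pairwise([1, 2, 4, 7]))
print(pairs)  # [(1, 2), (2, 4), (4, 7)]
```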

starmap

starmap(func, iterable) assumes that iterable yields a stream of tuples and calls func with each tuple unpacked as its arguments:

>>> from itertools import starmap
>>> import os
>>> iterator = starmap(os.path.join,
...                    [('/bin', 'python'), ('/usr', 'bin', 'java'),
...                    ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')])
>>> list(iterator)
['/bin/python', '/usr/bin/java', '/usr/bin/perl', '/usr/bin/ruby']
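
Put differently, starmap(func, seq) calls func(*args) for each args in seq. A small sketch comparing it with the equivalent list comprehension, using the built-in pow and made-up arguments:

```python
from itertools import starmap

args = [(2, 5), (3, 2), (10, 3)]

powers = list(starmap(pow, args))
print(powers)  # [32, 9, 1000]

# The same thing written with explicit unpacking.
same = [pow(*t) for t in args]
```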

filterfalse

filterfalse(predicate, iterable) is the opposite of filter(): it returns every element for which predicate returns false.

def is_even(x):
    return x % 2 == 0
itertools.filterfalse(is_even, itertools.count()) =>
1, 3, 5, 7, 9, 11, 13, 15, ...
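
A runnable sketch on a finite range, splitting the numbers into two lists with filter and filterfalse. The last line shows that passing None as the predicate keeps the elements that are themselves false.

```python
from itertools import filterfalse

def is_even(x):
    return x % 2 == 0

numbers = range(10)
evens = list(filter(is_even, numbers))
odds = list(filterfalse(is_even, numbers))
print(evens)  # [0, 2, 4, 6, 8]
print(odds)   # [1, 3, 5, 7, 9]

# With predicate=None, the falsy items are returned.
falsy = list(filterfalse(None, [0, 1, '', 2]))
print(falsy)  # [0, '']
```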

takewhile

takewhile(predicate, iterable) yields elements from iterable for as long as predicate returns true. As soon as predicate returns false, the iteration ends.

def less_than_10(x):
    return x < 10
itertools.takewhile(less_than_10, itertools.count())
=> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
itertools.takewhile(is_even, itertools.count())
=> 0

dropwhile

dropwhile(predicate, iterable) discards elements while predicate returns true, then yields every remaining element without testing it:

itertools.dropwhile(less_than_10, itertools.count())
=> 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
itertools.dropwhile(is_even, itertools.count())
=> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
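
On finite data the two functions split a sequence at the first element that fails the predicate. A minimal sketch with invented readings; note that 4 and 2 would pass the test but come after the first failure, so takewhile never reaches them and dropwhile keeps them.

```python
from itertools import dropwhile, takewhile

def less_than_10(x):
    return x < 10

readings = [1, 3, 5, 12, 4, 2]

head = list(takewhile(less_than_10, readings))
tail = list(dropwhile(less_than_10, readings))
print(head)  # [1, 3, 5]
print(tail)  # [12, 4, 2]
```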

groupby

groupby(iterable, key=None) collects consecutive elements that share the same key into groups. Note: the input sequence needs to be sorted on the key value in order for the groupings to work out as expected.

[k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
[list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D

>>> import itertools
>>> for key, group in itertools.groupby('AAAABBBCCDAABBB'):
...     print(key, list(group))
...
A ['A', 'A', 'A', 'A']
B ['B', 'B', 'B']
C ['C', 'C']
D ['D']
A ['A', 'A']
B ['B', 'B', 'B']

city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
             ('Anchorage', 'AK'), ('Nome', 'AK'),
             ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
             ...
            ]
def get_state(city_state):
    return city_state[1]
itertools.groupby(city_list, get_state) =>
  ('AL', iterator-1),
  ('AK', iterator-2),
  ('AZ', iterator-3), ...
iterator-1 =>  ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
iterator-2 => ('Anchorage', 'AK'), ('Nome', 'AK')
iterator-3 => ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
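
When the input is not already ordered by key, sort it with the same key function before grouping. The sketch below uses a small, deliberately shuffled sample of city and state pairs (illustrative data, not the full list above) and collects each group into a dict.

```python
from itertools import groupby
from operator import itemgetter

# Illustrative, unsorted sample data.
city_list = [('Decatur', 'AL'), ('Anchorage', 'AK'),
             ('Huntsville', 'AL'), ('Nome', 'AK')]

get_state = itemgetter(1)

# Sort by the grouping key first, otherwise 'AL' would form two groups.
by_state = sorted(city_list, key=get_state)

grouped = {state: [city for city, _ in group]
           for state, group in groupby(by_state, key=get_state)}
print(grouped)  # {'AK': ['Anchorage', 'Nome'], 'AL': ['Decatur', 'Huntsville']}
```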

Combinatoric generators

product

product(*iterables, repeat=1)

product(A, B) returns the same as ((x,y) for x in A for y in B)
product(A, repeat=4) means the same as product(A, A, A, A)

from itertools import product
def show(iterable):
    for i, item in enumerate(iterable, 1):
        print(item, end=' ')
        if (i % 3) == 0:
            print()
    print()
print('Repeat 2:\n')
show(product(range(3), repeat=2))
print('Repeat 3:\n')
show(product(range(3), repeat=3))

Output:

Repeat 2:
(0, 0) (0, 1) (0, 2)
(1, 0) (1, 1) (1, 2)
(2, 0) (2, 1) (2, 2)
Repeat 3:
(0, 0, 0) (0, 0, 1) (0, 0, 2)
(0, 1, 0) (0, 1, 1) (0, 1, 2)
(0, 2, 0) (0, 2, 1) (0, 2, 2)
(1, 0, 0) (1, 0, 1) (1, 0, 2)
(1, 1, 0) (1, 1, 1) (1, 1, 2)
(1, 2, 0) (1, 2, 1) (1, 2, 2)
(2, 0, 0) (2, 0, 1) (2, 0, 2)
(2, 1, 0) (2, 1, 1) (2, 1, 2)
(2, 2, 0) (2, 2, 1) (2, 2, 2)
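
The equivalence between product(A, B) and a nested for loop can be checked directly. A small sketch with two different iterables; sizes and colors are made-up sample values.

```python
from itertools import product

sizes = ['S', 'M']
colors = ['red', 'blue', 'green']

combos = list(product(sizes, colors))

# The same pairs, in the same order, from a nested comprehension.
nested = [(s, c) for s in sizes for c in colors]

print(combos == nested)  # True
print(len(combos))       # 6
```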

permutations

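permutations(iterable, r=None) returns all possible orderings of length r of the elements of iterable. If r is omitted, it defaults to the length of the iterable, so full-length permutations are produced: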

from itertools import permutations
def show(iterable):
    first = None
    for i, item in enumerate(iterable, 1):
        if first != item[0]:
            if first is not None:
                print()
            first = item[0]
        print(''.join(item), end=' ')
    print()
print('All permutations:\n')
show(permutations('abcd'))
print('\nPairs:\n')
show(permutations('abcd', r=2))

Output:

All permutations:
abcd abdc acbd acdb adbc adcb
bacd badc bcad bcda bdac bdca
cabd cadb cbad cbda cdab cdba
dabc dacb dbac dbca dcab dcba
Pairs:
ab ac ad
ba bc bd
ca cb cd
da db dc

combinations

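combinations(iterable, r) returns an iterator over all r-length tuples that can be formed from the elements of iterable. The elements within each tuple keep the order in which iterable produced them. In the example below, unlike permutations above, a always comes before b, c, and d; b always comes before c and d; and c always comes before d.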

from itertools import combinations
def show(iterable):
    first = None
    for i, item in enumerate(iterable, 1):
        if first != item[0]:
            if first is not None:
                print()
            first = item[0]
        print(''.join(item), end=' ')
    print()
print('Unique pairs:\n')
show(combinations('abcd', r=2))

Output:

Unique pairs:
ab ac ad
bc bd
cd

combinations_with_replacement

combinations_with_replacement(iterable, r) relaxes a different constraint: an element may be repeated within a single tuple, so combinations such as aa, bb, cc, and dd can appear:

from itertools import combinations_with_replacement
def show(iterable):
    first = None
    for i, item in enumerate(iterable, 1):
        if first != item[0]:
            if first is not None:
                print()
            first = item[0]
        print(''.join(item), end=' ')
    print()
print('Unique pairs:\n')
show(combinations_with_replacement('abcd', r=2))

Output:

Unique pairs:
aa ab ac ad
bb bc bd
cc cd
dd
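
As a cross-check on the three pair listings for 'abcd', the number of results from each function matches the usual counting formulas. A minimal sketch, assuming Python 3.8 or later for math.perm and math.comb:

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations)
from math import comb, perm

items, r = 'abcd', 2
n = len(items)

n_perm = len(list(permutations(items, r)))
n_comb = len(list(combinations(items, r)))
n_cwr = len(list(combinations_with_replacement(items, r)))

print(n_perm, perm(n, r))         # 12 12  -> n! / (n - r)!
print(n_comb, comb(n, r))         # 6 6    -> n! / (r! * (n - r)!)
print(n_cwr, comb(n + r - 1, r))  # 10 10  -> (n + r - 1)! / (r! * (n - 1)!)
```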
