The Python itertools Module

Yake1965

Last modified on 2022-10-09 18:47:12

First published on 2022-10-09 18:24:02

Article link: https://blog.csdn.net/weixin_43955170/article/details/127230891

Python itertools

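This module standardizes a core set of fast, memory-efficient tools. Together they form an "iterator algebra" that makes it possible to build succinct and efficient specialized tools in pure Python.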

Infinite iterators

count()

itertools.count(start=0, step=1)

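Make an iterator that returns evenly spaced values starting with start. It is often used as an argument to map() to generate consecutive data points, and with zip() to add sequence numbers. Roughly equivalent to: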

def count(start=0, step=1):
    # count(10) --> 10 11 12 13 14 ...
    # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step

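When counting with floating-point numbers, better accuracy can sometimes be achieved by substituting multiplicative code such as: (start + step * i for i in count()).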
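As a small illustration of the uses described above, the sketch below numbers items with zip() and compares additive with multiplicative float stepping. The names numbered, additive, and multiplicative are illustrative only, and islice() is used here just to cut the infinite iterators short.

```python
from itertools import count, islice

# Add sequence numbers to items by zipping them with an infinite counter.
numbered = list(zip(count(1), ["apple", "banana", "cherry"]))
print(numbered)  # [(1, 'apple'), (2, 'banana'), (3, 'cherry')]

# Additive stepping: each value is the previous value plus the step,
# so floating-point rounding errors can build up from step to step.
additive = list(islice(count(0.0, 0.1), 11))

# Multiplicative stepping: each value is computed from an integer index,
# so rounding errors do not accumulate across steps.
multiplicative = list(islice((0.0 + 0.1 * i for i in count()), 11))

print(additive[-1], multiplicative[-1])
```

The last printed pair is where the two approaches are most likely to differ, since the additive version has gone through ten successive additions by then.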

cycle()

itertools.cycle(iterable)

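Make an iterator that returns the elements of iterable and saves a copy of each. When the iterable is exhausted, it returns the elements from the saved copy, and repeats indefinitely. Roughly equivalent to: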

def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element

Note that this function may require significant auxiliary storage (depending on the length of the iterable).
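Because cycle() never ends on its own, it is usually paired with something that does. The sketch below shows one possible use, a round-robin assignment; the worker and task names are made up for illustration.

```python
from itertools import cycle, islice

workers = ["alice", "bob", "carol"]   # hypothetical worker names
tasks = ["t1", "t2", "t3", "t4", "t5", "t6", "t7"]

# zip() stops at the shorter input, so the infinite cycle is cut off
# once the tasks run out.
assignment = list(zip(tasks, cycle(workers)))
print(assignment[3])  # ('t4', 'alice')

# islice() is another way to take a finite prefix of the cycle.
prefix = list(islice(cycle("AB"), 5))
print(prefix)  # ['A', 'B', 'A', 'B', 'A']
```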

repeat()

itertools.repeat(object[, times])

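Make an iterator that returns object over and over again. It runs indefinitely unless the times argument is specified. It can be used as an argument to map() to supply an invariant parameter to the called function, and with zip() to create an invariant part of a tuple record. Roughly equivalent to: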

def repeat(object, times=None):
    # repeat(10, 3) --> 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object

A common use of repeat is to supply a stream of constant values to map or zip:

>>> list(map(pow, range(10), repeat(2)))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Iterators terminating on the shortest input sequence

accumulate()

itertools.accumulate(iterable[, func, *, initial=None])

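Make an iterator that returns accumulated sums, or accumulated results of other binary functions (specified via the optional func argument).

If func is supplied, it should be a function of two arguments. Elements of the input iterable may be any type that can be accepted as arguments to func. (For example, with the default operation of addition, elements may be any addable type, including Decimal or Fraction.)

Usually, the number of elements output matches the input iterable. However, if the keyword argument initial is provided, the accumulation starts with the initial value, so the output has one more element than the input iterable. Roughly equivalent to: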

def accumulate(iterable, func=operator.add, *, initial=None):
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    # accumulate([1,2,3,4,5], initial=100) --> 100 101 103 106 110 115
    # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
    it = iter(iterable)
    total = initial
    if initial is None:
        try:
            total = next(it)
        except StopIteration:
            return
    yield total
    for element in it:
        total = func(total, element)
        yield total

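There are a number of uses for the func argument. It can be set to min() for a running minimum, max() for a running maximum, or operator.mul() for a running product. Amortization tables can be built by accumulating interest and applying payments. First-order recurrence relations can be modeled by supplying the initial value in the iterable and using only the accumulated total in the func argument.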

>>> from itertools import accumulate, repeat
>>> import operator
>>> data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
>>> list(accumulate(data, operator.mul))     # running product
[3, 12, 72, 144, 144, 1296, 0, 0, 0, 0]
>>> list(accumulate(data, max))              # running maximum
[3, 4, 6, 6, 6, 9, 9, 9, 9, 9]
# Amortize a 5% loan of 1000 with 4 annual payments of 90
>>> cashflows = [1000, -90, -90, -90, -90]
>>> list(accumulate(cashflows, lambda bal, pmt: bal*1.05 + pmt))
[1000, 960.0, 918.0, 873.9000000000001, 827.5950000000001]

# Chaotic recurrence relation https://en.wikipedia.org/wiki/Logistic_map
>>> logistic_map = lambda x, _:  r * x * (1 - x)
>>> r = 3.8
>>> x0 = 0.4
>>> inputs = repeat(x0, 36)     # only the initial value is used
>>> [format(x, '.2f') for x in accumulate(inputs, logistic_map)]
['0.40', '0.91', '0.30', '0.81', '0.60', '0.92', '0.29', '0.79', '0.63',
 '0.88', '0.39', '0.90', '0.33', '0.84', '0.52', '0.95', '0.18', '0.57',
 '0.93', '0.25', '0.71', '0.79', '0.63', '0.88', '0.39', '0.91', '0.32',
 '0.83', '0.54', '0.95', '0.20', '0.60', '0.91', '0.30', '0.80', '0.60']

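See functools.reduce() for a similar function that returns only the final accumulated value.

accumulate applies func to the elements of iterable one after another (addition by default).

Note: accumulate returns an iterator.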

>>> from itertools import accumulate
>>> import operator  # operator: standard operators as functions
>>> a = [1,2,3,4,5]
>>> b = accumulate(a)  # addition by default
>>> b   # an iterator object is returned
<itertools.accumulate object at 0x7f3e5c2f4e48>
>>> list(b)   # convert to a list
[1, 3, 6, 10, 15]

Switching to multiplication:

>>> c = accumulate(a, operator.mul)
>>> list(c)
[1, 2, 6, 24, 120]

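The Python functools.reduce() function

reduce() cumulatively combines the elements of a sequence. It was a built-in function in Python 2; in Python 3 it lives in the functools module.

It processes all the items of a collection (list, tuple, etc.) as follows: the two-argument function func is first applied to the first and second elements, the result is then combined with the third element using func, and so on until a single result remains.

Note: in Python 3.x, reduce() must be imported from the functools module: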

from functools import reduce

reduce(func, iterable[, initializer])
	func -- a function taking two arguments
	iterable -- an iterable object
	initializer -- optional initial value

Returns the result of the computation.

from functools import reduce

total = reduce(lambda x, y: x + y, [1,2,3,4,5])  # avoid shadowing the built-in sum()
print(total)
# 15

reduce vs accumulate

reduce

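The functools module is for higher-order functions: functions that act on or return other functions. For the purposes of this module, any callable object can be treated as a function.

reduce() applies a function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value.

functools.reduce(function, iterable[, initializer])

If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty.

If initializer is not given and iterable contains only one item, the first item is returned.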
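The rules for the initializer can be made concrete with a simplified stand-in. The function my_reduce below is a hypothetical illustration of the behavior described above, not the actual implementation in functools; it also shows the remaining case, an empty iterable without an initializer, which raises TypeError.

```python
from functools import reduce
import operator

def my_reduce(function, iterable, *initializer):
    """Simplified stand-in for functools.reduce (illustrative only)."""
    it = iter(iterable)
    if initializer:
        # The initializer is placed before the items of the iterable.
        value = initializer[0]
    else:
        try:
            # Without an initializer, the first item seeds the result.
            value = next(it)
        except StopIteration:
            raise TypeError("empty iterable with no initial value") from None
    for element in it:
        value = function(value, element)
    return value

print(my_reduce(operator.add, [1, 2, 3, 4, 5]))      # 15
print(my_reduce(operator.add, [1, 2, 3, 4, 5], 10))  # 25
print(my_reduce(operator.add, [], 10))               # 10
print(my_reduce(operator.add, [5]))                  # 5
```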

Example: reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).

Example 1: Find the product of the list elements using reduce()

from functools import reduce

l1 = [1,2,3,4,5]
reduce(lambda x, y:x*y, l1) # reduce(operator.mul, l1)
# Output:120

Example 2: Find the largest number in the iterable using reduce()

l2 = [15,12,30,4,5]
reduce(lambda x, y: x if x > y else y, l2) # reduce(max, l2)
# Output:30

Example 3: Using a user-defined function in reduce()

def sum1(x, y):
    return x + y

reduce(sum1, l2) # reduce(operator.add, l2)
# Output:66

Example 4: Initializer is mentioned.

If the optional initializer is present, it is placed before the items of the iterable in the calculation.

reduce(sum1,l1,10)
# Output:25

Example 5: Iterable contains only one item, reduce() will return that item.

l3 = [5]
reduce(sum1, l3)
# Output:5

l4 = [15]
reduce(lambda x, y:x if x > y else y, l4)
# Output:15

Example 6: If iterable is empty and the initializer is given, reduce() will return the initializer.

l5 = []
reduce(sum1,l5,10)
# Output:10

itertools.accumulate()

Makes an iterator that returns accumulated sums, or accumulated results of other binary functions (specified via the optional func argument). If func is supplied, it should be a function of two arguments. Elements of the input iterable may be any type that can be accepted as arguments to func. (Python documentation)

itertools.accumulate(iterable[, func, *, initial=None])

Example 1: By using itertools.accumulate(), we can find the running product of an iterable. The function argument is given as operator.mul.

It will return an iterator that yields all intermediate values. We can convert to list by using a list() constructor.

from itertools import accumulate
import operator  # operator: standard operators as functions

l1 = accumulate([1,2,3,4,5], operator.mul)
list(l1) # Output:[1, 2, 6, 24, 120]

Example 2: If the function parameter is not mentioned, by default it will perform an addition operation

It will return an iterator that yields all intermediate values. We can convert to list by using list() constructor.

import itertools
# using the add operator, so importing the operator module
import operator

l2 = itertools.accumulate([1,2,3,4,5])
# l2 is an iterator: <itertools.accumulate object at 0x02CD94C8>
# Converting the iterator to a list object.
list(l2) # Output:[1, 3, 6, 10, 15]

# using reduce() for the same function
from functools import reduce
reduce(operator.add, [1,2,3,4,5])
# Output:15

Example 3: Function argument is given as max(), to find a running maximum

l3 = accumulate([2,4,6,3,1], max)
list(l3) # Output:[2, 4, 6, 6, 6]

Example 4: If the initial value is mentioned, it will start accumulating from the initial value.

# If the initial parameter is given, accumulation starts from the initial value.
# The output then contains one more element than the input iterable.
l4 = accumulate([1,2,3,4,5], operator.add, initial=10)
list(l4) # Output:[10, 11, 13, 16, 20, 25]

Example 5: If the iterable is empty and the initial parameter is mentioned, it will return the initial value.

l5 = accumulate([], operator.add, initial=10)
list(l5) # Output:[10]

Example 6: If iterable contains one element and the initial parameter is not mentioned, it will return that element.

l6 = accumulate([5],lambda x, y:x+y)
list(l6) # Output:[5]

Example 7: Iterating through the iterator using for loop

Return type is an iterator. We can iterate through iterator using for loop also.

l2 = accumulate([5,6,7],lambda x,y:x+y)
for i in l2:
    print(i)
# Output:
# 5
# 11
# 18

Differences between reduce() and accumulate()

Conclusion:
reduce() function is supported by the functools module.
accumulate() function is supported by the itertools module.
reduce() will return only an accumulated value.
accumulate() will return the running accumulated value. Like we can find running max, running product, running sum of an iterable using the accumulate() function.
accumulate() returns an iterator.
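The relationship between the two functions can be checked directly. The following minimal sketch, with an arbitrary sample list, shows that the last value produced by accumulate() matches the single value returned by reduce() for the same function and input.

```python
from functools import reduce
from itertools import accumulate
import operator

data = [3, 4, 6, 2, 1, 9]

running = list(accumulate(data, operator.mul))   # every intermediate product
final = reduce(operator.mul, data)               # only the last product

print(running)               # [3, 12, 72, 144, 144, 1296]
print(final)                 # 1296
print(running[-1] == final)  # True
```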


Source: https://codeburst.io/reduce-vs-accumulate-in-python-3ecee4ee8094

chain()

itertools.chain(*iterables)

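Make an iterator that returns the elements of the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. It is used for treating consecutive sequences as a single sequence. Roughly equivalent to: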

def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element

1. Flattening one level of nested iterables

from itertools import chain
a = [(1, 'a'), (2, 'b'), (3, 'c')]
b = [[1, 2], [3, 4], [5, 6]]
c = [{1, 2}, {3, 4}, {5, 6}]
for x in [a, b, c]:
    print(list(chain(*x)))
# [1, 'a', 2, 'b', 3, 'c']
# [1, 2, 3, 4, 5, 6]
# [1, 2, 3, 4, 5, 6]

2. Concatenating two sequences (similar to Python's built-in +)

from itertools import chain
a = [1, 2, 3]
b = [4, 5, 6]
print(a + b)
print(list(chain(*(a, b)))) # [1, 2, 3, 4, 5, 6]
c = ['a', 'b', 'c', 'd', 'e']
print(list(chain(*zip(a, c)))) # [1, 'a', 2, 'b', 3, 'c']

s = "abc"
t = "def"
print(s + t)
print(list(chain(*(s, t))))

2133. Check if Every Row and Column Contains All Numbers

LeetCode

from itertools import chain
from typing import List

class Solution:
    def checkValid(self, matrix: List[List[int]]) -> bool:
        n = len(matrix)
        x = set(range(1, n + 1))
        col = list(zip(*matrix)) # zip transposes the matrix
        for i in range(n): 
            if set(matrix[i]) != x: return False
            # if set(row[i] for row in matrix) != x: return False
            if set(col[i]) != x: return False
        
        return True

        # return all(set(range(1, len(matrix) + 1)) == set(row) for row in chain(matrix, zip(*matrix)))

74. Search a 2D Matrix

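from bisect import bisect_left, bisect_right
from itertools import chain
from typing import List

class Solution:
    def searchMatrix(self, matrix: List[List[int]], target: int) -> bool:
        # Approach 1: two binary searches (first the row, then within the row)
        l, r = 0, len(matrix) - 1
        while l < r:
            mid = l + r + 1 >> 1
            if matrix[mid][0] < target: l = mid
            elif matrix[mid][0] > target: r = mid - 1
            else: return True

        mat = matrix[r]  # the only row that can contain target
        i = bisect_left(mat, target)
        return i < len(mat) and mat[i] == target

        # Approach 2 (alternative): flatten 2D into 1D with chain
        # mat = list(chain(*matrix))
        # i = bisect_left(mat, target)  # leftmost insertion point
        # return i < len(mat) and mat[i] == target

        # Approach 3 (alternative): same flat list, rightmost insertion point
        # i = bisect_right(mat, target) - 1
        # return i >= 0 and mat[i] == target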

classmethod chain.from_iterable(iterable)

Alternate constructor for chain(). It gets the chained inputs from a single iterable argument that is evaluated lazily. Roughly equivalent to:

def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
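The lazy evaluation matters when the outer iterable is large or unbounded. In the sketch below, blocks() is a hypothetical endless generator of small lists, used only to show that chain.from_iterable() can consume such a source while the argument-unpacking form cannot.

```python
from itertools import chain, count, islice

def blocks():
    """Hypothetical endless source of small lists: [0, 1], [2, 3], ..."""
    for start in count(0, 2):
        yield [start, start + 1]

# chain(*blocks()) would try to unpack the endless generator and never return.
# chain.from_iterable() pulls one inner list at a time instead.
flat = chain.from_iterable(blocks())
first_five = list(islice(flat, 5))
print(first_five)  # [0, 1, 2, 3, 4]
```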

compress()

itertools.compress(data, selectors)

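Make an iterator that returns the elements of data whose corresponding element in selectors evaluates to True. It stops when either data or selectors is exhausted. Roughly equivalent to: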

def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    return (d for d, s in zip(data, selectors) if s)
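A short usage sketch follows, assuming two made-up parallel lists (names and scores). It builds the selectors from a condition and also shows the output stopping at the shorter input.

```python
from itertools import compress

names = ["ann", "ben", "cat", "dan"]   # hypothetical data
scores = [72, 45, 90, 58]

# Build the selectors from a condition on a parallel sequence.
passed = list(compress(names, (s >= 60 for s in scores)))
print(passed)  # ['ann', 'cat']

# Selectors shorter than data: the output stops with the shorter input.
short = list(compress("ABCDEF", [1, 0, 1]))
print(short)  # ['A', 'C']
```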

dropwhile()

filterfalse()

groupby()

islice()

pairwise()

itertools.pairwise(iterable)

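Return successive overlapping pairs taken from the input iterable.
The number of 2-tuples in the output iterator will be one fewer than the number of inputs. It will be empty if the input iterable has fewer than two values. Roughly equivalent to: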

def pairwise(iterable):
    # pairwise('ABCDEFG') --> AB BC CD DE EF FG
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)
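Two typical uses are computing differences between neighbours and checking order. The sketch below reuses the tee-based recipe so that it also runs on interpreters older than Python 3.10, where itertools.pairwise was added; the readings list is made up for illustration.

```python
from itertools import tee

def pairwise(iterable):
    # Same recipe as above; itertools.pairwise itself needs Python 3.10+.
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

readings = [3, 7, 8, 12]   # hypothetical measurements

# Difference between each value and its predecessor.
deltas = [b - a for a, b in pairwise(readings)]
print(deltas)  # [4, 1, 4]

# A sequence is non-decreasing if every adjacent pair is in order.
is_sorted = all(a <= b for a, b in pairwise(readings))
print(is_sorted)  # True

# Fewer than two values: the result is empty.
print(list(pairwise([42])))  # []
```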

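Exercise: find your other half (a puzzle that shares only its name with itertools.pairwise)

They say good programmers excel at object-oriented programming, yet often cannot find a partner. Why? Because you keep confining yourself to being a programmer and never open up your thinking.

This is the age of communities, and here you should find a partner who shares your values yet complements you.

For example: if your programming ability is rated 11 and the ideal couple adds up to 20, you should look for a girl with strong design skills rated 9.

So when you meet a girl whose design ability is rated 9, do not hesitate; go ahead and tell her. Never assume that a later melon will be sweeter than an earlier one.

Example: given the ability array [7,9,11,13,15] and an ideal combined value of 20, only two pairs qualify: 7 + 13 and 9 + 11. In the array, 7 has index 0, 13 has index 3, 9 has index 1, and 11 has index 2.

So pairwise([7,9,11,13,15], 20) should return 0 + 3 + 1 + 2, that is, 6.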

Requirements

pairwise([1, 4, 2, 3, 0, 5], 7) should return 11.
pairwise([1, 3, 2, 4], 4) should return 1.
pairwise([1, 1, 1], 2) should return 1.
pairwise([0, 0, 0, 0, 1, 1], 1) should return 10.
pairwise([], 100) should return 0.

Code

def pairwise(arr, arg):
    arr = list(arr)  # work on a copy so the caller's list is not modified
    ans, n = 0, len(arr)
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] + arr[j] == arg:
                ans += i + j
                arr[j] = float("inf")  # mark index j as used
                break
    return ans

pairwise([1,4,2,3,0,5], 7)  # 11

starmap()

takewhile()

tee()

zip_longest()

itertools.zip_longest(*iterables, fillvalue=None)

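Make an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue. Iteration continues until the longest iterable is exhausted. Roughly equivalent to: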

def zip_longest(*args, fillvalue=None):
    # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    iterators = [iter(it) for it in args]
    num_active = len(iterators)
    if not num_active:
        return
    while True:
        values = []
        for i, it in enumerate(iterators):
            try:
                value = next(it)
            except StopIteration:
                num_active -= 1
                if not num_active:
                    return
                iterators[i] = repeat(fillvalue)
                value = fillvalue
            values.append(value)
        yield tuple(values)
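The difference from the built-in zip() is easiest to see side by side. In this small sketch the names and scores lists are made up, and 0 is an arbitrary choice of fillvalue.

```python
from itertools import zip_longest

names = ["ann", "ben", "cat"]   # hypothetical data
scores = [72, 45]

# zip() stops at the shortest input and silently drops "cat".
short = list(zip(names, scores))
print(short)  # [('ann', 72), ('ben', 45)]

# zip_longest() keeps going and pads the missing score with fillvalue.
padded = list(zip_longest(names, scores, fillvalue=0))
print(padded)  # [('ann', 72), ('ben', 45), ('cat', 0)]
```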