Python itertools module: count, cycle, repeat, chain, groupby, accumulate, product, permutations, combinations, and more

Part 1.

A.

Overview: commonly used tools in the itertools module
Infinite iterators
count()  cycle()  repeat()
Finite iterators
chain()  groupby()  accumulate()
Combinatoric iterators
product()  permutations()  combinations()  combinations_with_replacement()

B.

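itertools is a module in Python's standard library. It provides fast, memory-efficient building blocks for creating iterators used in loops. Note that its functions return iterators, not lists.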

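Infinite iterators

count()

count(start=0, step=1) creates an infinite iterator that counts up from start in increments of step and never stops on its own. To stop the loop below, press Ctrl+C.

import itertools

for i in itertools.count(9):
    print(i)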
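An infinite iterator does not have to be interrupted by hand. As a minimal sketch (the variable names are illustrative), the counter can be bounded with itertools.islice() or with an explicit break condition:

```python
import itertools

# Take only the first five values of the infinite counter.
first_five = list(itertools.islice(itertools.count(9), 5))
print(first_five)  # [9, 10, 11, 12, 13]

# Stop on a condition instead of after a fixed number of items.
below_12 = []
for i in itertools.count(9):
    if i >= 12:
        break
    below_12.append(i)
print(below_12)  # [9, 10, 11]
```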

cycle()

cycle(p) repeats the elements of the sequence passed in, over and over, without end.

for i in itertools.cycle('abcdf'):
    print(i)

repeat()

repeat(elem[, n]) yields the same element again and again, endlessly if n is omitted.

for i in itertools.repeat('abcdef'):
    print(i)

A second argument limits the number of repetitions.

for i in itertools.repeat('a', 3):
    print(i)

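Finite iterators

There are more than ten finite iterators, so only a few commonly used ones are listed here.

chain()

chain(p, q, ...) joins several iterables end to end into one longer iterator.

for i in itertools.chain('abc', 'efgh'):
    print(i)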

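groupby()

groupby(iterable, key=None) groups adjacent elements of the iterable that share the same key value. When key is None, adjacent equal elements are grouped together. Each item it yields is a (key, group) pair, where group is itself an iterator over the elements of that run.

for i in itertools.groupby('aaaaaaaaaassssssdddddddddssssswww'):
    print(i)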
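Printing the pairs directly shows only the key and an opaque group object. A minimal sketch of how the groups might be consumed, here as a run-length encoding of a short illustrative string (not the one used above):

```python
import itertools

text = 'aaabbbccaa'  # illustrative input

# Consume each group immediately; a group is only valid until groupby() advances.
runs = [(key, len(list(group))) for key, group in itertools.groupby(text)]
print(runs)  # [('a', 3), ('b', 3), ('c', 2), ('a', 2)]
```

The trailing 'aa' forms its own group because groupby() only merges adjacent elements.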

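accumulate()

accumulate(iterable[, func]) yields running results. If func is not given, it defaults to addition and produces running sums.

for i in itertools.accumulate([1, 2, 0, 4, 5, 3, 2, 8]):
    print(i)

Passing max as func yields the running maximum instead.

for i in itertools.accumulate([1, 2, 0, 4, 5, 3, 2, 8], max):
    print(i)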

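Combinatoric iterators
Combinatorial operations come up often in algorithms, and this is where the convenience of itertools shows.
The four combinatoric iterators are used as follows.

product()
product(p, q, ...[, repeat=1]) yields the Cartesian product of the input iterables. The same result can be written with nested for loops; for example, product(a, b) is equivalent to

((i, j) for i in a for j in b)

# Cartesian product
for i in itertools.product('abc', 'def'):
    print(i)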

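permutations()

permutations(p[, r]) yields every possible ordering of r elements taken from p. No element position is used more than once within a single permutation.

for i in itertools.permutations('wer', 3):
    print(i)

The second argument defaults to the length of the input sequence.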

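combinations()

combinations(p, r) yields every combination of r elements from p. Within each tuple the elements keep their input order, so the same selection never appears twice in a different order, and no element position is reused within a combination.

for i in itertools.combinations('werf', 3):
    print(i)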

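combinations_with_replacement()

combinations_with_replacement(p, r) yields every combination of r elements from p. Within each tuple the elements keep their input order, and an element may be chosen more than once.

for i in itertools.combinations_with_replacement('werf', 3):
    print(i)

End of Part 1.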

------------------------------------------------

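1. Introduction

Python's itertools module is powerful and practical. This part gives a brief summary of its tools with simple usage examples, intended as a quick reference. Import the module before use, in any of the following ways.

import itertools
import itertools as it
from itertools import permutations, product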


2. Infinite iterators


2.1 count(start=0, step=1)
Make an iterator that returns evenly spaced values starting with number start. Often used as an argument to map() to generate consecutive data points. Also, used with zip() to add sequence numbers. 

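Creates an iterator that yields an evenly spaced sequence (an arithmetic progression); start sets the first value and step sets the increment. start and step are not limited to integers; floats work as well. Note that count() is an infinite iterator, so a loop over it needs a separate condition to terminate.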

print('count example 1 ...')
k = 0
for item in it.count(10.7, 1.1):
    k += 1
    if k == 10:
        break
    print('k={0}, item={1}'.format(k, item))

Output:

count example 1 ...
k=1, item=10.7
k=2, item=11.799999999999999
......
k=6, item=16.2
......
k=9, item=19.500000000000004
When counting with floating point numbers, better accuracy can sometimes be achieved by substituting multiplicative code such as: (start + step * i for i in count()).

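As the example above shows, floating-point rounding error builds up when step is a float. Computing each value as (start + step * i for i in count()) may give better accuracy, as in the following example.

print('count example 2 ...')
for k in it.count(0):
    item = 10.7 + 1.1 * k
    if k == 10:
        break
    print('k={0}, item={1}'.format(k, item))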
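To see how the two approaches compare, one option is to measure each against an exact reference. The sketch below is only illustrative: the sample size n and the helper max_error are assumptions, and fractions.Fraction is used to compute the exact values from the same float inputs.

```python
import itertools as it
from fractions import Fraction

start, step, n = 10.7, 1.1, 1000

# Additive form: count() keeps adding step, so rounding errors can build up.
additive = list(it.islice(it.count(start, step), n))
# Multiplicative form: each value is computed from scratch.
multiplicative = [start + step * i for i in range(n)]

def max_error(values):
    # Exact reference values, computed with rational arithmetic.
    exact = (Fraction(start) + Fraction(step) * i for i in range(n))
    return max(abs(Fraction(v) - e) for v, e in zip(values, exact))

print('additive      :', float(max_error(additive)))
print('multiplicative:', float(max_error(multiplicative)))
```

The printed numbers are the largest absolute deviations over the first n terms; the multiplicative form stays within about one rounding step of the exact value.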


2.2 cycle(iterable)
Make an iterator returning elements from the iterable and saving a copy of each. When the iterable is exhausted, return elements from the saved copy. Repeats indefinitely. 

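Creates an iterator that yields the elements of an iterable over and over in a cycle. The following example outputs A B C D E A B C D E, one element per line.

# cycle(iterable)
print('\ncycle example 1 ...')
k = 0
for item in it.cycle('ABCDE'):
    print('k={0}, item={1}'.format(k, item))
    k += 1
    if k == 10:
        break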
Note, this member of the toolkit may require significant auxiliary storage (depending on the length of the iterable).

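Note that the iterator stores a copy of each element it takes from the input iterable. After the first pass over the iterable, it yields elements from that saved copy in a loop, which is why it needs additional storage.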
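The saving behavior can be pictured with a small generator. This is a rough sketch under the description above, not the actual implementation; cycle_sketch is a hypothetical name.

```python
import itertools as it

def cycle_sketch(iterable):
    # Hypothetical helper: save each element on the first pass, then replay the copy.
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element

# A generator can be consumed only once, yet the cycle continues from the saved copy.
one_shot = (c for c in 'ABC')
first_seven = list(it.islice(cycle_sketch(one_shot), 7))
print(first_seven)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```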

2.3 repeat(object[, times])
Make an iterator that returns object over and over again. Runs indefinitely unless the times argument is specified. Used as argument to map() for invariant parameters to the called function. Also used with zip() to create an invariant part of a tuple record.

Creates an iterator that yields object repeatedly. It runs forever if times is not specified; otherwise it yields object exactly times times.

# repeat(object, times)
print('\nrepeat example 1 ...')
print('Important things are worth repeating three times!')
for item in it.repeat('Do not answer!', 3):
    print('{0}'.format(item))

A common use for repeat is to supply a stream of constant values to map or zip, as shown below:

print('\nrepeat example 2 ...')
print(list(map(pow, range(10), it.repeat(2))))


3. Iterators terminating on the shortest input sequence


3.1 accumulate(iterable[, func, *, initial=None])
Make an iterator that returns accumulated sums, or accumulated results of other binary functions (specified via the optional func argument).

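Creates an iterator that yields the partial sums of the input iterable.

print(list(it.accumulate([1,2,3,4,5])))
The statement above outputs [1, 3, 6, 10, 15], the running sum of the input [1,2,3,4,5]. This makes it easy to generate the partial sums of a series.

Addition is the default, but other operations are possible: func can be any function of two arguments, which receives the accumulated result so far and the next element. For example, a running product can be computed as follows:

import operator
print(list(it.accumulate([1,2,3,4,5], operator.mul)))
The statement above outputs [1, 2, 6, 24, 120].

By default, as shown above, the number of outputs equals the number of inputs. If an initial value is given through the initial parameter (available since Python 3.8), the first output is initial itself, followed by the running results, so there is one more output than input. For example:

print(list(it.accumulate([1,2,3,4,5],initial=10)))
The statement above returns [10, 11, 13, 16, 20, 25]. The first value is initial, followed by the running sum that starts from it.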
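func does not have to come from the operator module. As a small sketch with illustrative data, a built-in such as max or a lambda can be passed as well:

```python
import itertools as it

readings = [3, 1, 4, 1, 5, 9, 2, 6]  # illustrative data

# max as func yields the running maximum.
running_max = list(it.accumulate(readings, max))
print(running_max)  # [3, 3, 4, 4, 5, 9, 9, 9]

# A lambda receives (accumulated_result, next_element).
# Here the binary digits 1, 0, 1, 1 are folded into the numbers they spell so far.
bits = [1, 0, 1, 1]
prefixes = list(it.accumulate(bits, lambda acc, bit: acc * 2 + bit))
print(prefixes)  # [1, 2, 5, 11]
```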

3.2 chain(*iterables)
Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence. 

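Creates an iterator that takes several iterables as arguments, joins them end to end, and iterates over the result. For example:

# chain example
print(list(it.chain('ABCD','EFG','HIJ')))
Output: ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']

Note that, as in the previous examples, list(iterator) is used here to convert the iterator into a list.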

3.3 classmethod chain.from_iterable(iterable)
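chain.from_iterable() is an alternate constructor for chain() that takes its inputs from a single iterable of iterables, which it reads lazily. A minimal sketch, reusing the strings from the chain() example (the variable names are illustrative):

```python
import itertools as it

nested = ['ABCD', 'EFG', 'HIJ']

# chain(*nested) unpacks the outer list up front ...
unpacked = list(it.chain(*nested))
# ... while chain.from_iterable(nested) walks the outer iterable lazily.
flat = list(it.chain.from_iterable(nested))
print(flat)  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
```

Because the outer iterable is read lazily, from_iterable() can also be given a generator of iterables.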
3.4 compress(data, selectors)
Make an iterator that filters elements from data returning only those that have a corresponding element in selectors that evaluates to True. Stops when either the data or selectors iterables has been exhausted.

Filters data according to the elements of selectors, keeping only the data elements whose corresponding selector is true. The elements of selectors and data are paired one to one.

import itertools as it
print(list(it.compress('ABCDEFG', [0,0,1,0,0,1,1])))
Output: ['C', 'F', 'G']. Only the data elements that correspond to a 1 in selectors are selected.

If the two inputs have different lengths, iteration stops at the end of the shorter one, as shown below:

print(list(it.compress('ABCDEFG', [0,0,1,0,0,1]))) # ['C', 'F']

print(list(it.compress('BCDEFG', [0,0,1,0,0,1,1]))) # ['D', 'G']
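The behavior above, including the stop-at-the-shorter-input rule, can be approximated with zip() and a generator expression. This is a sketch only; compress_sketch is a hypothetical name.

```python
import itertools as it

def compress_sketch(data, selectors):
    # zip() stops at the shorter input, which gives the same stopping rule.
    return (d for d, s in zip(data, selectors) if s)

picked = list(compress_sketch('ABCDEFG', [0, 0, 1, 0, 0, 1, 1]))
print(picked)  # ['C', 'F', 'G']
```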
  

3.5 dropwhile()
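dropwhile(predicate, iterable) skips elements while the predicate is true and then yields every remaining element without testing it again. A minimal sketch with illustrative data:

```python
import itertools as it

data = [1, 4, 6, 3, 8]  # illustrative data

# 1 and 4 are dropped; from 6 onward everything is yielded, including 3.
tail = list(it.dropwhile(lambda x: x < 5, data))
print(tail)  # [6, 3, 8]
```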

3.6 filterfalse()
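filterfalse(predicate, iterable) is the complement of the built-in filter(): it yields the elements for which the predicate is false. If predicate is None, it yields the elements that are themselves false. A minimal sketch with illustrative data:

```python
import itertools as it

data = [1, 4, 6, 3, 8]  # illustrative data

# Keep the elements for which x < 5 is false.
not_small = list(it.filterfalse(lambda x: x < 5, data))
print(not_small)  # [6, 8]

# With None as the predicate, the falsy elements are kept.
falsy = list(it.filterfalse(None, [0, 1, '', 2]))
print(falsy)  # [0, '']
```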

3.7 groupby()
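groupby(iterable, key=None) starts a new group every time the key value changes, so the input usually needs to be sorted by the same key first. A minimal sketch with illustrative data and a hypothetical key function first_letter:

```python
import itertools as it

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'apricot']  # illustrative

def first_letter(word):
    return word[0]

# Without sorting, 'apricot' ends up in a second, separate 'a' group.
unsorted_keys = [key for key, _ in it.groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'c', 'a']

# Sorting by the same key first brings all words with one initial together.
by_letter = {
    key: list(group)
    for key, group in it.groupby(sorted(words, key=first_letter), key=first_letter)
}
print(by_letter['a'])  # ['apple', 'avocado', 'apricot']
```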

3.8 islice()
islice(iterable, stop)

islice(iterable, start, stop[, step])

Make an iterator that returns selected elements from the iterable. If start is non-zero, then elements from the iterable are skipped until start is reached. Afterward, elements are returned consecutively unless step is set higher than one which results in items being skipped. If stop is None, then iteration continues until the iterator is exhausted, if at all; otherwise, it stops at the specified position. Unlike regular slicing, islice() does not support negative values for start, stop, or step. Can be used to extract related fields from data where the internal structure has been flattened (for example, a multi-line report may list a name field on every third line). 

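Takes elements from an iterable in slice fashion and yields them. islice() can be thought of as a generalization of range(): when the iterable is a long enough sequence of consecutive integers starting at 0, islice(iterable, start, stop, step) yields the same values as range(start, stop, step) for non-negative arguments.

The first form takes a single argument after the iterable, namely stop. It corresponds to taking iterable[:stop] and yielding the elements one at a time.

# islice
print('\nislice example1:')
A = [k for k in range(10)]
for item in it.islice(A, 3):
    print(item)
Output:

islice example1:
0
1
2

The second form takes two or three arguments after the iterable: start, stop, and an optional step, which defaults to 1.

print('\nislice example2:')
A = [k for k in range(10)]
for item in it.islice(A, 2,8,2):
    print(item)
Output:

islice example2:
2
4
6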
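Two points from this section can be checked with a short sketch (the variable names are illustrative): islice() over count() reproduces range(), and, unlike ordinary slicing, islice() also works on one-shot iterators such as generators.

```python
import itertools as it

# islice over count() with (start, stop, step) matches range(start, stop, step).
from_count = list(it.islice(it.count(), 2, 8, 2))
print(from_count)  # [2, 4, 6]

# A generator has no [] slicing, but islice() can still pick a window from it.
squares = (k * k for k in it.count())
window = list(it.islice(squares, 3, 6))
print(window)  # [9, 16, 25]
```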

3.9 starmap()
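starmap(function, iterable) calls function with the arguments unpacked from each tuple in the iterable, which is useful when the arguments are already grouped into tuples. A minimal sketch with illustrative data:

```python
import itertools as it

pairs = [(2, 5), (3, 2), (10, 3)]  # illustrative (base, exponent) tuples

# Each tuple is unpacked into the call: pow(2, 5), pow(3, 2), pow(10, 3).
powers = list(it.starmap(pow, pairs))
print(powers)  # [32, 9, 1000]
```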
3.10 takewhile()
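takewhile(predicate, iterable) yields elements as long as the predicate is true and stops at the first element for which it is false. A minimal sketch with illustrative data:

```python
import itertools as it

data = [1, 4, 6, 3, 8]  # illustrative data

# Iteration stops at 6; the later 3 is never reached.
head = list(it.takewhile(lambda x: x < 5, data))
print(head)  # [1, 4]
```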
3.11 tee()
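tee(iterable, n=2) returns n independent iterators over a single iterable. After the call, the original iterator should not be used directly, otherwise the copies may miss elements. A minimal sketch with illustrative data:

```python
import itertools as it

source = iter([1, 2, 3, 4])  # a one-shot iterator
first, second = it.tee(source, 2)

# Each copy yields the full sequence independently of the other.
first_items = list(first)
second_items = list(second)
print(first_items, second_items)  # [1, 2, 3, 4] [1, 2, 3, 4]
```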

3.12 zip_longest(*iterables, fillvalue=None)
Make an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted. 

If one of the iterables is potentially infinite, then the zip_longest() function should be wrapped with something that limits the number of calls (for example islice() or takewhile()). If not specified, fillvalue defaults to None.

itertools.zip_longest complements Python's built-in zip(). When zip() iterates over several iterables in parallel, the number of iterations is set by the shortest one. Some applications need the longest one to decide instead, and itertools.zip_longest exists for exactly that case.

import itertools as it
fruits = ['apple', 'banana', 'melon', 'strawberry']
prices = [10, 20, 30]
print(list(it.zip_longest(fruits, prices)))
Output: [('apple', 10), ('banana', 20), ('melon', 30), ('strawberry', None)]

Because prices has one element fewer than fruits, zip_longest pairs the last element of fruits, 'strawberry', with None. To supply a more meaningful placeholder, set the optional fillvalue parameter, as in the following example:

import itertools as it
fruits = ['apple', 'banana', 'melon', 'strawberry']
prices = [10, 20, 30]
print(list(it.zip_longest(fruits, prices, fillvalue='Sold out')))
Output: [('apple', 10), ('banana', 20), ('melon', 30), ('strawberry', 'Sold out')]

Pairing 'strawberry' with 'Sold out' reads much more naturally.

For more on the differences between zip and zip_longest, see the post "Python zip, unzip, zip_longest的用法" (Usage of Python zip, unzip, and zip_longest).

4. Combinatoric iterators
4.1 product(*iterables, repeat=1)
Cartesian product of input iterables, with each iterable treated as a set of values.

Roughly equivalent to nested for-loops in a generator expression. For  example, product(A, B) returns the same as ((x,y) for x in A for y in B).

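In these nested loops the rightmost iterable is the innermost loop and the leftmost iterable is the outermost, so the rightmost element advances on every iteration. By analogy, with three inputs product(A, B, C), C behaves like the second hand of a clock, B like the minute hand, and A like the hour hand. Equivalently, C is the rightmost digit of an odometer and A is the leftmost.

To compute the Cartesian product of an iterable with itself, specify the number of repetitions with repeat. For example, product('ABC', repeat=4) is equivalent to product('ABC', 'ABC', 'ABC', 'ABC').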

# product(*iterables, repeat=1)
print('\nproduct example 1 ...')
for item in it.product('AB','CD'):
    print(item)
 
print('\nproduct example 2 ...')
for item in it.product('AB', repeat=4):
    print(item)
Before product() runs, it completely consumes the input iterables, keeping pools of values in memory to generate the products. Accordingly, it is only useful with finite inputs.

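product() first reads all of its input iterables into memory and then generates the results from those stored pools. For that reason product() cannot accept an infinite iterable as an argument.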
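The odometer ordering and the meaning of repeat can be checked with a short sketch. The inputs are illustrative, and the nested comprehension stands in for the equivalent nested loops.

```python
import itertools as it

triples = list(it.product('AB', '01', 'xy'))
# The rightmost input changes fastest, like the last digit of an odometer.
print(triples[:4])  # [('A', '0', 'x'), ('A', '0', 'y'), ('A', '1', 'x'), ('A', '1', 'y')]

# Same order as explicit nested loops, leftmost input outermost.
nested = [(a, b, c) for a in 'AB' for b in '01' for c in 'xy']
print(triples == nested)  # True

# repeat=2 is shorthand for passing the same iterable twice.
same = list(it.product('AB', repeat=2)) == list(it.product('AB', 'AB'))
print(same)  # True
```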

4.2 permutations(iterable, r=None)
Return successive r length permutations of elements in the iterable.

If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.

The permutation tuples are emitted in lexicographic ordering according to the order of the input iterable. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each permutation.

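Returns an iterator over tuples, each an arrangement of r elements taken from the iterable. The second argument r is optional. If it is omitted, it defaults to the length of the first argument, and the result is every full-length permutation.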

import itertools as it
 
# for p in it.permutations(['a','b','c'],3):
for p in it.permutations(['a','b','c']): 
    # Has the same behaviour as the above statement    
    print(p, end=', ')
 
print('')
for p in it.permutations(['a','b','c'],2 ): 
    # Return any permutations of length 2
    print(p, end=', ')
Output:

        ('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'a', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b'), ('c', 'b', 'a'), 
        ('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b'), 

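Let n be the length of the input iterable. If r <= n, the number of permutations returned is n! / (n - r)!; if r > n, zero items are returned. If r is not specified, n! items are returned.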
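These counts can be confirmed by brute force. The sketch below uses an illustrative four-element input and compares the number of generated tuples with n! / (n - r)! for each r:

```python
import itertools as it
from math import factorial

items = ['a', 'b', 'c', 'd']  # illustrative input, n = 4
n = len(items)

counts = {}
for r in range(n + 2):
    generated = sum(1 for _ in it.permutations(items, r))
    expected = factorial(n) // factorial(n - r) if r <= n else 0
    counts[r] = generated
    print('r={0}: generated={1}, formula={2}'.format(r, generated, expected))
```

For r = 0 there is exactly one permutation, the empty tuple, which matches n! / n! = 1.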

4.3 combinations(iterable, r)
Return r length subsequences of elements from the input iterable.

The combination tuples are emitted in lexicographic ordering according to the order of the input iterable. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each combination.

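Returns the combinations of length r of the elements of the input iterable. With n elements in the input, the number of combinations returned is n! / (r! * (n - r)!) when 0 <= r <= n, and zero when r > n.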

# combinations
print('\ncombinations example1:')
for item in it.combinations(['A', 'B', 'C', 'D'], 2):
    print(item)
4.4 combinations_with_replacement(iterable, r)
Return r length subsequences of elements from the input iterable allowing individual elements to be repeated more than once.

The combination tuples are emitted in lexicographic ordering according to the order of the input iterable. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, the generated combinations will also be unique.

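The difference from combinations() is that the same element may be chosen more than once.

# combinations_with_replacement
print('\ncombinations_with_replacement example1:')
k = 0
for item in it.combinations_with_replacement(['A', 'B', 'C', 'D'], 3):
    print(item)
    k += 1
print('There are {0} combinations in total'.format(k))
What is the closed-form expression for the number of r-combinations of n distinct elements when repetition is allowed? It is worth thinking about.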
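One way to approach the question is to compare the count from the example with the standard closed form (n + r - 1)! / (r! * (n - 1)!), which holds for n > 0. The helper cwr_count below is a hypothetical name used only for this sketch.

```python
import itertools as it
from math import factorial

def cwr_count(n, r):
    # Closed form (n + r - 1)! / (r! * (n - 1)!), valid for n > 0.
    return factorial(n + r - 1) // (factorial(r) * factorial(n - 1))

items = ['A', 'B', 'C', 'D']
generated = sum(1 for _ in it.combinations_with_replacement(items, 3))
print(generated, cwr_count(len(items), 3))  # 20 20
```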

5. itertools.pairwise() 
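pairwise() was introduced in Python 3.10.

It yields, in order, tuples of every two adjacent items in an iterable. Note that it does not pair up arbitrary items; only adjacent items are paired, and within each tuple they keep the same order as in the original iterable. Like the other itertools functions, it returns an iterator, which the examples below convert to a list.
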

from itertools import pairwise
 
lst = [1,2,3,4,5]
print("Successive overlapping pairs - ", list(pairwise(lst)))
 
string = "hello educative"
print("Successive overlapping pairs of characters in a string- ", list(pairwise(string)))
The first example returns:

[(1, 2), (2, 3), (3, 4), (4, 5)]

The second example returns:

 [('h', 'e'), ('e', 'l'), ('l', 'l'), ('l', 'o'), ('o', ' '), (' ', 'e'), ('e', 'd'), ('d', 'u'), ('u', 'c'), ('c', 'a'), ('a', 't'), ('t', 'i'), ('i', 'v'), ('v', 'e')]
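On Python versions older than 3.10, similar behavior can be obtained with tee() and zip(). This is a sketch of the well-known recipe; pairwise_sketch is a hypothetical name, not part of itertools.

```python
from itertools import tee

def pairwise_sketch(iterable):
    # s -> (s0, s1), (s1, s2), (s2, s3), ...
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one element
    return zip(first, second)

pairs = list(pairwise_sketch([1, 2, 3, 4, 5]))
print(pairs)  # [(1, 2), (2, 3), (3, 4), (4, 5)]
```

An input with fewer than two items produces no pairs.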

References:

itertools — Functions creating iterators for efficient looping — Python 3.12.1 documentation
python itertools模块详解-CSDN博客 https://blog.csdn.net/chenxy_bwave/article/details/120110095

python itertools详解及使用示例-CSDN博客
