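1. Generators
A generator is an object that returns one value each time next() is called on it, until it raises a StopIteration exception.
The word "generator" usually covers two things: an object that is itself a generator, and a function that contains a yield statement (a generator function), which can loosely be treated as a generator too. A yield statement does two things: like return, it hands a value back to the caller, and it also saves the interpreter's reference to the stack frame, so that the next call resumes from the state where the last yield left off.
Below is a simple generator function: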
def mygenerator():
    yield 1
    yield 2
    yield 3
    yield 4

print(mygenerator())   # <generator object mygenerator at 0x...>
g = mygenerator()
print(next(g))         # 1
print(next(g))         # 2
print(next(g))         # 3
print(next(g))         # 4
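To illustrate the StopIteration behavior described above, here is a minimal sketch written for Python 3. The generator function `count_up_to` is a hypothetical name used only for this illustration. It exhausts a generator by hand and then with a for loop, which handles StopIteration internally.

```python
def count_up_to(limit):
    """Hypothetical generator function: yield 1, 2, ..., limit."""
    n = 1
    while n <= limit:
        yield n
        n += 1

g = count_up_to(2)
first = next(g)
second = next(g)

# A third next() finds no further yield, so the generator raises StopIteration.
try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True

# A for loop calls next() for us and stops cleanly on StopIteration.
collected = []
for value in count_up_to(4):
    collected.append(value)
```

In everyday code the for loop form is the usual choice; calling next() by hand is mostly useful when only the first few values are needed.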
To check whether a function is a generator function, use the helpers in the inspect module:
import inspect
inspect.isgeneratorfunction(mygenerator)
inspect.isgenerator(mygenerator())
The source of inspect.isgeneratorfunction (in Python 2) is as follows:
def isgeneratorfunction(object):
    """Return true if the object is a user-defined generator function.
    Generator function objects provides same attributes as functions.
    See help(isfunction) for attributes listing."""
    return bool((isfunction(object) or ismethod(object)) and
                object.func_code.co_flags & CO_GENERATOR)
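The source above reads the code object through func_code, which exists only in Python 2. As a rough Python 3 counterpart, the same flag test can be written against `__code__`. This is a sketch and not the standard library's actual implementation. `has_generator_flag` and `plain` are hypothetical names, and the helper only handles plain functions.

```python
import inspect

def has_generator_flag(func):
    """Hypothetical helper: test the CO_GENERATOR bit on a plain function."""
    return bool(inspect.isfunction(func) and
                func.__code__.co_flags & inspect.CO_GENERATOR)

def mygenerator():
    yield 1

def plain():
    return 1

flag_for_generator = has_generator_flag(mygenerator)
flag_for_plain = has_generator_flag(plain)

# The hand-written check should agree with inspect.isgeneratorfunction.
agrees_with_inspect = (
    flag_for_generator == inspect.isgeneratorfunction(mygenerator)
    and flag_for_plain == inspect.isgeneratorfunction(plain)
)
```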
In Python 3, the inspect.getgeneratorstate function returns the execution state of a generator. The possible states are GEN_CREATED, GEN_RUNNING, GEN_SUSPENDED and GEN_CLOSED:
>>> import inspect
>>> def mygenerator():
...     yield 1
...
>>> get = mygenerator()
>>> get
<generator object mygenerator at 0x7ff49fdfe4c0>
>>> inspect.getgeneratorstate(get)
'GEN_CREATED'
>>> next(get)
1
>>> inspect.getgeneratorstate(get)
'GEN_SUSPENDED'
>>> next(get)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
>>> inspect.getgeneratorstate(get)
'GEN_CLOSED'
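Besides the three states that appear in the session above, there is also GEN_RUNNING, which can only be observed while the generator body is executing. The Python 3 sketch below has a generator report its own state. `self_inspecting` is a hypothetical name, and it reads the module-level variable `gen`, which is assumed to be bound before the first next() call.

```python
import inspect

def self_inspecting():
    # This line runs inside the generator body, so the state is GEN_RUNNING.
    yield inspect.getgeneratorstate(gen)

gen = self_inspecting()

states = [inspect.getgeneratorstate(gen)]      # before the first next()
states.append(next(gen))                       # reported from inside the body
states.append(inspect.getgeneratorstate(gen))  # paused at the yield
gen.close()
states.append(inspect.getgeneratorstate(gen))  # after close()
```

Here close() is used to finish the generator instead of running it to exhaustion; either way the final state is GEN_CLOSED.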
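Generators are an efficient way to handle large amounts of data produced on the fly. Processing such data normally means loading all of it into memory at once, which is expensive; with a generator, each item is created in memory only when the loop reaches it.
Here the memory available to Python is limited to 128 MB: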
[root@linux-node1 ~]# python
Python 2.7.5 (default, Nov 6 2016, 00:28:07)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> a = list(range(10000000))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError    # out of memory
# iterate lazily with xrange instead (values are produced one at a time, as with a generator)
>>> for value in xrange(10000000):
...     if value == 50000:
...         print("found it")
...         break
...
found it
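The same contrast can be sketched in Python 3 without a memory limit by comparing the size of a list with the size of an equivalent generator expression. The names `N` and `find_first` are hypothetical, and note that sys.getsizeof only measures the container itself (for the list, its array of references), so the real gap is larger than the numbers suggest.

```python
import sys

N = 1000000

as_list = [n * 2 for n in range(N)]       # every value exists in memory at once
as_generator = (n * 2 for n in range(N))  # values are produced one at a time

list_bytes = sys.getsizeof(as_list)
generator_bytes = sys.getsizeof(as_generator)

def find_first(iterable, target):
    """Hypothetical helper: stop at the first match, like the loop above."""
    for value in iterable:
        if value == target:
            return value
    return None

# Only the values up to the match are ever computed.
found = find_first(as_generator, 100000)
```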
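A generator also has a send() method, which passes a value into the generator function: the value becomes the result of the paused yield expression. The example below uses this to get a concurrency-like effect within a single thread: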
#!/usr/bin/env python
# _*_ coding:utf-8 _*_
__author__ = 'Charles Chang'
# A producer makes baozi (steamed buns) and two consumers eat them
import time

def consumer(name):
    print("\033[32m%s ready to eat baozi\033[0m" % name)
    while True:
        baozi = yield   # pauses here until the producer calls send()
        print("\033[31m baozi [%s] is coming,eaten by [%s]!\033[0m" % (baozi, name))

def producer(name):
    c = consumer('A')   # c and c2 are both generators
    c2 = consumer('B')
    next(c)             # advance each consumer to its first yield
    next(c2)
    print("\033[31m begin to eat baozi\033[0m")
    while True:         # runs until interrupted
        time.sleep(1)
        print("two baozi have been done")
        c.send("delicious")    # "delicious" is passed into consumer and assigned to baozi
        c2.send("delicious")

producer("haha")
Result:
A ready to eat baozi
B ready to eat baozi
begin to eat baozi
two baozi have been done
baozi [delicious] is coming,eaten by [A]!
baozi [delicious] is coming,eaten by [B]!
two baozi have been done
baozi [delicious] is coming,eaten by [A]!
baozi [delicious] is coming,eaten by [B]!
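In the example above, values only flow into the consumers. send() also returns whatever the generator yields next, so data can flow in both directions. The sketch below, with the hypothetical name `running_average`, shows this in Python 3.

```python
def running_average():
    """Hypothetical coroutine: receive numbers via send(), yield the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # hand back the current mean, wait for a new value
        total += value
        count += 1
        average = total / count

avg = running_average()
primed = next(avg)              # run to the first yield; nothing averaged yet

results = [avg.send(v) for v in (10, 20, 30)]
avg.close()
```

The initial next() call is required: a value can only be sent to a generator that is already paused at a yield.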
Generator expressions
(x.upper() for x in ['hello', 'world'])   # generator
[x.upper() for x in ['hello', 'world']]   # list
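The practical difference between the two forms is when the work happens. The short Python 3 sketch below (variable names are illustrative) shows that the generator expression produces values on demand and can be consumed only once, while the list is built immediately and can be reused.

```python
words = ['hello', 'world']

lazy = (w.upper() for w in words)    # nothing is computed yet
eager = [w.upper() for w in words]   # both strings are computed right away

first = next(lazy)    # computes only the first item
rest = list(lazy)     # consumes what is left
again = list(lazy)    # the generator is now exhausted

eager_twice = (list(eager), list(eager))  # a list can be iterated repeatedly
```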
2. List comprehensions
Several for and if clauses can be combined in one comprehension to filter the results:
x = [word.capitalize()
     for line in ("hello world?", "world!", "or not")
     for word in line.split()
     if not word.startswith("or")]
print(x)
Result:
['Hello', 'World?', 'World!', 'Not']
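The clauses of a comprehension are read left to right, in the same order as the equivalent nested statements. The sketch below writes the example above both ways; `lines`, `comprehension` and `expanded` are illustrative names.

```python
lines = ("hello world?", "world!", "or not")

comprehension = [word.capitalize()
                 for line in lines
                 for word in line.split()
                 if not word.startswith("or")]

# The same logic as explicit nested loops: outer for, inner for, then the if.
expanded = []
for line in lines:
    for word in line.split():
        if not word.startswith("or"):
            expanded.append(word.capitalize())
```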
3. map and filter
In Python 2 these functions return lists; in Python 3 they return lazy iterators.
To get a lazy iterator in Python 2, use the itertools versions instead: itertools.ifilter and itertools.imap.
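A minimal Python 3 sketch of the lazy behavior described above: map and filter are chained without building any intermediate list, and the result behaves like any other one-shot iterator. The variable names are illustrative.

```python
numbers = range(10)

evens = filter(lambda n: n % 2 == 0, numbers)   # lazy: nothing is filtered yet
squares = map(lambda n: n * n, evens)           # lazy: nothing is squared yet

is_list = isinstance(squares, list)   # in Python 3 the result is not a list

first_square = next(squares)   # pulls a single value through both stages
remaining = list(squares)      # consumes the rest
leftover = list(squares)       # the iterator is exhausted afterwards
```

When a real list is needed in Python 3, wrapping the call in list() gives the Python 2 behavior.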