Python generators are often considered a somewhat advanced topic, but they are easy to understand once you use them regularly. After working with generators for a while, you will often find them more readable and more performant than the alternatives.
In this video, we will look at what a Python generator is, how and why you would use one, and the performance benefits generators provide.
list and yield
There are three ways to square every element of a list.
- Traditional approach: build a list and return it from the function
def square_number(nums):
    result = []
    for it in nums:
        result.append(it * it)
    return result

my_nums = square_number([1, 2, 3, 4, 5])
print(my_nums)
[1, 4, 9, 16, 25]
- Generator approach: use the yield keyword
def square_number(nums):
    for it in nums:
        yield it * it  # more readable

my_nums = square_number([1, 2, 3, 4, 5])  # creates a generator object instead of returning a list
print(my_nums)  # prints the generator object, including its memory address
print(next(my_nums))
print(next(my_nums))
print(next(my_nums))
print(next(my_nums))
print(next(my_nums))
<generator object square_number at 0x7f7ae8236fc0> # a generator object is produced instead of a list
1
4
9
16
25
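To make the "one value at a time" behavior concrete, here is a minimal sketch that records when the generator body actually runs. The function `traced_squares` and the `events` list are hypothetical names introduced only for this illustration; they are not part of the original example.

```python
events = []  # hypothetical log of what ran, in order


def traced_squares(nums):
    """Yield squares, recording each step in `events`."""
    events.append("started")
    for it in nums:
        events.append(f"computing {it}")
        yield it * it
    events.append("finished")


gen = traced_squares([1, 2, 3])
events_after_creation = list(events)  # calling the function has not run its body yet

first = next(gen)  # runs the body up to the first yield, then pauses
events_after_first = list(events)

print(events_after_creation)
print(first)
print(events_after_first)
```

Creating the generator runs none of the function body; each call to next() resumes it only until the next yield.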
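You can also iterate over the generator object with a for loop, which has the same effect as calling next(my_nums) repeatedly. The for loop stops automatically once the generator is exhausted, so it never runs past the end; calling **next()** on an exhausted generator raises a StopIteration error.
my_nums = square_number([1, 2, 3, 4, 5])
for num in my_nums:
    print(num)  # the for loop stops by itself when the values run out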
1
4
9
16
25
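The difference between the two ways of consuming a generator can be sketched as follows. This is a small illustration rather than part of the original walkthrough; the variable names `exhausted` and `fallback` are made up for the example.

```python
def square_number(nums):
    for it in nums:
        yield it * it


my_nums = square_number([1, 2])
print(next(my_nums))  # 1
print(next(my_nums))  # 4

# A third call has nothing left to produce, so next() raises StopIteration.
try:
    next(my_nums)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)

# next() also accepts a default value that is returned instead of raising.
fallback = next(my_nums, None)
print(fallback)
```

A for loop handles StopIteration internally, which is why it ends cleanly without any extra code.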
- Another approach: a list comprehension
my_nums = [x * x for x in [1, 2, 3, 4, 5]]
for num in my_nums:
    print(num)
1
4
9
16
25
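The comprehension above builds a full list. As a related sketch (an addition, not something the original text covers), swapping the square brackets for parentheses gives a generator expression, which produces the same values lazily. The names `squares_list`, `squares_gen`, `collected`, and `leftover` are illustrative only.

```python
squares_list = [x * x for x in [1, 2, 3, 4, 5]]  # list: all values built immediately
squares_gen = (x * x for x in [1, 2, 3, 4, 5])   # generator: values produced on demand

print(type(squares_list).__name__)
print(type(squares_gen).__name__)

collected = []
for num in squares_gen:
    collected.append(num)
print(collected)

# A generator can only be consumed once; a second pass yields nothing.
leftover = list(squares_gen)
print(leftover)
```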
- Converting a generator to a regular list with list()
Converting with the list() function gives up the generator's advantage of a small memory footprint.
my_nums = square_number([1, 2, 3, 4, 5])
print(list(my_nums))
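One rough way to see this cost is to compare object sizes with sys.getsizeof. This is only a sketch: getsizeof reports the size of the container itself (for a list, its array of references), not the elements it points to, so the real gap is larger than the numbers shown. The input size of 100,000 is an arbitrary choice for the example.

```python
import sys


def square_number(nums):
    for it in nums:
        yield it * it


nums = range(100_000)
lazy = square_number(nums)         # generator object: nothing computed yet
eager = list(square_number(nums))  # list: every square computed and stored

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)
print(lazy_size, eager_size)
```

The generator's size stays small no matter how many values it will eventually produce, while the list grows with the input.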
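Performance analysis: the advantages of using yield
The benefits of using a generator are:
- More readable: the code is simpler and easier to follow.
- Low memory usage: with next(), only one value is computed at a time. Forcing the generator into a list with list() gives up this benefit.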
import mem_profile  # helper module (not in the standard library) that reports process memory
import random
import time

names = ['John', 'Corey', 'Adam', 'Steve', 'Rick', 'Thomas']
majors = ['Math', 'Engineering', 'CompSci', 'Arts', 'Business']

print('Memory (Before): {}Mb'.format(mem_profile.memory_usage_psutil()))

def people_list(num_people):
    result = []
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        result.append(person)
    return result

def people_generator(num_people):
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        yield person

# List version: uncomment to compare against the generator version below.
# t1 = time.perf_counter()
# people = people_list(1000000)
# t2 = time.perf_counter()

# Generator version: creating the generator does no work until it is iterated.
t1 = time.perf_counter()  # time.clock() was removed in Python 3.8
people = people_generator(1000000)
t2 = time.perf_counter()

print('Memory (After) : {}Mb'.format(mem_profile.memory_usage_psutil()))
print('Took {} Seconds'.format(t2 - t1))
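The script above depends on a `mem_profile` helper module that is not shown here. As an alternative sketch that needs only the standard library, the same comparison can be made with tracemalloc. Note that this swaps in a different measurement technique: tracemalloc tracks Python allocations rather than total process memory, so the numbers are not directly comparable to the helper's output. The `measure` and `make_person` functions and the `NUM_PEOPLE` value are hypothetical names and choices made for this example.

```python
import random
import time
import tracemalloc

names = ['John', 'Corey', 'Adam', 'Steve', 'Rick', 'Thomas']
majors = ['Math', 'Engineering', 'CompSci', 'Arts', 'Business']
NUM_PEOPLE = 20_000  # arbitrary size, kept small so the sketch runs quickly


def make_person(i):
    return {'id': i, 'name': random.choice(names), 'major': random.choice(majors)}


def people_list(num_people):
    return [make_person(i) for i in range(num_people)]


def people_generator(num_people):
    for i in range(num_people):
        yield make_person(i)


def measure(factory, num_people):
    """Consume every item one at a time; return (peak_bytes, seconds, count)."""
    tracemalloc.start()
    start = time.perf_counter()
    count = 0
    for _ in factory(num_people):
        count += 1
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # returns (current, peak)
    tracemalloc.stop()
    return peak, elapsed, count


list_peak, list_time, list_count = measure(people_list, NUM_PEOPLE)
gen_peak, gen_time, gen_count = measure(people_generator, NUM_PEOPLE)

print('List:      peak {} bytes, {:.4f} seconds'.format(list_peak, list_time))
print('Generator: peak {} bytes, {:.4f} seconds'.format(gen_peak, gen_time))
```

Unlike the timing in the script above, this sketch iterates over every item in both cases, so both versions do the same amount of work and the difference shows up in peak memory rather than in creation time.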
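References
This article is mainly based on the video below. I translated it and tested the code myself. Thanks to the video's author for generously sharing it!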