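15. Using variables from the enclosing scope in a closure
- Problem: a specialized sort. Sort a list of numbers so that the numbers that also appear in another collection come first.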
def sort_priority(values, group):
    def helper(x):
        if x in group:
            return (0, x)
        return (1, x)
    values.sort(key=helper)
"""test"""
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 1, 5, 7}
sort_priority(numbers, group)
print(numbers)
[1, 2, 5, 7, 3, 4, 6, 8]
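How it works:
- 1) Closure: a closure is a function defined inside some scope that can refer to the variables of that scope. This is why helper can access the group argument of sort_priority.
- 2) Python functions are first-class objects, so a function can be used like any other value. This is why the helper closure can be passed as the key argument of the sort method.
- 3) Python compares two tuples (or lists) element by element: it first compares the items at index 0 and, if they are equal, goes on to compare the items at index 1.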
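The three points above can be seen together in a small sketch. The names make_priority_key and key_func are made up for illustration and do not appear in the original notes.

```python
def make_priority_key(group):
    # key_func is a closure: it can still read `group`
    # after make_priority_key has returned.
    def key_func(x):
        return (0, x) if x in group else (1, x)
    # Functions are first-class objects, so one can be returned like any value.
    return key_func

key = make_priority_key({2, 5})
keys = [key(n) for n in [3, 5, 1, 2]]
print(keys)              # [(1, 3), (0, 5), (1, 1), (0, 2)]

# Tuples compare by index 0 first, then by index 1 when index 0 ties.
print(sorted(keys))      # [(0, 2), (0, 5), (1, 1), (1, 3)]
print((0, 5) < (1, 1))   # True: index 0 already decides
print((1, 1) < (1, 3))   # True: index 0 ties, index 1 decides
```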
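Getting data out of a closure
- Use the nonlocal statement: it says that an assignment to the named variable should look that variable up in the enclosing scope instead of creating a new local variable. Limitation: the lookup does not extend to module level.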
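To see why the statement is needed, here is a minimal sketch of the same kind of helper written without nonlocal. The function name sort_priority_buggy is hypothetical. The assignment inside helper creates a new local variable, so the flag in the enclosing function never changes.

```python
def sort_priority_buggy(numbers, group):
    found = False
    def helper(x):
        if x in group:
            found = True  # binds a NEW local inside helper, not the outer found
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found

numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 1, 5, 7}
found = sort_priority_buggy(numbers, group)
print('Found:', found)   # Found: False, although the sort itself is correct
print(numbers)           # [1, 2, 5, 7, 3, 4, 6, 8]
```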
def sort_priority3(numbers, group):
    found = False
    def helper(x):
        nonlocal found  # rebind found in sort_priority3's scope
        if x in group:
            found = True
            return (0, x)
        return (1, x)
    numbers.sort(key=helper)
    return found
"""test"""
found = sort_priority3(numbers, group)
print('Found:', found)
print(numbers)
Found: True
[1, 2, 5, 7, 3, 4, 6, 8]
"""The same result as the nonlocal version, using a helper class"""
class Sorter(object):
    def __init__(self, group):
        self.group = group
        self.found = False
    def __call__(self, x):
        if x in self.group:
            self.found = True
            return (0, x)
        return (1, x)
sorter = Sorter(group)
numbers.sort(key=sorter)
assert sorter.found is True
16. Rewriting list-returning functions as generators
- Problem: find the position within the string of the first letter of every word.
def index_words(text):
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text):
        if letter == ' ':
            result.append(index + 1)
    return result
"""test"""
address = 'Four score and seven years ago...'
result = index_words(address)
print(result)
[0, 5, 11, 15, 21, 27]
"""Rewritten as a generator (yield)"""
def index_word_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1
"""Calling the generator returns an iterator, which is passed to the built-in list function"""
result = list(index_word_iter(address))
print(result)
[0, 5, 11, 15, 21, 27]
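A practical difference from the list version is that a generator produces values on demand. The sketch below redefines index_word_iter so it runs on its own, then pulls only the first few positions with next and itertools.islice; nothing past the requested items is computed.

```python
import itertools

def index_word_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1

address = 'Four score and seven years ago...'

# Each next() call resumes the generator until its next yield.
it = index_word_iter(address)
first_two = [next(it), next(it)]
print(first_two)      # [0, 5]

# islice stops after three items without consuming the rest.
first_three = list(itertools.islice(index_word_iter(address), 3))
print(first_three)    # [0, 5, 11]
```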
17. Iterating over arguments
- Problem: the data set is a list of visitor counts, one per city. Compute each city's share of the total number of visitors.
def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
"""test"""
visits = [15, 35, 80]
percentages = normalize(visits)
print(percentages)
[11.538461538461538, 26.923076923076923, 61.53846153846154]
"""With a very large input, holding all the data in a list may crash the program, so read the data with a generator (yield) instead"""
def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)
"""test"""
it = read_visits('file.txt')
percentages = normalize(it)
print(percentages)
"""The output is []. The reason is that an iterator produces its results only once, and the sum call inside normalize has already exhausted it."""
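# For example:
it = read_visits('file.txt')
print(list(it))
print(list(it))
- [15, 35, 80]
- []
"""Fix: build a list from the iterator and iterate over that list"""
def normalize_copy(numbers):
    numbers = list(numbers)  # copy the iterator's contents into a list
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = value * 100 / total
        result.append(percent)
    return result
"""Drawback of the above: copying a large amount of data out of the iterator can exhaust memory"""
"""Instead: accept a function as the argument, one that returns a new iterator every time it is called"""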
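def normalize_func(get_iter):
    total = sum(get_iter())  # New iterator
    result = []
    for value in get_iter():  # New iterator
        percent = 100 * value / total
        result.append(percent)
    return result
"""Because a function is passed in, every call uses a new generator"""
percentages = normalize_func(lambda: read_visits('file.txt'))
"""The best approach: the iterator protocol"""
- 1) When Python executes a statement such as for x in foo, it actually calls iter(foo).
- 2) The built-in iter function in turn calls the special method foo.__iter__.
- 3) That method must return an iterator object, which itself implements the special method __next__.
- 4) The for loop then repeatedly calls the built-in next function on the iterator object until it is exhausted and raises StopIteration.
"""Implement the __iter__ method as a generator in a class"""
class ReadVisits(object):
    def __init__(self, data_path):
        self.data_path = data_path
    def __iter__(self):
        # Reopen the file on every call so each pass starts from the beginning.
        with open(self.data_path) as f:
            for line in f:
                yield int(line)
visits = ReadVisits('file.txt')
percentages = normalize(visits)
print(percentages)
"""The sum call inside normalize invokes ReadVisits.__iter__ to get a new iterator, and the for loop calls __iter__ again to get another new iterator."""
[11.538461538461538, 26.923076923076923, 61.53846153846154]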
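The four protocol steps listed earlier in this item can be traced by hand. The sketch below uses a hypothetical in-memory class, Visits, as a stand-in for a file-backed container so that it runs without any data file, and spells out roughly what a for loop does.

```python
class Visits:
    """Hypothetical in-memory container; __iter__ is written as a generator."""
    def __init__(self, values):
        self.values = values
    def __iter__(self):
        for value in self.values:
            yield value

foo = Visits([15, 35, 80])

# Roughly what `for x in foo:` does.
iterator = iter(foo)            # steps 1-2: iter(foo) calls foo.__iter__()
seen = []
while True:
    try:
        x = next(iterator)      # steps 3-4: next() calls iterator.__next__()
    except StopIteration:       # raised once the iterator is exhausted
        break
    seen.append(x)
print(seen)                     # [15, 35, 80]

# Every new pass calls __iter__ again, so the container can be read repeatedly.
print(sum(foo))                 # 130
print(list(foo))                # [15, 35, 80]
```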
>
* Conventions of the iterator protocol:
  * When an iterator object is passed to the built-in iter function, iter returns that same iterator.
  * When a container object is passed to iter, iter returns a new iterator object each time.
"""Make sure the argument passed in by the caller is not itself an iterator."""
def normalize_defensive(numbers):
    if iter(numbers) is iter(numbers):  # An iterator -- bad!
        raise TypeError('Must supply a container')
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
"""test"""
visits = [15, 35, 80]
normalize_defensive(visits)
print('No Error')
it = iter(visits)
normalize_defensive(it)
No Error
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-96c56045e9b0> in <module>()
14 print('No Error')
15 it = iter(visits)
---> 16 normalize_defensive(it)
<ipython-input-8-96c56045e9b0> in normalize_defensive(numbers)
2 def normalize_defensive(numbers):
3 if iter(numbers) is iter(numbers): # An iterator --bad!
----> 4 raise TypeError('Must supply a container')
5 total = sum(numbers)
6 result = []
TypeError: Must supply a container
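The identity test in normalize_defensive relies on the two iter conventions, which can be checked directly. The sketch below also shows one possible variant that rejects iterators with an isinstance check against collections.abc.Iterator. The name normalize_defensive_abc is hypothetical, and this variant is an alternative to the check in the notes, not a replacement for it.

```python
from collections.abc import Iterator

visits = [15, 35, 80]
it = iter(visits)

# Convention 1: iter() on an iterator returns that same object.
print(iter(it) is it)                  # True
# Convention 2: iter() on a container returns a new iterator each time.
print(iter(visits) is iter(visits))    # False

def normalize_defensive_abc(numbers):
    # Hypothetical variant: detect iterators by type instead of by identity.
    if isinstance(numbers, Iterator):
        raise TypeError('Must supply a container')
    total = sum(numbers)
    return [100 * value / total for value in numbers]

percentages = normalize_defensive_abc(visits)
print(len(percentages))                # 3

try:
    normalize_defensive_abc(iter(visits))
except TypeError as exc:
    error_message = str(exc)
print(error_message)                   # Must supply a container
```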