1. Overview of Built-In Sequences
The standard library offers a rich selection of sequence types implemented in C:
1.1 Container sequences vs flat sequences
- Container sequences:
  - list, tuple, collections.deque (can hold items of different types)
  - hold references to the objects they contain (they store each object's address)
- Flat sequences:
  - str, bytes, bytearray, memoryview, array.array (hold items of one type)
  - physically store the value of each item within their own memory space (they store the items themselves)
  - more compact
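1.2 Mutable vs immutable
- mutable sequences
  - list, bytearray, array.array, collections.deque, memoryview
- immutable sequences
  - tuple, str, bytes
In terms of abstract base classes, the mutable ABC is a subclass of the immutable one: collections.abc.MutableSequence inherits from collections.abc.Sequence.
Sequence (superclass) <=== MutableSequence (subclass)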
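A quick way to check the relationship above is to ask the ABCs in collections.abc directly. This is only a minimal sketch; the variable names are illustrative.

```python
from collections import abc

# MutableSequence extends Sequence, so it inherits the read-only protocol
mutable_extends_sequence = issubclass(abc.MutableSequence, abc.Sequence)

# Built-in types are registered as virtual subclasses of these ABCs
list_is_mutable = issubclass(list, abc.MutableSequence)
tuple_is_mutable = issubclass(tuple, abc.MutableSequence)
tuple_is_sequence = issubclass(tuple, abc.Sequence)

print(mutable_extends_sequence, list_is_mutable, tuple_is_mutable, tuple_is_sequence)
```

Note that this is about the ABCs only: list is not a subclass of tuple.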
2. List Comprehensions and Generator Expressions
list comprehension --> listcomp
generator expression --> genexp
A listcomp builds a list; a genexp can be used to build any other type of sequence.
2.1 List Comprehensions and Readability
A listcomp is meant to do one thing only: to build a new list
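In Python 2.x, a variable assigned inside a listcomp leaked: if it had the same name as a variable outside, the outer variable was overwritten. In Python 3.x, a listcomp has its own local scope, so an outer variable with the same name keeps its value no matter what happens inside the listcomp.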
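The scoping rule can be checked with a small sketch in Python 3 (the names x and codes are just for illustration):

```python
x = 'ABC'
# The x inside the listcomp is local to it; the outer x is only read once,
# as the iterable of the for clause.
codes = [ord(x) for x in x]

print(x)      # ABC
print(codes)  # [65, 66, 67]
```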
2.2 Listcomps vs map and filter
Listcomps do everything the map and filter functions do, without the contortions of the functionally challenged Python lambda.
A listcomp is not slower than the equivalent map and filter combination.
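For comparison, here is the same list built both ways. This is an illustrative sketch with made-up data, not a benchmark.

```python
nums = range(10)

# listcomp: filter and transform in one readable expression
squares_lc = [n * n for n in nums if n % 2 == 0]

# map + filter: needs two lambdas to express the same thing
squares_mf = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, nums)))

print(squares_lc)  # [0, 4, 16, 36, 64]
```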
2.3 Cartesian Products
listcomp can generate lists from the Cartesian product of two or more iterables.
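A minimal sketch of a Cartesian product built with two for clauses (the colors and sizes lists are sample data):

```python
colors = ['black', 'white']
sizes = ['S', 'M', 'L']

# The clauses nest left to right: for each color, iterate over every size
tshirts = [(color, size) for color in colors for size in sizes]

print(len(tshirts))  # 6
print(tshirts[0])    # ('black', 'S')
```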
2.4 Generator Expressions
To fill up other sequence types, a genexp is the way to go.
genexp – building non-list sequences
A genexp saves memory because it yields items one by one using the iterator protocol instead of building a whole list to feed another constructor.
Genexps use the same syntax as listcomps, but are enclosed in parentheses () rather than brackets [].
A generator expression returns a generator object.
The array constructor takes two arguments:
- argument 1: the storage type (typecode)
- argument 2: the numerical values
code | C type | Python type | min bytes |
---|---|---|---|
'b' | signed char | int | 1 |
'B' | unsigned char | int | 1 |
'u' | Py_UNICODE | Unicode character | 2 |
'h' | signed short | int | 2 |
'H' | unsigned short | int | 2 |
'i' | signed int | int | 2 |
'I' | unsigned int | int | 2 |
'l' | signed long | int | 4 |
'L' | unsigned long | int | 4 |
'f' | float | float | 4 |
'd' | double | float | 8 |
Because a generator expression produces values one at a time, feeding a genexp to a for loop when computing a Cartesian product (e.g. one with 1,000,000 items) saves the memory that building the whole list would take.
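The points in this section can be sketched as follows: a genexp feeding the tuple and array constructors, and a genexp producing a Cartesian product one item at a time. The sample data is made up.

```python
from array import array

symbols = 'abc'
# A genexp passed as the only argument needs no extra parentheses
code_tuple = tuple(ord(s) for s in symbols)
# array takes two arguments, so the genexp must be parenthesized
code_array = array('I', (ord(s) for s in symbols))

colors = ['black', 'white']
sizes = ['S', 'M', 'L']
# No list of 6 items is ever built; items are produced on demand
pairs = (f'{c} {s}' for c in colors for s in sizes)
first = next(pairs)
rest = list(pairs)

print(code_tuple)  # (97, 98, 99)
print(first)       # black S
```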
3. Tuples Are Not Just Immutable Lists
tuples do double duty:
- can be used as immutable lists
- can be used as records with no field names
3.1 Tuples as Records
each item in the tuple holds the data for one field, and the position of the item gives its meaning.
when using a tuple as a collection of fields, the number of items is often fixed and their order is always vital.
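A small sketch of tuples used as records, where position carries the meaning (the records below are made-up sample data):

```python
# (country_code, passport_number)
traveler_ids = [('USA', '31195855'), ('BRA', 'CE342567'), ('ESP', 'XDA205856')]

# Unpacking by position; _ marks a field we do not need
countries = [country for country, _ in traveler_ids]

# (name, year, population in thousands)
city, year, pop = ('Tokyo', 2003, 32450)

print(countries)  # ['USA', 'BRA', 'ESP']
```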
3.2 Tuple Unpacking
Tuple unpacking has two main uses:
- Swapping the values of two variables without using a temporary variable:
  a, b = b, a
- Prefixing an argument with a star (*tuple_name) when calling a function, which unpacks an iterable into separate arguments.
It also enables functions to return multiple values in a way that is convenient to the caller: the os.path.split() function builds a tuple (path, last_part) from a filesystem path.
Using * to grab excess items
Defining function parameters with *args to grab arbitrary excess arguments is a classic Python feature.
Parallel assignment
In parallel assignment, the * prefix can be applied to exactly one variable, but it can appear in any position.
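The uses listed above, in one short sketch. The file path is a hypothetical example and is never opened.

```python
import os

# Swap without a temporary variable
a, b = 1, 2
a, b = b, a

# Star-unpack a tuple into positional arguments
t = (20, 8)
quotient, remainder = divmod(*t)

# A function returning a tuple, unpacked by the caller
_, filename = os.path.split('/home/user/.ssh/id_rsa.pub')

# * grabs the excess items and may appear in any position
head, *body, tail = range(5)

print(a, b)                 # 2 1
print(quotient, remainder)  # 2 4
print(filename)             # id_rsa.pub
print(head, body, tail)     # 0 [1, 2, 3] 4
```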
3.3 Nested Tuple Unpacking
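Unpacking also works with nested structures, as long as the target has the same shape as the value. A minimal sketch with made-up sample records:

```python
# (name, country code, population in millions, (latitude, longitude))
metro_areas = [
    ('Tokyo', 'JP', 36.933, (35.689722, 139.691667)),
    ('Sao Paulo', 'BR', 19.649, (-23.547778, -46.635833)),
]

# The last field is itself a tuple, unpacked in place as (lat, lon)
western = [name for name, _, _, (lat, lon) in metro_areas if lon <= 0]

print(western)  # ['Sao Paulo']
```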
3.4 Named Tuples
The collections.namedtuple function is a factory that produces subclasses of tuple enhanced with field names and a class name, which helps debugging.
    from collections import namedtuple

    City = namedtuple('City', 'name country population coordinates')
    # raises TypeError if the number of values does not match the number of fields
    tokyo = City('Tokyo', 'JP', 36.933, (35.68922, 139.691667))
    print(tokyo)
    # City(name='Tokyo', country='JP', population=36.933, coordinates=(35.68922, 139.691667))
    print(tokyo.population)   # 36.933
    print(tokyo.coordinates)  # (35.68922, 139.691667)
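namedtuple() takes two arguments:
- the class name
- the names of the fields of the class, given as either
  - an iterable of strings, or
  - a single string of space-delimited field names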
attributes and methods of namedtuple
    from collections import namedtuple

    City = namedtuple('City', 'name country population coordinates')
    tokyo = City('Tokyo', 'JP', 36.933, (35.68922, 139.691667))
    # a tuple containing all the field names of this class
    print(City._fields)

    LatLong = namedtuple('LatLong', 'lat long')
    delhi_data = ('Delhi NCR', 'IN', 21.935, LatLong(28.613889, 77.208889))
    # _make() builds an instance of the class from an iterable;
    # equivalent to City(*delhi_data)
    delhi = City._make(delhi_data)
    # instance._asdict() returns the named tuple as a dict
    print(delhi._asdict())
    for k, v in delhi._asdict().items():
        print(k + ':', v)
- _fields attribute: a tuple with the field names of the class
- _make() method: instantiates a named tuple from an iterable; City(*delhi_data) would do the same
- _asdict() method: returns a dict built from the named tuple instance (a collections.OrderedDict before Python 3.8)
3.5 Tuples as Immutable Lists
Judging from the list vs. tuple comparison table, tuple supports almost every list method except those that add or remove items.
When using a tuple as an immutable variation of list, it helps to know how similar they actually are.
4. Slicing
a common feature of all sequence types in Python
4.1 Why slices and ranges exclude the last item
This convention has the following advantages:
- It's easy to see the length of a slice or range when only the stop position is given: range(3) and arr[:3] both produce 3 items.
- It's easy to compute the length of a slice or range when start and stop are given: just subtract stop - start.
- It's easy to split a sequence in two parts at any index x, without overlapping: arr[:x] and arr[x:].
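The three advantages above can be checked directly. This is a minimal sketch with a sample list.

```python
arr = [10, 20, 30, 40, 50, 60]

# Only the stop position is given: the length equals stop
first_three = arr[:3]

# Start and stop given: the length is stop - start
middle = arr[1:4]

# Split at any index x without overlap
x = 2
left, right = arr[:x], arr[x:]

print(len(first_three), len(middle))  # 3 3
print(left, right)                    # [10, 20] [30, 40, 50, 60]
```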
4.2 Slice objects
s[a:b:c] := seq[start:stop:step]
- s – the list itself
- a – start point
- b – stop point (excluded)
- c – step
To evaluate the expression seq[start:stop:step], Python calls seq.__getitem__(slice(start, stop, step)).
A slice object can be assigned to a variable, and that variable can then be used inside [].
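Naming slices can make fixed-width parsing more readable. A sketch assuming a made-up record layout (6-character SKU, 10-character name, then quantity):

```python
# Build a fixed-width record: 6-char SKU, 10-char name, 4-char quantity
record = 'ABC123' + 'Widget'.ljust(10) + '0042'

# Slice objects stored in variables, then used inside []
SKU = slice(0, 6)
NAME = slice(6, 16)
QTY = slice(16, None)

sku = record[SKU]
name = record[NAME].strip()
qty = int(record[QTY])

# record[SKU] is the same call Python makes for record[0:6]
same = record.__getitem__(slice(0, 6)) == record[0:6]

print(sku, name, qty)  # ABC123 Widget 42
```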
multidimensional slicing and ellipsis
The [] operator can also take multiple indexes or slices separated by commas.
The __getitem__ and __setitem__ special methods that handle the [] operator simply receive the indices in a[i, j] as a tuple, i.e., to evaluate a[i, j], Python calls a.__getitem__((i, j)).
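To see the tuple arrive in __getitem__, here is a sketch with a hypothetical Grid class. It is not a real library type and exists only to record what [] passes in.

```python
class Grid:
    """Hypothetical 2-D container, only to show what __getitem__ receives."""

    def __init__(self, rows):
        self._rows = rows
        self.last_index = None

    def __getitem__(self, index):
        self.last_index = index  # remember exactly what [] passed in
        i, j = index             # a[i, j] arrives as the tuple (i, j)
        return self._rows[i][j]


g = Grid([[1, 2, 3], [4, 5, 6]])
value = g[1, 2]
index_seen = g.last_index

# Each element of the tuple may itself be a slice
row_tail = g[0, 1:]

print(value, index_seen)  # 6 (1, 2)
print(row_tail)           # [2, 3]
```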
assigning to slices
Mutable sequences can be grafted, excised, and otherwise modified in place using slice notation on the left side of an assignment statement or as the target of a del statement.
When the target of the assignment is a slice, the right side must be an iterable object, even if it has just one item.
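A short sketch of grafting, excising, and the iterable requirement, using a sample list:

```python
lst = list(range(10))

lst[2:5] = [20, 30]   # graft: 3 items replaced by 2
after_graft = lst[:]  # [0, 1, 20, 30, 5, 6, 7, 8, 9]

del lst[5:7]          # excise two items
after_del = lst[:]    # [0, 1, 20, 30, 5, 8, 9]

lst[3::2] = [11, 22]  # extended slice: lengths must match
after_step = lst[:]   # [0, 1, 20, 11, 5, 22, 9]

error = None
try:
    lst[2:5] = 100    # right side is not iterable
except TypeError as exc:
    error = exc

lst[2:5] = [100]      # a one-item iterable is fine
print(lst)            # [0, 1, 100, 22, 9]
```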
Using + and * with Sequences
Both + and * always create a new object, and never change their operands.
Beware of num * sequence when the sequence contains mutable items.
Building Lists of Lists
A common task is initializing a list with a certain number of nested lists; a listcomp is the safe way to do it.
When using * instead, note that it creates several references to the same object.
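The pitfall can be seen in a minimal tic-tac-toe-style sketch (the board data is illustrative):

```python
# Listcomp: each row is a distinct list
board = [['_'] * 3 for _ in range(3)]
board[1][2] = 'X'

# Outer *: three references to the very same row
weird = [['_'] * 3] * 3
weird[1][2] = 'O'

print(board)  # [['_', '_', '_'], ['_', '_', 'X'], ['_', '_', '_']]
print(weird)  # [['_', '_', 'O'], ['_', '_', 'O'], ['_', '_', 'O']]
```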
Augmented Assignment with Sequences
The method that makes += work is __iadd__ (for "in-place addition").
The method that makes *= work is __imul__ ("in-place multiplication").
If the in-place method is not implemented, Python falls back to __add__ or __mul__ and rebinds the name to the result.
arr *= num --> the object itself is updated in place
tuple *= num --> the name is rebound to a different object (a new tuple)
Repeated concatenation of immutable sequences is inefficient, because instead of just appending new items, the interpreter has to copy the whole target sequence to create a new one with the new items concatenated.
Exception: str
- str instances are allocated in memory with room to spare, so that concatenation does not require copying the whole string every time
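The difference between updating in place and rebinding can be observed by keeping a second reference to the original object. A minimal sketch:

```python
lst = [1, 2, 3]
lst_alias = lst
lst *= 2                           # list implements __imul__: same object
list_same_object = lst is lst_alias

tup = (1, 2, 3)
tup_alias = tup
tup *= 2                           # no __imul__: a new tuple is created
tuple_same_object = tup is tup_alias

print(list_same_object, tuple_same_object)  # True False
```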
A += Assignment Puzzler
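    t = (1, 2, [30, 40])
    t[2] += [50, 60]
Surprisingly, both things happen: a TypeError is raised, and t becomes (1, 2, [30, 40, 50, 60]).
- The TypeError is expected: tuples are immutable, so the item assignment fails.
- But t[2] actually holds a reference to a mutable list, so the list was already extended in place before the assignment was attempted.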
Lessons learned:
- putting mutable items in tuples is not a good idea
- augmented assignment is not an atomic operation – we just saw it throwing an exception after doing part of its job
- inspecting Python bytecode is not too difficult, and is often helpful to see what is going on under the hood.
inspect Python bytecode using dis.dis
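The puzzler can be reproduced safely by catching the exception, and dis.dis accepts a source string for inspection. A minimal sketch; the exact bytecode listing varies between Python versions, so it is captured rather than shown.

```python
import dis
import io

t = (1, 2, [30, 40])
error_message = None
try:
    t[2] += [50, 60]   # the list is extended, then the item assignment fails
except TypeError as exc:
    error_message = str(exc)

print(t)  # (1, 2, [30, 40, 50, 60])

# Disassemble the generic form of the statement into a string
buffer = io.StringIO()
dis.dis('s[a] += b', file=buffer)
bytecode_listing = buffer.getvalue()
print(bytecode_listing)
```

In the listing, the in-place add appears before the instruction that stores the result back into s[a], which is the step that fails for a tuple.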
list.sort and the sorted Built-In Function
The list.sort method sorts a list in place:
- without making a copy
- returns None to remind us that it changes the target object
The built-in function sorted:
- creates a new list and returns it
- accepts any iterable object as an argument, including immutable sequences and generators
Both take 2 optional, keyword-only arguments:
- reverse
  - True – descending order
  - False – ascending order (the default)
- key
  - a one-argument function that will be applied to each item to produce its sorting key
Once your sequences are sorted, they can be very efficiently searched, e.g. with the standard binary search algorithm provided by the bisect module.
The bisect.insort function makes sure your sorted sequence stays sorted.
Python API convention:
- functions or methods that change an object in place should return None to make it clear to the caller that the object itself was changed and no new object was created.
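A short sketch contrasting sorted with list.sort, using a sample list of fruit names:

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

# sorted returns a new list; the original is untouched
by_length_desc = sorted(fruits, key=len, reverse=True)
unchanged = fruits[:]

# sorted accepts any iterable, e.g. a generator
lengths = sorted(len(f) for f in fruits)

# list.sort works in place and returns None
sort_result = fruits.sort()

print(by_length_desc)  # ['raspberry', 'banana', 'grape', 'apple']
print(sort_result)     # None
print(fruits)          # ['apple', 'banana', 'grape', 'raspberry']
```

The sort is stable, so 'grape' stays ahead of 'apple' (both length 5) even with reverse=True.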
Managing Ordered Sequences with bisect
The bisect module offers 2 main functions:
- bisect
- insort
Searching with bisect
bisect(haystack, needle) does a binary search for needle in haystack, which must be a sorted sequence, to locate the position where needle can be inserted while maintaining haystack in ascending order.
The behavior of bisect can be fine-tuned in two ways:
- a pair of optional arguments, lo and hi, allows narrowing the region in the sequence to be searched when inserting. lo defaults to 0 and hi to the len() of the sequence.
- bisect is actually an alias for bisect_right (there is also bisect_left). The difference: bisect_right returns an insertion point after the existing item, while bisect_left returns the position of the existing item.
Surprisingly, it ran slower than my hand-written binary search... but then timings in Python are never very reliable.
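The difference between bisect_right and bisect_left, and the effect of lo, in a minimal sketch with a sample sorted list:

```python
import bisect

haystack = [1, 4, 5, 5, 8, 12]   # must already be sorted

right = bisect.bisect(haystack, 5)       # after the existing 5s
left = bisect.bisect_left(haystack, 5)   # at the first existing 5
absent = bisect.bisect(haystack, 6)      # where 6 would go

# lo narrows the search to haystack[4:]
narrowed = bisect.bisect(haystack, 5, lo=4)

print(left, right, absent, narrowed)  # 2 4 4 4
```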
Inserting with bisect.insort
bisect.insort(seq, item) inserts item into seq so as to keep seq in ascending order.
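A minimal sketch of keeping a list sorted while items arrive in arbitrary order (the incoming values are sample data):

```python
import bisect

scores = []
for s in [30, 10, 20, 10, 40]:
    bisect.insort(scores, s)   # insert at the position that keeps order

print(scores)  # [10, 10, 20, 30, 40]
```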
When a List is Not the Answer
Arrays
If the list will only contain numbers, an array.array is more efficient than a list: it supports all mutable sequence operations (pop, insert, extend) plus methods for fast loading and saving (.frombytes, .tofile).
Memory Views
The built-in memoryview class is a shared-memory sequence type that lets you handle slices of arrays without copying bytes.
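Both ideas in one sketch: an array round-tripped through its raw bytes, and a memoryview slice that writes through to the underlying buffer. The example uses a bytearray for the view so the result does not depend on byte order; the data is illustrative.

```python
from array import array

# array: compact storage plus fast binary dump and load
floats = array('d', [0.5, 1.5, 2.5])
raw = floats.tobytes()
restored = array('d')
restored.frombytes(raw)

# memoryview: slicing shares memory instead of copying
data = bytearray(b'hello world')
view = memoryview(data)
tail = view[6:]        # no bytes copied
tail[0] = ord('W')     # writes through to data

print(restored == floats)  # True
print(data)                # bytearray(b'hello World')
```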
Numpy and SciPy
Deques and Other Queues
- deque: double-ended queue
- queue: First-In-First-Out (FIFO)
- stack: First-In-Last-Out (FILO)
- heapq: priority queue backed by a heap (Python's built-in implementation is a min-heap; to get a max-heap, store -1 * val)
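The structures listed above can be sketched with collections.deque, a plain list used as a stack, and heapq. Sample data only.

```python
from collections import deque
import heapq

# Bounded deque: when full, appending on one end drops from the other
dq = deque(range(5), maxlen=5)
dq.rotate(2)        # deque([3, 4, 0, 1, 2])
dq.appendleft(-1)   # deque([-1, 3, 4, 0, 1])
dq.append(9)        # deque([3, 4, 0, 1, 9])

# FIFO queue vs LIFO stack
fifo = deque([1, 2, 3])
first_out = fifo.popleft()   # 1
stack = [1, 2, 3]
last_in = stack.pop()        # 3

# heapq is a min-heap; negate the values to simulate a max-heap
values = [5, 1, 8, 3]
min_heap = values[:]
heapq.heapify(min_heap)
smallest = heapq.heappop(min_heap)    # 1
max_heap = [-v for v in values]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap)    # 8

print(list(dq), first_out, last_in, smallest, largest)
```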
Other standard library packages implement queues:
- multiprocessing
  - implements its own bounded Queue, designed for interprocess communication
  - multiprocessing.JoinableQueue is also available for easier task management
- asyncio
  - provides Queue, LifoQueue, PriorityQueue and JoinableQueue (in recent Python versions, join() and task_done() are part of asyncio.Queue itself)
  - adapted for managing tasks in asynchronous programming