I. Sequence Types
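Python provides five built-in sequence types: bytearray, bytes, list, str, and tuple. The first two come up in Chapter 7 on file handling. Other sequence types are provided by the standard library, for example collections.namedtuple. This section covers tuples, named tuples, and lists.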
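1. Tuples
Like strings, tuples are immutable. To change a tuple's contents, convert it to a list with list(), modify the list, and convert it back with tuple() if a tuple is needed. Called with no arguments, tuple() returns an empty tuple.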
——Shallow and deep copying
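t.count(x) returns the number of times object x occurs in tuple t.
t.index(x) returns the index position of the first occurrence of object x in tuple t (or raises a ValueError exception if x is not found).
Example: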
>>> hair = "black", "brown", "blonde", "red"
>>> hair[:2], "gray", hair[2:]
(('black', 'brown'), 'gray', ('blonde', 'red'))
>>> hair[:2] + ("gray",) + hair[2:]  # concatenate tuples: returns a single tuple holding all the items
('black', 'brown', 'gray', 'blonde', 'red')
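The count() and index() methods and the list() round trip described above can be sketched as follows. The extra "brown" item and the variable names are added here purely for illustration.

```python
# A tuple with a repeated item, so count() has something to find.
hair = ("black", "brown", "blonde", "red", "brown")

brown_count = hair.count("brown")   # how many times "brown" occurs
first_brown = hair.index("brown")   # index of the leftmost "brown"

# index() raises ValueError when the item is absent.
try:
    hair.index("green")
    found_green = True
except ValueError:
    found_green = False

# Tuples cannot be changed in place: go through a list and back.
editable = list(hair)
editable[2] = "gray"
hair = tuple(editable)

empty = tuple()                     # no arguments: an empty tuple
```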
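Example (this is the book's coding style: parentheses are omitted around a tuple that appears on the left of a binary operator or on the right of a unary statement):
a, b = (1, 2)                          # left of binary operator
del a, b                               # right of unary statement
def f(x):
    return x, x ** 2                   # right of unary statement
for x, y in ((1, 1), (2, 4), (3, 9)):  # left of binary operator
    print(x, y)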
Example (nested tuples):
>>> things = (1, -7.5, ("pea", (5, "Xyz"), "queue"))
>>> things[2][1][1][2]
'z'
Items in a nested tuple can be of any type. Deep nesting quickly becomes confusing, so one remedy is to give the index positions names:
>>> MANUFACTURER, MODEL, SEATING = (0, 1, 2)
>>> MINIMUM, MAXIMUM = (0, 1)
>>> aircraft = ("Airbus", "A320-200", (100, 220))
>>> aircraft[SEATING][MAXIMUM]
220
2. Named Tuples
Ordinary Python objects (instances of custom classes) can be used in place of named tuples.
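As a minimal sketch of what collections.namedtuple offers, the aircraft data from the tuple example can be given field names instead of index constants. The type names Aircraft and Seating and their field names are choices made for this illustration.

```python
import collections

# Each call creates a new tuple subclass with named fields.
Aircraft = collections.namedtuple("Aircraft", "manufacturer model seating")
Seating = collections.namedtuple("Seating", "minimum maximum")

aircraft = Aircraft("Airbus", "A320-200", Seating(100, 220))

max_seats = aircraft.seating.maximum   # access by field name
by_index = aircraft[2][1]              # plain tuple indexing still works
```

Because a named tuple is still a tuple, it stays immutable and can be unpacked or indexed like any other tuple.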
3. Lists
Unlike strings and tuples, lists are mutable: items can be inserted, replaced, and deleted. Like tuples, lists can be nested, iterated over, and sliced.
Table 3.1. List Methods
Syntax | Description |
L.append(x) | Appends item x to the end of list L |
L.count(x) | Returns the number of times item x occurs in list L |
L.extend(m) L += m | Appends all of iterable m's items to the end of list L; the operator += does the same thing |
L.index(x, start, end) | Returns the index position of the leftmost occurrence of item x in list L (or in the start:end slice of L); otherwise, raises a ValueError exception |
L.insert(i, x) | Inserts item x into list L at index position int i |
L.pop() | Returns and removes the rightmost item of list L |
L.pop(i) | Returns and removes the item at index position int i in L |
L.remove(x) | Removes the leftmost occurrence of item x from list L, or raises a ValueError exception if x is not found |
L.reverse() | Reverses list L in-place |
L.sort(...) | Sorts list L in-place; this method accepts the same key and reverse optional arguments as the built-in sorted() |
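The methods in Table 3.1 can be tried out in sequence as in the sketch below. The starting list is arbitrary, and the comments show the state of L after each call.

```python
L = [3, 1, 2]
L.append(1)              # [3, 1, 2, 1]
L.extend((5, 4))         # [3, 1, 2, 1, 5, 4]; any iterable works
L.insert(0, 9)           # [9, 3, 1, 2, 1, 5, 4]

ones = L.count(1)        # number of 1s in L
where = L.index(1)       # index of the leftmost 1

last = L.pop()           # removes and returns the rightmost item
second = L.pop(1)        # removes and returns the item at index 1
L.remove(1)              # removes the leftmost 1 -> [9, 2, 1, 5]

L.reverse()              # in place -> [5, 1, 2, 9]
L.sort()                 # in place -> [1, 2, 5, 9]
ascending = list(L)
L.sort(reverse=True)     # in place -> [9, 5, 2, 1]
```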
——unpacking operator
>>> first, *rest = [9, 2, -4, 8, 7]
>>> first, rest
(9, [2, -4, 8, 7])
>>> first, *mid, last = "Charles Philip Arthur George Windsor".split()
>>> first, mid, last
('Charles', ['Philip', 'Arthur', 'George'], 'Windsor')
>>> *directories, executable = "/usr/local/bin/gvim".split("/")
>>> directories, executable
(['', 'usr', 'local', 'bin'], 'gvim')
——Adding items
Given woods = ["Cedar", "Yew", "Fir"], the two operations in the table below have the same effect:
woods += ["Kauri", "Larch"] | woods.extend(["Kauri", "Larch"]) |
Either way, woods is now ['Cedar', 'Yew', 'Fir', 'Kauri', 'Larch'].
——Modifying items
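The notes leave this heading empty. As a hedged sketch of the two usual ways to replace list items (index assignment and slice assignment), using replacement values made up for illustration:

```python
woods = ["Cedar", "Yew", "Fir", "Kauri", "Larch"]

woods[1] = "Oak"                       # replace one item by index
# -> ['Cedar', 'Oak', 'Fir', 'Kauri', 'Larch']

woods[2:4] = ["Ash", "Elm", "Teak"]    # replace a slice; lengths may differ
# -> ['Cedar', 'Oak', 'Ash', 'Elm', 'Teak', 'Larch']
```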
——Deleting items
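This heading is also empty in the notes. A minimal sketch of common ways to delete list items (the del statement, the remove() and pop() methods from Table 3.1, and assigning an empty list to a slice) might look like this:

```python
woods = ["Cedar", "Yew", "Fir", "Kauri", "Larch"]

del woods[0]              # delete by index  -> ['Yew', 'Fir', 'Kauri', 'Larch']
woods.remove("Kauri")     # delete by value  -> ['Yew', 'Fir', 'Larch']
popped = woods.pop()      # delete and return the last item -> ['Yew', 'Fir']
woods[0:1] = []           # delete a slice   -> ['Fir']
```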
4. List Comprehensions
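The notes give no body for this heading. A list comprehension has the forms [expression for item in iterable] and [expression for item in iterable if condition]. The sketch below, with an arbitrary range of years chosen for illustration, builds the same list with an explicit loop and with a comprehension.

```python
# Explicit loop: collect the leap years from 1900 to 1939.
leaps_loop = []
for year in range(1900, 1940):
    if (year % 4 == 0 and year % 100 != 0) or year % 400 == 0:
        leaps_loop.append(year)

# The same result as a list comprehension with a condition.
leaps = [year for year in range(1900, 1940)
         if (year % 4 == 0 and year % 100 != 0) or year % 400 == 0]
```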
***
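II. Set Types
Set types support the membership operator in and the len() function, and are iterable. They also provide the set.isdisjoint() method, comparisons, and the bitwise operators (which, for sets, compute unions, intersections, and so on). Python provides two built-in set types: the mutable set and the immutable frozenset.
Only hashable objects can be added to a set. A hashable object has a __hash__() special method whose return value never changes during the object's lifetime, and it can be compared for equality using the __eq__() special method.
The built-in immutable data types float, frozenset, int, str, and tuple are hashable, so they can be added to sets (a tuple is hashable only if all of its items are). The built-in mutable data types dict, list, and set are not hashable, so they cannot be added to sets.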
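The hashability rule can be checked directly: adding an unhashable object to a set raises TypeError. The sketch below uses made-up values and records the type names of the candidates that were rejected.

```python
basket = set()
basket.add(("x", 11))              # tuple of hashable items: accepted
basket.add(frozenset({8, 4, 7}))   # frozenset: accepted

rejected = []
# A list, a dict, a set, and a tuple that contains a list.
for candidate in ([1, 2], {"k": 1}, {1, 2}, ("x", [11])):
    try:
        basket.add(candidate)
    except TypeError:              # raised for unhashable objects
        rejected.append(type(candidate).__name__)
```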
——Sets
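A set is mutable: items can be added and removed. However, a set is unordered, so its items cannot be accessed by index position.
S = {7, "veil", 0, -29, ("x", 11), "sun", frozenset({8, 4, 7}), 913}. Note the braces; an empty set must be created with set(), because {} creates an empty dict.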
Table 3.2. Set Methods and Operators
Syntax | Description |
s.add(x) | Adds item x to set s if it is not already in s |
s.clear() | Removes all the items from set s |
s.copy() | Returns a shallow copy of set s |
s.difference(t) s - t | Returns a new set that has every item that is in set s that is not in set t |
s.difference_update(t) s -= t | Removes every item that is in set t from set s |
s.discard(x) | Removes item x from set s if it is in s; see also set.remove() |
s.intersection(t) s & t | Returns a new set that has each item that is in both set s and set t |
s.intersection_update(t) s &= t | Makes set s contain the intersection of itself and set t |
s.isdisjoint(t) | Returns True if sets s and t have no items in common |
s.issubset(t) s <= t | Returns True if set s is equal to or a subset of set t; use s < t to test whether s is a proper subset of t |
s.issuperset(t) s >= t | Returns True if set s is equal to or a superset of set t; use s > t to test whether s is a proper superset of t |
s.pop() | Returns and removes an arbitrary item from set s, or raises a KeyError exception if s is empty |
s.remove(x) | Removes item x from set s, or raises a KeyError exception if x is not in s; see also set.discard() |
s.symmetric_difference(t) s ^ t | Returns a new set that has every item that is in set s and every item that is in set t, but excluding items that are in both sets |
s.symmetric_difference_update(t) s ^= t | Makes set s contain the symmetric difference of itself and set t |
s.union(t) s | t | Returns a new set that has all the items in set s and all the items in set t that are not in set s |
s.update(t) s |= t | Adds every item in set t that is not in set s, to set s |
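Note: the methods that do not modify the set (copy(), difference(), intersection(), isdisjoint(), issubset(), issuperset(), symmetric_difference(), and union()) and their operators can also be used with frozensets.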
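A short sketch of the main operators from Table 3.2, using two small sets of characters chosen for illustration:

```python
s = set("pecan")                 # {'p', 'e', 'c', 'a', 'n'}
t = set("pie")                   # {'p', 'i', 'e'}

union = s | t                    # items in either set
intersection = s & t             # items in both sets
difference = s - t               # items in s but not in t
symmetric = s ^ t                # items in exactly one of the sets

# The non-mutating operators also work with frozensets; the result
# takes the type of the left-hand operand.
frozen = frozenset(s) & t

# The in-place versions change the set itself.
s |= {"z"}
s.discard("q")                   # no error, even though "q" is absent
```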
One common use of sets is fast membership testing:
if len(sys.argv) == 1 or sys.argv[1] in {"-h", "--help"}:
Another common use is making sure that duplicate data is not processed:
for ip in set(ips):
    process_ip(ip)
A third common use is removing unwanted items:
filenames = set(filenames)
for makefile in {"MAKEFILE", "Makefile", "makefile"}:
    filenames.discard(makefile)
An equivalent statement: filenames = set(filenames) - {"MAKEFILE", "Makefile", "makefile"}
——Set Comprehensions
{expression for item in iterable}
{expression for item in iterable if condition}
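As an illustration of the second form, the sketch below collects the lowercased names of HTML files from a hypothetical list of filenames. Because the result is a set, names that differ only in case collapse into one item.

```python
# Hypothetical input; two entries differ only by letter case.
files = ["Index.HTML", "readme.txt", "about.htm", "INDEX.html"]

html = {name.lower() for name in files
        if name.lower().endswith((".htm", ".html"))}
```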
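III. Mapping Types
Python provides two mapping types: the built-in dict and the standard library's collections.defaultdict. Only hashable objects can be used as dictionary keys, so immutable data types such as float, frozenset, int, str, and tuple can be keys, but mutable types such as dict, list, and set cannot.
Dictionaries
Examples of the syntax for creating dictionaries: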
d1 = dict({"id": 1948, "name": "Washer", "size": 3})
d2 = dict(id=1948, name="Washer", size=3)
d3 = dict([("id", 1948), ("name", "Washer"), ("size", 3)])
d4 = dict(zip(("id", "name", "size"), (1948, "Washer", 3)))
d5 = {"id": 1948, "name": "Washer", "size": 3}
Table 3.3. Dictionary Methods
Syntax | Description |
d.clear() | Removes all items from dict d |
d.copy() | Returns a shallow copy of dict d |
d.fromkeys(s, v) | Returns a dict whose keys are the items in sequence s and whose values are None or v if v is given |
d.get(k) | Returns key k's associated value, or None if k isn't in dict d |
d.get(k, v) | Returns key k's associated value, or v if k isn't in dict d |
d.items() | Returns a view[*] of all the (key, value) pairs in dict d |
d.keys() | Returns a view[*] of all the keys in dict d |
d.pop(k) | Returns key k's associated value and removes the item whose key is k, or raises a KeyError exception if k isn't in d |
d.pop(k, v) | Returns key k's associated value and removes the item whose key is k, or returns v if k isn't in dict d |
d.popitem() | Returns and removes an arbitrary (key, value) pair from dict d, or raises a KeyError exception if d is empty |
d.setdefault(k, v) | The same as the dict.get() method, except that if the key is not in dict d, a new item is inserted with the key k, and with a value of None or of v if v is given |
d.update(a) | Adds every (key, value) pair from a that isn't in dict d to d, and for every key that is in both d and a, replaces the corresponding value in d with the one in a—a can be a dictionary, an iterable of (key, value) pairs, or keyword arguments |
d.values() | Returns a view[*] of all the values in dict d |
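Several of the methods in Table 3.3 differ only in how they treat a missing key. The sketch below exercises them on the "Washer" dictionary from the creation examples; the extra keys ("color", "stock") are invented for illustration.

```python
d = {"id": 1948, "name": "Washer", "size": 3}

name = d.get("name")                 # existing key: its value
color = d.get("color")               # missing key: None, no exception
color_or_default = d.get("color", "n/a")

d.setdefault("color", "steel")       # missing key: inserted with this value
d.setdefault("name", "Bolt")         # existing key: left unchanged

size = d.pop("size")                 # removes the item and returns 3
size_again = d.pop("size", None)     # already gone: returns the default

d.update(id=1949, stock=12)          # replaces "id", adds "stock"

blank = dict.fromkeys(("a", "b"), 0) # new dict with every value set to 0
```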
Iterating over a dictionary:
for item in d.items():
    print(item[0], item[1])
for key, value in d.items():
    print(key, value)
Dictionary Comprehensions
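The notes leave this heading empty. A dictionary comprehension follows the same pattern as a set comprehension but produces key: value pairs, in the form {keyexpression: valueexpression for item in iterable if condition}. A minimal sketch with made-up data:

```python
d = {"id": 1948, "name": "Washer", "size": 3}

# Swap keys and values (this works only because every value is hashable).
inverted = {value: key for key, value in d.items()}

# Keep only the items whose value is an int.
numeric = {key: value for key, value in d.items() if isinstance(value, int)}
```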
Default Dictionaries
Default dictionaries have the same operators and methods as plain dictionaries. The only difference is how they handle missing keys. Compare the two snippets:
When words is a plain dictionary:
words[word] = words.get(word, 0) + 1
When words is a default dictionary:
words = collections.defaultdict(int)
words[word] += 1
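The two snippets above can be run side by side on a sample sentence (made up for illustration) to confirm that they count words identically. The last lines show the one behavioral difference: looking up a missing key in a default dictionary inserts it.

```python
import collections

text = "the quick fox and the lazy dog and the cat"

# Plain dict: supply the default explicitly with get().
plain = {}
for word in text.split():
    plain[word] = plain.get(word, 0) + 1

# defaultdict(int): a missing key is created with the value int(), i.e. 0.
counts = collections.defaultdict(int)
for word in text.split():
    counts[word] += 1

same = plain == counts               # compares equal to the plain dict

missing_plain = plain.get("zebra")   # None; "zebra" is NOT added
missing_default = counts["zebra"]    # 0; "zebra" IS added as a side effect
```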
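IV. Iterating and Copying Collections
——Iterators and Iterable Operations and Functions
An iterable data type is one that has an __iter__() method, which returns an iterator.
An iterator is an object that provides a __next__() method. Each call returns the next item, and a StopIteration exception is raised when there are no more items.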
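To make the two protocols concrete, here is a minimal sketch of a custom iterable and its iterator. The class names Countdown and CountdownIterator are hypothetical, invented for this illustration.

```python
class CountdownIterator:
    """Iterator: provides __next__() and raises StopIteration at the end."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self                 # an iterator is also iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration     # signals that the items are exhausted
        self.current -= 1
        return self.current + 1


class Countdown:
    """Iterable: provides __iter__(), which returns a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


countdown = Countdown(3)
first_pass = list(countdown)        # uses iter() and next() internally
second_pass = list(countdown)       # a new iterator each time
```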
Table 3.4. Common Iterable Operators and Functions
Syntax | Description |
s + t | Returns a sequence that is the concatenation of sequences s and t |
s * n | Returns a sequence that is int n concatenations of sequence s |
x in i | Returns True if item x is in iterable i; use not in to reverse the test |
all(i) | Returns True if every item in iterable i evaluates to True |
any(i) | Returns True if any item in iterable i evaluates to True |
enumerate(i, start) | Normally used in for ... in loops to provide a sequence of (index, item) tuples with indexes starting at 0 or start; see text |
len(x) | Returns the "length" of x. If x is a collection it is the number of items; if x is a string it is the number of characters. |
max(i, key) | Returns the biggest item in iterable i or the item with the biggest key(item) value if a key function is given |
min(i, key) | Returns the smallest item in iterable i or the item with the smallest key(item) value if a key function is given |
range(start, stop, step) | Returns an integer iterator. With one argument (stop), the iterator goes from 0 to stop - 1; with two arguments (start, stop) the iterator goes from start to stop - 1; with three arguments it goes from start to stop - 1 in steps of step. |
reversed(i) | Returns an iterator that returns the items from iterator i in reverse order |
sorted(i, key, reverse) | Returns a list of the items from iterable i in sorted order; key is used to provide DSU (Decorate, Sort, Undecorate) sorting. If reverse is True the sorting is done in reverse order. |
sum(i, start) | Returns the sum of the items in iterable i plus start (which defaults to 0); i may not contain strings |
zip(i1, ..., iN) | Returns an iterator of tuples using the iterators i1 to iN; see text |
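A few of the functions in Table 3.4 in action, using small made-up inputs:

```python
names = ["pea", "queue", "Xyz"]

everything = all([1, "a", [0]])     # every item is truthy (a non-empty list counts)
anything = any([0, "", []])         # no item is truthy

longest = max(names, key=len)       # compared by len(item), not by the item itself
shortest = min(names, key=len)      # ties go to the first item encountered

total = sum([1, 2, 3], 10)          # 10 is the start value

numbered = list(enumerate(names, start=1))
pairs = list(zip("abc", range(3)))  # stops at the shortest input
doubled = names[:2] * 2             # repetition with *
```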
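When a for item in iterable loop runs, Python in effect calls iter(iterable) to get an iterator, then calls next() on that iterator at each step until StopIteration is raised. The two snippets below are equivalent.
With a for loop:
product = 1
for i in [1, 2, 4, 8]:
    product *= i
print(product)  # prints: 64
With an explicit iterator:
product = 1
i = iter([1, 2, 4, 8])
while True:
    try:
        product *= next(i)
    except StopIteration:
        break
print(product)  # prints: 64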
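——Using enumerate()
enumerate() takes an iterator (or any iterable) and returns an enumerate object, which is itself an iterator. On each iteration it returns a 2-tuple: the first item is the iteration number (starting from 0 by default) and the second is the next item from the iterator enumerate() was called on.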
import sys

if len(sys.argv) < 3:
    print("usage: grepword.py word infile1 [infile2 [... infileN]]")
    sys.exit()

word = sys.argv[1]
for filename in sys.argv[2:]:
    # start=1 so that reported line numbers begin at 1, not 0
    for lino, line in enumerate(open(filename), start=1):
        if word in line:
            print("{0}:{1}:{2:.40}".format(filename, lino, line.rstrip()))
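An iterable can be unpacked with the * operator, and that includes a range. For example, if calculate() is a function that takes four arguments, the following calls are equivalent: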
calculate(1, 2, 3, 4)
t = (1, 2, 3, 4)
calculate(*t)
calculate(*range(1, 5))
——sorted() and reversed()
Two more functions related to iteration: sorted() returns a sorted copy (a new list), and reversed() returns a reverse iterator.
>>> list(range(6))
[0, 1, 2, 3, 4, 5]
>>> list(reversed(range(6)))
[5, 4, 3, 2, 1, 0]
sorted() is the more versatile of the two. Some examples:
>>> x = []
>>> for t in zip(range(-10, 0, 1), range(0, 10, 2), range(1, 10, 2)):
...     x += t
>>> x
[-10, 0, 1, -9, 2, 3, -8, 4, 5, -7, 6, 7, -6, 8, 9]
>>> sorted(x)
[-10, -9, -8, -7, -6, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> sorted(x, reverse=True)
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0, -6, -7, -8, -9, -10]
>>> sorted(x, key=abs)
[0, 1, 2, 3, 4, 5, 6, -6, -7, 7, -8, 8, -9, 9, -10]
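The following two snippets are functionally equivalent.
Using a key function:
x = sorted(x, key=str.lower)
Doing the decorate, sort, undecorate steps by hand:
temp = []
for item in x:
    temp.append((item.lower(), item))
x = []
for key, value in sorted(temp):
    x.append(value)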
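Python's sort algorithm is an adaptive, stable mergesort. Items are compared using the < operator. Python can also sort collections that contain collections: the nested items are compared recursively.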
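Two consequences of these properties are easy to demonstrate with made-up data: a stable sort keeps items with equal keys in their original relative order, and nested lists are compared item by item. Because sorting relies on <, mixing types that cannot be compared raises TypeError.

```python
records = [("pear", 3), ("apple", 3), ("fig", 1)]

# Stable: "pear" stays ahead of "apple" because both have the key 3.
by_count = sorted(records, key=lambda record: record[1])

# Nested lists are compared item by item, left to right.
nested = sorted([[2, 1], [1, 5], [1, 2]])

# int and str do not support < with each other in Python 3.
try:
    sorted([1, "a"])
    mixed_ok = True
except TypeError:
    mixed_ok = False
```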
——Copying Collections
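Assignment does not copy anything. The = operator only binds another name to the same object, so a change made through one name is visible through the other:
>>> songs = ["Because", "Boys", "Carol"]
>>> beatles = songs
>>> beatles, songs
(['Because', 'Boys', 'Carol'], ['Because', 'Boys', 'Carol'])
>>> beatles[2] = "Cayenne"
>>> beatles, songs
(['Because', 'Boys', 'Cayenne'], ['Because', 'Boys', 'Cayenne'])
Shallow copy: slicing the whole list creates a new top-level list, but nested objects are still shared between the original and the copy:
>>> x = [53, 68, ["A", "B", "C"]]
>>> y = x[:]  # shallow copy
>>> x, y
([53, 68, ['A', 'B', 'C']], [53, 68, ['A', 'B', 'C']])
>>> y[1] = 40
>>> x[2][0] = 'Q'
>>> x, y
([53, 68, ['Q', 'B', 'C']], [53, 40, ['Q', 'B', 'C']])
Deep copy: by contrast, copy.deepcopy() copies the nested objects as well:
>>> import copy
>>> x = [53, 68, ["A", "B", "C"]]
>>> y = copy.deepcopy(x)
>>> y[1] = 40
>>> x[2][0] = 'Q'
>>> x, y
([53, 68, ['Q', 'B', 'C']], [53, 40, ['A', 'B', 'C']])
More on shallow copying: dictionaries and sets have the dict.copy() and set.copy() methods, and the copy module's copy() function also returns a shallow copy of the object it is given. Another way to copy a built-in collection is to pass it to the function that has the same name as its type:
copy_of_dict_d = dict(d)
copy_of_list_L = list(L)
copy_of_set_s = set(s)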
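The shallow-copy routes listed above all behave the same way for nested data. The sketch below, using a made-up dictionary, shows that every shallow copy shares the nested list with the original, while a deep copy does not.

```python
import copy

d = {"ids": [1, 2]}

shallow_copies = [d.copy(), dict(d), copy.copy(d)]
deep = copy.deepcopy(d)

# Mutate the nested list through the original dictionary.
d["ids"].append(3)

shared = [c["ids"] for c in shallow_copies]   # each one sees the change
independent = deep["ids"]                     # unaffected

# Rebinding a key in the original does not touch any of the copies.
d["ids"] = []
```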