1. Tuple: an immutable list; created with commas (the surrounding parentheses are optional)
e.g. tuple = 1, 'a', 3.2, True or tuple = (1, 'a', 3.2, True)
2. A tuple with only one element, correct and incorrect forms:
Correct: a = 1, or a = (1,)
Incorrect: a = (1) (this is just the integer 1, not a tuple)
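A quick way to check the rule above, as a minimal sketch: it is the trailing comma, not the parentheses, that makes a tuple. The variable names are illustrative only.

```python
# Parentheses alone only group an expression; the comma creates the tuple.
not_a_tuple = (1)
one_tuple = (1,)
also_one_tuple = 1,

print(type(not_a_tuple) is int)      # True
print(type(one_tuple) is tuple)      # True
print(one_tuple == also_one_tuple)   # True
```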
3. Why tuples are needed: they guarantee that the contents of the list cannot be modified
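The guarantee can be seen directly. The sketch below uses made-up names (point, distances) and also shows one practical consequence: a tuple of hashable elements can be used as a dictionary key.

```python
point = (3, 4)

try:
    point[0] = 10              # tuples do not support item assignment
except TypeError:
    assignment_failed = True
else:
    assignment_failed = False

# A tuple whose elements are all hashable can serve as a dict key.
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(assignment_failed)       # True
print(distances[point])        # 5.0
```

Note that the guarantee covers the tuple itself: a mutable object stored inside a tuple (for example a list) can still be changed through its own methods.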
4. Tuple assignment:
Swap two values: a, b = b, a
5. Splitting an email address:
name, domain = 'pp@qq.com'.split('@')
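Tuple assignment requires the number of names on the left to match the number of values on the right. The sketch below illustrates this; split_email is a hypothetical helper, not part of the notes, and it uses rsplit with a limit of one split so that an address containing '@' always yields exactly two parts.

```python
def split_email(address):
    # Split once, from the right; raises ValueError if there is no '@'.
    name, domain = address.rsplit('@', 1)
    return name, domain

name, domain = split_email('pp@qq.com')
print(name)      # pp
print(domain)    # qq.com

try:
    x, y = 'a@b@c'.split('@')    # three values, only two names
except ValueError:
    unpack_failed = True
else:
    unpack_failed = False
print(unpack_failed)             # True
```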
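6. Functions and tuples: return the maximum and minimum of a list at the same time:
def max_min(lst):
    # start from the first element (assumes lst is not empty)
    largest = smallest = lst[0]
    for i in lst:
        if i > largest:
            largest = i
        if i < smallest:
            smallest = i
    # the two values are packed into one tuple
    return largest, smallest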
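For comparison, the same pair can be obtained with the built-in max() and min(). This is only a sketch; max_min_builtin is a name chosen here for illustration. The caller unpacks the returned tuple with tuple assignment.

```python
def max_min_builtin(values):
    # Returns a (largest, smallest) tuple; raises ValueError on an empty list.
    return max(values), min(values)

largest, smallest = max_min_builtin([3, 9, -2, 7])
print(largest)     # 9
print(smallest)    # -2
```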
7. The Decorate, Sort and Undecorate (DSU) pattern:
def sort_by_length(words):
    # decorate: pair each word with its length
    t = []
    for word in words:
        t.append((len(word), word))
    # sort: longest first; equal lengths fall back to comparing the words
    t.sort(reverse=True)
    # undecorate: drop the lengths again
    res = []
    for length, word in t:
        res.append(word)
    return res

words = ['a', 'abde', 'acfbgi', 'ee']
print sort_by_length(words)
print words
# sort in place by length, using a key function instead of DSU
words.sort(key=len, reverse=True)
print words
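The two approaches differ in how they treat words of equal length. The sketch below, with a made-up word list, shows the difference: DSU compares the words themselves on a tie, while a key-based sort is stable and keeps tied words in their original order.

```python
words = ['bb', 'a', 'ee', 'abde']

# DSU: ties on length are broken by comparing the words themselves.
decorated = [(len(w), w) for w in words]
decorated.sort(reverse=True)
dsu_result = [w for _, w in decorated]

# key= version: the sort is stable, so tied words keep their original order.
key_result = sorted(words, key=len, reverse=True)

print(dsu_result)    # ['abde', 'ee', 'bb', 'a']
print(key_result)    # ['abde', 'bb', 'ee', 'a']
```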
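8. Dictionary (similar to a map); creating a dictionary:
Use {} to create a dictionary
Use : to write key:value pairs: dict = {'anny':88661, 'bob':86541, 'mike':11256}
Keys must be immutable and unique; values can be of any type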
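9. Accessing a dictionary and adding elements:
Use the [] operator with the key as the index
>>> print dict['anny']
88661
Accessing a key that does not exist raises an error (KeyError)
Add a new pair: dict['Tom'] = 56231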
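When a key may be missing, the lookup can either be guarded or given a default. A minimal sketch follows; phone_book is an illustrative name used here to avoid shadowing the built-in dict.

```python
phone_book = {'anny': 88661, 'bob': 86541, 'mike': 11256}

try:
    phone_book['Tom']                    # missing key
except KeyError:
    lookup_failed = True
else:
    lookup_failed = False

number = phone_book.get('Tom')           # None instead of an error
fallback = phone_book.get('Tom', 0)      # explicit default value

phone_book['Tom'] = 56231                # assignment adds the new pair
print(lookup_failed)                     # True
print(phone_book['Tom'])                 # 56231
```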
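10. Dictionary operators and methods:
len(dict): the number of key-value pairs in the dictionary
key in dict: fast test of whether key is one of the dictionary's keys, O(1) on average; equivalent to dict.has_key(key) (Python 2 only; has_key was removed in Python 3)
for key in dict: iterate over the dictionary's keys. Note: the keys are in no particular order
dict.items(): all key-value pairs
dict.keys(): all keys
dict.values(): all values
dict.clear(): empty the dictionary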
>>> dict = {'anny':88661, 'bob':86541, 'mike':11256}
>>> dict
{'bob': 86541, 'mike': 11256, 'anny': 88661}
>>> dict.items()
[('bob', 86541), ('mike', 11256), ('anny', 88661)]
>>> dict.keys()
['bob', 'mike', 'anny']
>>> dict.values()
[86541, 11256, 88661]
>>> key in dict
Traceback (most recent call last):
  File "<pyshell#165>", line 1, in <module>
    key in dict
NameError: name 'key' is not defined
>>> 'bob' in dict
True
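Because the notes treat key order as unspecified, a common approach when a predictable order is needed is to iterate over the sorted keys. A small sketch, with scores as an illustrative name:

```python
scores = {'anny': 88661, 'bob': 86541, 'mike': 11256}

# sorted() on a dict iterates over its keys and returns them as a sorted list.
lines = []
for name in sorted(scores):
    lines.append('%s -> %d' % (name, scores[name]))

# items() yields (key, value) pairs, which sort by key first.
pairs = sorted(scores.items())

print(lines)    # ['anny -> 88661', 'bob -> 86541', 'mike -> 11256']
print(pairs)    # [('anny', 88661), ('bob', 86541), ('mike', 11256)]
```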
11. Counting how many times each character occurs in a string:
count = {}
for i in 'asdfjklasdjklsd':
    if i in count:
        count[i] += 1
    else:
        count[i] = 1
print count
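The if/else test in the counting loop can be folded into dict.get with a default of 0, and the standard library's collections.Counter (available since Python 2.7) does the same counting in one call. A sketch of both, using the same string:

```python
from collections import Counter

text = 'asdfjklasdjklsd'

# get() returns 0 for a character that has not been seen yet.
count = {}
for ch in text:
    count[ch] = count.get(ch, 0) + 1

# Counter is a dict subclass that counts hashable items.
counter = Counter(text)

print(count['s'])              # 3
print(count == dict(counter))  # True
```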
12. Read a file and print the 10 most frequent words:
count = {}
# the with block closes the file automatically
with open('emma.txt') as f:
    for line in f:
        words = line.strip().split()
        for word in words:
            if word in count:
                count[word] += 1
            else:
                count[word] = 1
# decorate each word with its frequency so the list sorts by frequency
word_f = []
for word, freq in count.items():
    word_f.append((freq, word))
word_f.sort(reverse=True)
for freq, word in word_f[:10]:
    print word, freq
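The decorate-and-sort step can also be delegated to collections.Counter.most_common. The sketch below works on an in-memory string rather than emma.txt so that it is self-contained; top_words and sample are names made up for this example. Note that most_common returns (word, count) pairs, the opposite order to the (freq, word) tuples above.

```python
from collections import Counter

def top_words(text, n=10):
    # Split on whitespace and return the n most frequent words with counts.
    return Counter(text.split()).most_common(n)

sample = 'the cat and the dog and the bird'
print(top_words(sample, 2))    # [('the', 3), ('and', 2)]
```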
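13. Inverting a dictionary:
def reverse_dict(d):
    # several keys may share one value, so each value maps to a list of keys
    inverted = {}
    for k, v in d.items():
        if v in inverted:
            inverted[v].append(k)
        else:
            inverted[v] = [k]
    return inverted

d = {'A':28, 'B':30, 'C':28}
print reverse_dict(d)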
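The membership test in the inversion loop can be replaced by dict.setdefault, which inserts an empty list the first time a value is seen and returns the stored list either way. A sketch, with invert_dict as an illustrative name:

```python
def invert_dict(d):
    inverted = {}
    for key, value in d.items():
        # setdefault returns the existing list, or stores and returns a new one.
        inverted.setdefault(value, []).append(key)
    return inverted

result = invert_dict({'A': 28, 'B': 30, 'C': 28})
print(sorted(result[28]))    # ['A', 'C']
print(result[30])            # ['B']
```

The values of the original dictionary become keys of the result, so they must be hashable.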
14. Set:
Create: x = set()
Add and remove: x.add('body'); x.remove('body')
15. Set operators: - (difference); & (intersection); | (union); !=; ==; in; for key in set
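A short sketch of these operators on two small sets (the values are arbitrary). It also shows that remove raises KeyError for a missing element, whereas discard silently does nothing.

```python
a = set([1, 2, 3, 4])
b = set([3, 4, 5])

difference = a - b        # elements in a but not in b
intersection = a & b      # elements in both
union = a | b             # elements in either

a.discard(99)             # missing element: no error
try:
    a.remove(99)          # missing element: KeyError
except KeyError:
    remove_failed = True
else:
    remove_failed = False

print(sorted(difference))      # [1, 2]
print(sorted(intersection))    # [3, 4]
print(sorted(union))           # [1, 2, 3, 4, 5]
```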
16. Forward maximum matching (FMM) word segmentation:
def load_dic(filename):
    # read one word per line and track the length of the longest word
    f = open(filename)
    word_dic = set()
    max_length = 1
    for line in f:
        word = unicode(line.strip(), 'utf-8')
        word_dic.add(word)
        if len(word) > max_length:
            max_length = len(word)
    f.close()
    return max_length, word_dic

def fmm_word_seg(sentence, word_dic, max_length):
    begin = 0
    words = []
    sentence = unicode(sentence, 'utf-8')
    while begin < len(sentence):
        # try the longest candidate first, then shorten it one character at a time
        for end in range(min(begin + max_length, len(sentence)), begin, -1):
            word = sentence[begin:end]
            # a single character is always accepted, so the loop always advances
            if word in word_dic or end == begin + 1:
                words.append(word)
                break
        begin = end
    return words

max_len, word_dic = load_dic('lexicon.dic')
words = fmm_word_seg(raw_input(), word_dic, max_len)
for word in words:
    print word,
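To try the algorithm without a lexicon file, the sketch below runs the same matching loop on a tiny in-memory lexicon. It is written for Python 3, where str is already Unicode, so the unicode() conversions are not needed. The lexicon contents and the name fmm_segment are assumptions made for this example.

```python
def fmm_segment(sentence, lexicon, max_length):
    words = []
    begin = 0
    while begin < len(sentence):
        # Try the longest candidate first, then shrink it one character at a time.
        for end in range(min(begin + max_length, len(sentence)), begin, -1):
            candidate = sentence[begin:end]
            # A single character is always accepted so the loop makes progress.
            if candidate in lexicon or end == begin + 1:
                words.append(candidate)
                break
        begin = end
    return words

lexicon = {'研究', '研究生', '生命', '的', '起源'}
max_length = max(len(w) for w in lexicon)
print(fmm_segment('研究生命的起源', lexicon, max_length))
# ['研究生', '命', '的', '起源']
```

The output illustrates a known weakness of greedy matching: the longest prefix 研究生 is taken first, so 生命 is never considered even though it is in the lexicon.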
17. Data structure comparison:
                  | string | list | tuple | set       | dict      |
Mutable           | N      | Y    | N     | Y         | Y         |
Sequential        | Y      | Y    | Y     | N         | N         |
Sortable          | Y      | Y    | Y     | N         | N         |
Sliceable         | Y      | Y    | Y     | N         | N         |
Index/key type    | int    | int  | int   | immutable | immutable |
Item/value type   | char   | any  | any   | none      | any       |
Search            | Y      | Y    | Y     | Y         | Y         |
Search complexity | O(n)   | O(n) | O(n)  | O(1)      | O(1)      |
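The Mutable row and the "immutable" key requirement are connected: the mutable built-in containers cannot be hashed, so they cannot be set elements or dict keys. The sketch below checks this for one sample of each type; is_hashable is a hypothetical helper written for this example.

```python
def is_hashable(obj):
    # hash() raises TypeError for the mutable built-in containers.
    try:
        hash(obj)
    except TypeError:
        return False
    return True

samples = [
    ('string', 'abc'),
    ('list', [1, 2, 3]),
    ('tuple', (1, 2, 3)),
    ('set', set([1, 2, 3])),
    ('dict', {'a': 1}),
]
hashable = dict((name, is_hashable(value)) for name, value in samples)

print(hashable['tuple'])    # True
print(hashable['list'])     # False
```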