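Using dictionary objects
1. A dictionary stores key-value pairs and is implemented as a hash map. Any immutable (more precisely, hashable) data type can serve as a key, and a value can be of any data type, including user-defined classes.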
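The key and value rules can be tried out with a minimal sketch. The `Point` class and all variable names here are made up for illustration only.

```python
# Hypothetical class, used only to show that values can be custom objects.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Immutable built-ins such as str, int and tuple are hashable, so they work as keys.
lookup = {
    "origin": Point(0, 0),      # str key, custom-class value
    42: [1, 2, 3],              # int key, list value
    (1, 2): {"nested": True},   # tuple key, dict value
}

# A list is mutable and therefore unhashable, so using it as a key fails.
try:
    lookup[[1, 2]] = "not allowed"
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

print(lookup[(1, 2)])
print(type(list_key_error).__name__)
```

The attempt to use a list as a key raises TypeError, while the tuple with the same contents is accepted.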
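Example code 1: count word frequencies with a for loop, using an if check to set the initial value for a new key
# Note: there are two ways to define a string that spans multiple lines:
# 1) End each line with the \ continuation character. This still works in current Python versions, but a backslash is needed on every line and any trailing whitespace after it causes a syntax error, so it is not very tidy;
# 2) Wrap the string in parentheses; adjacent string literals inside the parentheses are joined automatically.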
# 1. Load a sentence into a variable
sentence = ("Peter Pipper Picked a peck of pickled peppers A peck of pickled peppers Peter piper picked If Peter Piper picked a peck of pickled peppers Wheres the peck of pickled peppers Peter Piper picked")
# 2. Initialize a dictionary object
word_dict = {}
# 3. Count the word frequencies
# split() breaks the string on whitespace and returns a list of words to iterate over
for word in sentence.split():
    # Alternatively, word_dict.setdefault(word, 0) sets a default value so that updating a new key does not raise KeyError.
    if word not in word_dict:
        word_dict[word] = 1
    else:
        word_dict[word] += 1
# 4. Print the word frequencies
print(word_dict)
Output:
{'Peter': 4, 'Pipper': 1, 'Picked': 1, 'a': 2, 'peck': 4, 'of': 4, 'pickled': 4, 'peppers': 4, 'A': 1, 'piper': 1, 'picked': 3, 'If': 1, 'Piper': 2, 'Wheres': 1, 'the': 1}
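The comment in example 1 mentions `setdefault` as a way to avoid the if/else branch. Below is a rough sketch of that idea next to the similar `dict.get` idiom; the shortened sentence and the variable names are chosen here for illustration.

```python
sentence = "a peck of pickled peppers a peck"

# Variant 1: dict.get returns the fallback (0) when the key is missing.
counts_get = {}
for word in sentence.split():
    counts_get[word] = counts_get.get(word, 0) + 1

# Variant 2: setdefault inserts the key with 0 only if it is absent,
# so the following += never hits a missing key.
counts_setdefault = {}
for word in sentence.split():
    counts_setdefault.setdefault(word, 0)
    counts_setdefault[word] += 1

print(counts_get)
print(counts_get == counts_setdefault)
```

Both variants build the same dictionary as the explicit if/else version; which one reads better is mostly a matter of taste.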
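Example code 2: count word frequencies with a for loop, initializing the dictionary with the defaultdict class from the collections module and passing the int function as its argument.
# The collections module has included the defaultdict class since Python 2.5.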
from collections import defaultdict
sentence = ("Peter Pipper Picked a peck of pickled peppers A peck of pickled peppers Peter piper picked If Peter Piper picked a peck of pickled peppers Wheres the peck of pickled peppers Peter Piper picked")
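# Initialize the dictionary with defaultdict. The argument int is a function: when the dictionary meets a key it has not seen before, it initializes that key with the return value of int(), which is 0 here.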
word_dict = defaultdict(int)
# split() breaks the string on whitespace and returns a list of words to iterate over
for word in sentence.split():
    word_dict[word] += 1
# keys() iterates over all keys, values() over all values, and items() over all key-value pairs.
for key, value in word_dict.items():
    print(key, ':', value)
Output:
Peter : 4
Pipper : 1
Picked : 1
a : 2
peck : 4
of : 4
pickled : 4
peppers : 4
A : 1
piper : 1
picked : 3
If : 1
Piper : 2
Wheres : 1
the : 1
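To make the role of the factory argument in example 2 more concrete, here is a small sketch. It assumes nothing beyond the standard library; the grouping-by-first-letter task and the variable names are invented for illustration.

```python
from collections import defaultdict

# int() called with no arguments returns 0, which is why defaultdict(int) counts from zero.
print(int())

# Any zero-argument callable works as the factory; list gives an empty list per new key.
words_by_initial = defaultdict(list)
for word in "peck of pickled peppers".split():
    words_by_initial[word[0]].append(word)

print(dict(words_by_initial))

# Caveat: merely reading a missing key also creates it with the default value.
counts = defaultdict(int)
_ = counts["missing"]
print("missing" in counts)
```

The last part is worth keeping in mind: with a defaultdict, a membership test such as `key in counts` is the safe way to check for a key without inserting it.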
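Further reading: before Python 3.7, a standard dictionary did not guarantee that it remembers the order in which keys were added; since Python 3.7, keeping insertion order is part of the language specification for dict. The collections module also provides OrderedDict, a container that remembers the order in which keys were added and offers order-aware extras such as move_to_end() and order-sensitive equality comparison.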
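A short sketch of OrderedDict in use, focusing on two behaviours it has that a plain dict lacks. The keys and counts are borrowed from the earlier examples; the variable names are illustrative.

```python
from collections import OrderedDict

first = OrderedDict([("peck", 4), ("of", 4), ("a", 2)])
second = OrderedDict([("a", 2), ("of", 4), ("peck", 4)])

# Iteration follows insertion order.
print(list(first.keys()))

# OrderedDict equality is order-sensitive; plain dict equality is not.
print(first == second)
print(dict(first) == dict(second))

# move_to_end pushes an existing key to the end (or to the front with last=False).
first.move_to_end("peck")
print(list(first.keys()))
```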
Example code 3: count word frequencies with the Counter class from the collections module; no loop or if check is needed.
from collections import Counter
sentence = ("Peter Pipper Picked a peck of pickled peppers A peck of pickled peppers Peter piper picked If Peter Piper picked a peck of pickled peppers Wheres the peck of pickled peppers Peter Piper picked")
words=sentence.split()
word_count=Counter(words)
print ('Peter',word_count['Peter'])
print (word_count)
Output:
Peter 4
Counter({'Peter': 4, 'peck': 4, 'of': 4, 'pickled': 4, 'peppers': 4, 'picked': 3, 'a': 2, 'Piper': 2, 'Pipper': 1, 'Picked': 1, 'A': 1, 'piper': 1, 'If': 1, 'Wheres': 1, 'the': 1})
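As a possible follow-up to example 3, the sketch below tries a few more Counter features on the same sentence: `most_common`, case folding before counting, and lookup of a missing key. The sentence is written with parentheses and adjacent string literals, as described in the multi-line string note; the names `top_three` and `folded_count` are made up here.

```python
from collections import Counter

# Adjacent string literals inside parentheses are joined into one string.
sentence = ("Peter Pipper Picked a peck of pickled peppers "
            "A peck of pickled peppers Peter piper picked "
            "If Peter Piper picked a peck of pickled peppers "
            "Wheres the peck of pickled peppers Peter Piper picked")

word_count = Counter(sentence.split())

# most_common(n) returns the n highest counts as (word, count) pairs.
top_three = word_count.most_common(3)
print(top_three)

# Lower-casing first merges variants such as "Picked" and "picked".
folded_count = Counter(word.lower() for word in sentence.split())
print(folded_count["picked"], folded_count["piper"])

# Unlike a plain dict, a missing key yields 0 instead of raising KeyError.
print(word_count["Paul"])
```

Whether to fold case depends on the task: the original examples treat "Piper" and "piper" as different words, while the folded count treats them as one.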