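We want to count how often specific words appear in an English text. For example, suppose we have an English TXT document (saved as 1.txt) with the following content: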
A wet Sunday in a country inn! Whoever has had the luck to experience one can alone judge of my situation. The rain pattered against the casements; the bells tolled for church with a melancholy sound. I went to the windows in quest of something to amuse the eye; but it seemed as if I had been placed completely out of the reach of all amusement. The windows of my bed-room looked out among tiled roofs and stacks of chimneys, while those of my sitting-room commanded a full view of the stable yard. I know of nothing more calculated to make a man sick of this world than a stable yard on a rainy day.
Let's count how many times "the" and "of" occur in it:
# Open the text file and read its contents (the with block closes the file afterwards)
with open('1.txt') as f:
    txt = f.read()
# Convert to lowercase so that "The" and "the" are counted together
txt = txt.lower()
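# Replace punctuation and other special characters with spaces
for ch in '!"@#$%^&*()+,-./:;<=>?@[\\]_`~{|}':
    # str.replace returns a new string, so the result must be assigned back
    txt = txt.replace(ch, ' ')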
# Split the text on whitespace into a list of words
txtArr = txt.split()
# Build a dictionary that holds the counts for "the" and "of"
worddict = {'the': 0, 'of': 0}
# Walk through the words and count the ones we are tracking
for i in txtArr:
    if i in worddict:
        worddict[i] = worddict[i] + 1
# Print the result
print(worddict)
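
The same idea can also be written more compactly with the standard library. The following is only a sketch of one possible variant, not the script above: the helper name count_words and the short sample string are made up for illustration, and it extracts words with a regular expression instead of replacing punctuation character by character.

```python
import re
from collections import Counter


def count_words(text, targets):
    """Return {word: count} for every word in targets, ignoring case.

    Hypothetical helper for illustration; not part of the original script.
    """
    # Lowercase the text, then pull out runs of letters (and apostrophes)
    # so that punctuation never sticks to a word.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Counter returns 0 for words it has never seen.
    return {w: counts[w.lower()] for w in targets}


# A shortened sample instead of reading 1.txt
sample = "The rain pattered against the casements; the bells tolled. I know of nothing more."
print(count_words(sample, ['the', 'of']))
```

To run it on the file, read 1.txt into a string first and pass that string in place of sample. Counter counts every word in one pass, so tracking more words only means adding them to the targets list.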