Count the English words in hamlet.txt and output the n most frequent words

嘉陵妹妹

Published 2022-11-01 15:28:50


Article link: https://blog.csdn.net/X131644/article/details/127634705


Count the English words that appear in the file hamlet.txt, then output the n most frequent words. Note:

(1) Words are case-insensitive, so uppercase letters must be converted to lowercase;

(2) Remove the following special symbols from the text: !"#$%&()*+,-./:;<=>?@[\]^_‘{|}~

(3) Output n words and their counts, one word per line;

(4) Output the words in lowercase.
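Requirements (1) and (2) together describe a normalization step. The following is a minimal sketch of that step, assuming a hypothetical helper named `normalize` (not part of the original post). It uses `str.maketrans` and `str.translate` instead of a replace loop, and it copies the symbol list exactly as it appears in the task statement.

```python
# Symbols from requirement (2), copied as written in the task statement
SYMBOLS = '!"#$%&()*+,-./:;<=>?@[\\]^_‘{|}~'

# Translation table that maps every listed symbol to a space
TABLE = str.maketrans(SYMBOLS, " " * len(SYMBOLS))


def normalize(text):
    """Lowercase the text and replace each listed symbol with a space.

    Hypothetical helper, for illustration only.
    """
    return text.lower().translate(TABLE)


# Splitting on whitespace afterwards yields the lowercase words
print(normalize("To be, or not to be: that is the QUESTION!").split())
```

The symbols are replaced with spaces rather than deleted, so that words joined by punctuation (for example `be,or`) still split into separate words.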

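This task does not involve encoding conversion. If you want to specify an encoding, you can add the following line at the top of the script:

# -*- coding: utf-8 -*-

or specify the encoding where the file is opened:

with open("hamlet.txt", "r", encoding="utf-8") as f: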

 ........

 ........
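The two options are not interchangeable. The coding comment declares the encoding of the script's own source file, while the `encoding` argument of `open` controls how the data file is decoded. A minimal sketch of the second option follows, assuming a hypothetical helper `read_text` and a temporary file standing in for hamlet.txt.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file using an explicit encoding (hypothetical helper)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Demonstration on a temporary file that stands in for hamlet.txt
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("To be, or not to be")
    print(read_text(sample_path))
```

Passing the encoding explicitly keeps the result independent of the platform's default encoding.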

[Input format]

[Output format]

The following is only a sample of the output (3 lines shown; n lines are required), not the final result:

the        1138
and         965
to          754

Each word is left-aligned in a field 10 characters wide; each count is right-aligned in a field 5 characters wide.
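The alignment rule maps directly onto Python's format specification: `<10` left-aligns in 10 columns and `>5` right-aligns in 5 columns. A small sketch follows, using a hypothetical helper `format_row` and the three sample values from the task statement.

```python
def format_row(word, count):
    """Format one output line: word left-aligned in 10 columns,
    count right-aligned in 5 columns (hypothetical helper)."""
    return "{:<10}{:>5}".format(word, count)


# Sample values taken from the task statement
rows = [("the", 1138), ("and", 965), ("to", 754)]
for word, count in rows:
    print(format_row(word, count))
```

Each printed line is 15 characters wide. The widths are minimums, so a word longer than 10 characters is not truncated; it simply pushes the count to the right.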

[Sample input]
[Sample output]
[Sample explanation]
[Grading criteria]

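def getText():
    # Read the whole file; the with block closes it automatically
    with open("hamlet.txt", "r") as f:
        txt = f.read()
    txt = txt.lower()  # normalize: convert everything to lowercase
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_‘{|}~':
        txt = txt.replace(ch, " ")  # denoise: replace each special symbol with a space
    return txt

n = int(input())  # number of words to output; int() is safer than eval() on user input
hamletTxt = getText()
counts = {}
words = hamletTxt.split()
for word in words:
    counts[word] = counts.get(word, 0) + 1  # count occurrences of each word
items = list(counts.items())  # list of (word, count) tuples
items.sort(key=lambda x: x[1], reverse=True)  # sort by count, descending
for word, count in items[:n]:  # slicing avoids an IndexError if n exceeds the number of distinct words
    print("{:<10}{:>5}".format(word, count))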
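As an alternative to the hand-written dictionary and sort, the standard library's `collections.Counter` can do the counting and ranking. This is a sketch rather than the method used in the post; the function name `top_words` and the sample sentence are assumptions made for illustration.

```python
from collections import Counter

# Symbols to strip, copied as written in the task statement
SYMBOLS = '!"#$%&()*+,-./:;<=>?@[\\]^_‘{|}~'


def top_words(text, n):
    """Return the n most frequent lowercase words as (word, count) pairs.

    Hypothetical helper, for illustration only.
    """
    text = text.lower()
    for ch in SYMBOLS:
        text = text.replace(ch, " ")
    # most_common(n) returns at most n pairs, sorted by count descending
    return Counter(text.split()).most_common(n)


# Short sample text standing in for the contents of hamlet.txt
sample = "The cat and the dog. The END!"
for word, count in top_words(sample, 2):
    print("{:<10}{:>5}".format(word, count))
```

`most_common(n)` returns fewer than n pairs when the text has fewer distinct words, so no extra bounds check is needed.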