统计一个文本中单词频次最高的 10 个单词？

最新推荐文章于 2022-08-26 09:48:48 发布

weixin_30364325

最新推荐文章于 2022-08-26 09:48:48 发布

阅读量478

点赞数 1

文章标签： python 大数据数据结构与算法

原文链接：http://www.cnblogs.com/ff-gaofeng/p/11424281.html

版权

#统计一个文本中单词频次最高的 10 个单词？

import re
import string

word_list =[]
word_dict ={}
with open("d:\\2.py","r") as fp:
    fp_file = fp.readlines()
    for line in fp_file:
        if line.strip() != '':
            line_word = re.findall(r"[a-zA-Z]+",line)  #l利用切片把Word取出来，返回是一个list
            word_list += line_word   #把Word汇总成一个list

for word in word_list:
    if word in word_dict:
        word_dict[word] += 1
    else:
        word_dict[word] = 1

#对字典按value进行排序，并去除前10个数据
sorted_word_dict = sorted(word_dict.items(),key = lambda x:x[1],reverse =True)[:10]

print(sorted_word_dict)