Word Frequency Counting in Python

懒散的鱼与消失的猫

Article link: https://blog.csdn.net/qwxwaty/article/details/80442040

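import string
from collections import Counter

# Input text and output report paths (adjust to your own machine).
path1 = '/Users/Administrator/Documents/Walden.txt'
path2 = '/Users/Administrator/Documents/result.txt'

with open(path1, 'r', encoding='UTF-8') as text, open(path2, 'w', encoding='UTF-8') as file:
    # Split on whitespace, strip surrounding punctuation, and lowercase each word.
    words = [raw_word.strip(string.punctuation).lower() for raw_word in text.read().split()]
    # Count every word in a single pass, skipping tokens that were pure punctuation.
    counts_dict = Counter(word for word in words if word)
    # Write the words from most frequent to least frequent.
    for word, count in counts_dict.most_common():
        file.write('{} -- {} times\n'.format(word, count))
# The with statement closes both files on exit, so no explicit close() is needed.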

Here, path1 is the path of the text file to read, and path2 is the path of the file the results are written to.
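To try the counting step without touching any files, the same logic can be run on an in-memory string. This is only a minimal sketch: the function name count_words and the sample sentence are made up for illustration, and it uses collections.Counter from the standard library to do the counting.

```python
import string
from collections import Counter


def count_words(text):
    """Return a Counter of lowercase words with surrounding punctuation stripped."""
    # Split on whitespace, then strip punctuation from both ends of each token.
    words = (raw_word.strip(string.punctuation).lower() for raw_word in text.split())
    # Tokens made only of punctuation become empty strings, so skip them.
    return Counter(word for word in words if word)


# Hypothetical sample text, used only for this demonstration.
sample = "The pond, the woods -- and THE sky."
counts = count_words(sample)

# Print the words from most frequent to least frequent.
for word, count in counts.most_common():
    print('{} -- {} times'.format(word, count))
```

Counter.most_common() returns (word, count) pairs already sorted by count, so no separate sorted() call is needed.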

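Note:

with open(path,'r',encoding='UTF-8') as text:

The encoding='UTF-8' argument is essential here. Without it, open() falls back to the platform's default encoding (GBK on a Chinese-locale Windows system), and reading the UTF-8 text file fails with the error "'gbk' codec can't decode byte 0xbf in position 2: illegal multibyte sequence".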
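The effect of the encoding argument can be reproduced on any platform by naming the codec explicitly. The sketch below is built on an assumption: the byte 0xbf at position 2 in the error message is consistent with a UTF-8 byte order mark (EF BB BF) at the start of the file, so the hypothetical sample file begins with one. If the real file does start with a byte order mark, encoding='utf-8-sig' may be worth trying, since it strips the mark instead of leaving it attached to the first word.

```python
import os
import tempfile

# Hypothetical sample: a UTF-8 byte order mark (EF BB BF), a newline, then some text.
sample_bytes = b'\xef\xbb\xbf\nWalden Pond\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.txt')
    with open(sample_path, 'wb') as f:
        f.write(sample_bytes)

    # Decoding as GBK: EF BB is read as one two-byte character, then BF is
    # followed by a newline, which is not a valid second byte.
    try:
        with open(sample_path, 'r', encoding='gbk') as f:
            f.read()
        gbk_failed = False
    except UnicodeDecodeError as err:
        gbk_failed = True
        print('GBK failed:', err)

    # Decoding as UTF-8 succeeds but keeps the mark as the character U+FEFF.
    with open(sample_path, 'r', encoding='UTF-8') as f:
        utf8_text = f.read()

    # Decoding as utf-8-sig succeeds and drops the leading mark.
    with open(sample_path, 'r', encoding='utf-8-sig') as f:
        sig_text = f.read()

print(repr(utf8_text))
print(repr(sig_text))
```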
