Python Learning Notes (3): Text Word Frequency Count

这里是一只小小琪

Article link: https://blog.csdn.net/qq_41925919/article/details/103164830

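def gettxt():
    # Read the whole file; the with-block closes it afterwards
    with open("新建文本文档.txt", "r") as f:
        txt = f.read()
    # str methods return a new string, so the result must be assigned back
    txt = txt.lower()  # convert everything to lowercase
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~':  # replace special characters with spaces
        txt = txt.replace(ch, ' ')
    return txt

txt = gettxt()
words = txt.split()  # split the text on whitespace
ji = {}
for w in words:
    ji[w] = ji.get(w, 0) + 1  # count word frequencies in a dict
Is = list(ji.items())  # convert to a list of (word, count) pairs for sorting
Is.sort(key=lambda x: x[1], reverse=True)  # highest count first
for word, count in Is[:10]:  # print the top ten (or fewer, if the text is short)
    print("{0:<10}{1:>5}".format(word, count))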

Output:
[Image: screenshot of the program output]
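
As a point of comparison, the same count can be sketched with the standard library's collections.Counter. This is only an alternative sketch, not the approach used in this post: the helper name count_words and the sample sentence are made up for illustration, and string.punctuation also contains the apostrophe, which the character list above does not.

```python
from collections import Counter
import string


def count_words(text):
    """Return a Counter of lowercase words, treating ASCII punctuation as spaces.

    count_words is a hypothetical helper, not part of the original script.
    """
    # Map every punctuation character to a space in a single pass
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return Counter(text.lower().translate(table).split())


# Made-up sample text, used instead of reading a file
sample = "The cat and the hat. The end!"
top = count_words(sample).most_common(2)  # the two most frequent words
for word, count in top:
    print("{0:<10}{1:>5}".format(word, count))
```

Counter.most_common(n) does the sorting and slicing in one call, so the manual conversion to a list is not needed in this variant.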

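1. A dict cannot be sorted in place (it has no sort() method); to order the entries, convert them to a list with list(ji.items()) and sort that list.
2. A dict can be thought of as the counterpart of map in C++.
3. Practice with various string operations (lower, replace, split, format). Note that these methods return a new string and leave the original unchanged.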
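
The notes on sorting and on string methods can be checked with a small sketch. The values below are made up for illustration, and sorted() is used here as one possible alternative to list.sort().

```python
s = "Hello, World"
s.lower()        # the lowercase copy is discarded; s itself is not modified
unchanged = s    # still the original text
s = s.lower()    # assign the result back to keep it

# Made-up counts; a dict has no sort() method, so sort its (key, value) pairs
ji = {"b": 2, "a": 5, "c": 2}
by_count = sorted(ji.items(), key=lambda x: x[1], reverse=True)

print(unchanged)
print(s)
print(by_count)
```

sorted() returns a new list and leaves the dict untouched, which is equivalent here to building list(ji.items()) and calling sort() on it.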
