Word Frequency Counting for a Text File (with an excludes Exclusion List)

Latest recommended article published on 2023-10-25 22:16:54

weixin_36550305


Article link: https://blog.csdn.net/weixin_36550305/article/details/70255673


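# Read hamlet.txt, lowercase it, and replace punctuation with spaces.
def getTxt():
    with open("hamlet.txt", "r") as f:
        txt = f.read()
    txt = txt.lower()
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{}|~':
        txt = txt.replace(ch, " ")
    return txt

hamletTxt = getTxt()
words = hamletTxt.split()
# Count how many times each word occurs.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1
# Sort the (word, count) pairs by count, highest first.
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)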
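# Common words to leave out of the results.
excludes = ['the', 'and', 'to', 'of', 'you', 'i', 'a', 'my', 'in',
            'it', 'that', 'is', 'not', 'his', 'this', 'but',
            'with', 'for', 'your', 'me', 'be', 'as', 'he',
            'what', 'him', 'so', 'have', 'will', 'do', 'no', 'we',
            'are']  # the source is cut off here; any further entries are not shown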
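The listing stops before the excludes list is put to use, so the remaining steps are not visible in the source. One plausible way to apply such a list is sketched below, as an assumption rather than the author's actual code: skip excluded words while counting, then print the most frequent words that remain. The function name top_words, its parameter n, and the sample sentence are hypothetical and chosen only for illustration.

```python
def top_words(text, excludes, n=10):
    """Return the n most frequent words in text, skipping words in excludes.

    Hypothetical helper: the original post's own filtering code is not shown.
    """
    text = text.lower()
    # Replace punctuation with spaces, using the same character set as getTxt().
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{}|~':
        text = text.replace(ch, " ")
    counts = {}
    for word in text.split():
        if word in excludes:
            continue  # excluded words are never counted
        counts[word] = counts.get(word, 0) + 1
    # Sort by count, highest first; ties keep first-seen order.
    items = sorted(counts.items(), key=lambda x: x[1], reverse=True)
    return items[:n]


# Small in-memory sample so the sketch runs without hamlet.txt.
sample = ("To be, or not to be: that is the question. "
          "The question is hard; the answer is harder.")
sample_excludes = {"the", "to", "be", "is", "or", "not", "that"}

for word, count in top_words(sample, sample_excludes, n=3):
    print("{0:<10}{1:>5}".format(word, count))
```

A set is used for the exclusion list here because membership tests on a set do not slow down as the list grows. A plain list like the one in the post works the same way for a few dozen entries.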
