Python Chinese Word Frequency Counting | Finding How Many Times a Word Appears in a Text

577！

Article link: https://blog.csdn.net/hi577/article/details/104867008

import jieba


def jiebafenci(txt, wordslist):
    # Load the target words as a user dictionary so that jieba
    # keeps each of them as a single token when segmenting.
    jieba.load_userdict('tingcibiao.txt')
    words = jieba.lcut(txt)
    # Count how many times each token occurs.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    # Print the count of every target word and collect the ones that never occur.
    lst = []
    for word in wordslist:
        if word in counts:
            print(word, counts[word])
        else:
            lst.append(word)
    print('Words not found:', lst)


if __name__ == '__main__':
    # Replace 'wuxi.txt' with your own file (plain .txt, UTF-8).
    with open("wuxi.txt", encoding="utf-8") as f:
        txt = f.read()
    # Text file listing the words to look up, one word per line.
    with open("tingcibiao.txt", encoding="utf-8") as f:
        need_words = f.read()
    find = need_words.split()
    jiebafenci(txt, find)
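
The counting step can also be written with `collections.Counter` from the standard library. The following is only a sketch, not the method used in the script above: `count_target_words` is a hypothetical helper name, it returns its results instead of printing them, and the sample tokens are hand-made stand-ins for what `jieba.lcut(txt)` would return.

```python
from collections import Counter


def count_target_words(tokens, targets):
    """Return (found, missing) for the target words among the tokens.

    tokens  -- already-segmented words, e.g. the list from jieba.lcut(txt)
    targets -- the words to look up
    """
    counts = Counter(tokens)
    # Words that occur at least once, mapped to their counts.
    found = {w: counts[w] for w in targets if w in counts}
    # Words that never occur in the tokens.
    missing = [w for w in targets if w not in counts]
    return found, missing


# Hand-made tokens standing in for the output of jieba.lcut(txt).
sample_tokens = ["无锡", "是", "一座", "城市", "，", "无锡", "在", "江苏"]
found, missing = count_target_words(sample_tokens, ["无锡", "江苏", "太湖"])
print(found)
print("Words not found:", missing)
```

Returning the two collections makes it easier to sort the results or write them to a file later, rather than only printing them.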

First install the jieba library: open Anaconda Prompt (or any other terminal) and run pip install jieba
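
To confirm that the installation worked before running the script, a quick check such as the following can help. It is a minimal sketch that uses only the standard library; `is_installed` is a hypothetical helper name, not part of jieba.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("jieba"):
    print("jieba is available")
else:
    print("jieba is missing; run: pip install jieba")
```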

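The contents of tingcibiao.txt are shown in the figure below. (The filename is a misnomer: "tingcibiao" means "stop-word list", a term that properly refers to something else. Here the file simply holds the words to look up.)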
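
Because the same file serves both as the jieba user dictionary and as the list of words to look up, it can help to parse it a little more defensively than a bare `split()`. The sketch below assumes one entry per line, where a line may optionally carry a frequency and a part-of-speech tag after the word (the format jieba accepts for user dictionaries). `parse_word_list` is a hypothetical helper, and the sample text is made up for illustration.

```python
def parse_word_list(text):
    """Return the unique words in text (one entry per line), in first-seen order."""
    words = []
    seen = set()
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word = fields[0]  # ignore any optional frequency / tag columns
        if word not in seen:
            seen.add(word)
            words.append(word)
    return words


# Made-up sample: a duplicate entry, a blank line, and optional extra columns.
sample = "无锡\n太湖 10\n\n无锡\n江苏 5 ns\n"
print(parse_word_list(sample))
```

With a plain `split()`, the extra columns (`10`, `5`, `ns`) would be treated as words to look up, and a duplicated entry would be reported twice.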

The resulting counts are:
