Python Beginner Example: Word-Frequency Counting for Text in Python

Latest recommended article published on 2024-08-15 23:40:54

mengy7762

Categories: python, web crawlers, programmers. Article tags: python, natural language processing

Article link: https://blog.csdn.net/mengy7762/article/details/120829913

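This article uses a beginner-level Python example to show how to count word frequencies in the English text of Hamlet and find the words that occur most often. It also discusses how to analyze the characters who appear in the Chinese text of Three Kingdoms (《三国》).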

Summary generated by CSDN using AI technology.

English text: hamlet. Goal: find the English words that occur most often.
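
As a compact point of comparison for this task, the sketch below counts words with `collections.Counter` from the standard library. It is only an illustration under stated assumptions: the helper name `top_words` and the one-line sample string are hypothetical, and the sample stands in for the real hamlet file so the snippet runs without it.

```python
import string
from collections import Counter


def top_words(text, n=3):
    """Return the n most common words in text as (word, count) pairs.

    Hypothetical helper for illustration; not part of the original article.
    """
    # Lowercase so that "The" and "the" count as the same word.
    text = text.lower()
    # Map every ASCII punctuation character to a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    # Split on whitespace to get the list of words.
    words = text.translate(table).split()
    return Counter(words).most_common(n)


# Assumed sample text standing in for the contents of the hamlet file.
sample = "To be, or not to be: that is the question."
print(top_words(sample, 2))
```

`Counter` handles the counting and the sorting by frequency in one step, which is why this version is shorter than a hand-written dictionary loop. The dictionary approach is still worth learning first, since it shows what the counting actually involves.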

Code implementation:

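# Hamlet word-frequency count
def getText():
    # Read the whole file; the with-block closes it automatically
    with open("hamlet", 'r') as f:
        txt = f.read()
    txt = txt.lower()  # convert uppercase letters to lowercase
    for ch in '~!@#$%^&*()_+-={}[],./:";<>?':
        txt = txt.replace(ch, " ")  # replace punctuation with spaces
    return txt

hamletTxt = getText()
words = hamletTxt.split()  # split on whitespace into a list of words
counts = {}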
for word in words