python--读取特定的txt文件，并统计文件中的词汇

最新推荐文章于 2023-05-12 10:11:44 发布

想学好python的人

最新推荐文章于 2023-05-12 10:11:44 发布

阅读量2.4k

点赞数 1

分类专栏： python学习笔记文章标签： python

本文链接：https://blog.csdn.net/No_PainNo_Gain/article/details/106818944

版权

python学习笔记专栏收录该内容

7 篇文章 0 订阅

订阅专栏

去年刚学python的时候，用open的方法写了一个脚本，简化了我自己测试维护环境的工作量（通过ipop工具输出的回显，去统计回显中我想要的字符）。最近刚好学习嵩天老师的课，讲到了这个方法，用来统计某个文档中的人名。索性自己写了个类，刚好巩固一下。

# -*- mode: python ; coding: utf-8 -*-
class Book():
    """定义一个关于书本txt的类"""
    def __init__(self,route):
        self.route = route

    def output_txt(self):
        """输出指定的文档"""
        output = open(self.route).read()
        output = output.lower()     #英文文档中有大小写，lower函数用来统一成小写，方便统计
        for i in '!"#$%&()*+,-./:;<=>?@[\\]^_‘{|}~':
            output = output.replace(i," ")      #将特数字符进行转换，用空格代替
        output = output.split()
        return output       #通过split函数，在这里返回一个列表

    def count_words(self):
        """统计文档中的词汇"""
        count = {}
        for word in self.output_txt():
            count[word] = count.get(word,0)+1       #字典的get方法用来返回对应键的值，如果没有该键则返回0
        items = list(count.items())
        items.sort(key=lambda x:x[1],reverse=True)
        for i in range(10):
            word,counts = items[i]
            print("{0:<10}{1:>5}".format(word,counts))

txt = Book(r'C:\Users\Vayne\Desktop\english.txt')
txt.count_words()