Counter in Python's collections module (continuously updated)

有石为玉

Article link: https://blog.csdn.net/weixin_41770169/article/details/80082760

Reference article: https://blog.csdn.net/Shiroh_ms08/article/details/52653385

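I. Overview of collections

collections: high-performance container datatypes

The module was added in Python 2.4; in Python 2 its source lives in Lib/collections.py and Lib/_abcoll.py. It implements specialized container datatypes that serve as alternatives to Python's general-purpose built-in containers: dict, list, set, and tuple.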

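Container	Description	Added in version
namedtuple()	Factory function for creating tuple subclasses with named fields	2.6
deque	List-like container with fast appends and pops on either end	2.4
Counter	dict subclass for counting hashable objects	2.7
OrderedDict	dict subclass that remembers the order in which entries were added	2.7
defaultdict	dict subclass that calls a factory function to supply missing values	2.5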
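
To make the table concrete, here is a minimal sketch that touches each container once. It assumes Python 3, and the names Point, letters, and groups are made up for illustration.

```python
from collections import namedtuple, deque, Counter, OrderedDict, defaultdict

# namedtuple: a tuple subclass whose fields can be read by name
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)

# deque: cheap appends and pops at both ends
d = deque([1, 2, 3])
d.appendleft(0)       # d is now deque([0, 1, 2, 3])
last = d.pop()        # removes and returns 3 from the right end
first = d.popleft()   # removes and returns 0 from the left end

# Counter: counts hashable items
letters = Counter("abca")

# OrderedDict: remembers the order in which keys were inserted
od = OrderedDict()
od["first"] = 1
od["second"] = 2

# defaultdict: the factory (list here) supplies a value for a missing key
groups = defaultdict(list)
groups["even"].append(2)

print(p.x + p.y, list(d), letters["a"], list(od), dict(groups))
# 3 [1, 2] 2 ['first', 'second'] {'even': [2]}
```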

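Besides the concrete container classes, the collections module also provides abstract base classes that can be used to test whether a class provides a particular interface, for example whether it is hashable or whether it is a mapping. In Python 3 these classes live in the collections.abc submodule.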
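
As a rough illustration of such interface tests, the sketch below uses isinstance with two of the abstract base classes. It assumes Python 3, where they are imported from collections.abc; the variable names are illustrative only.

```python
from collections import Counter
from collections.abc import Hashable, Mapping

cnt = Counter()

# Counter is a dict subclass, so it satisfies the Mapping interface
is_mapping = isinstance(cnt, Mapping)

# Lists are mutable and define no hash, tuples of hashable items do
list_hashable = isinstance([], Hashable)
tuple_hashable = isinstance((1, 2), Hashable)

print(is_mapping, list_hashable, tuple_hashable)
# True False True
```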

II. What Counter does, with examples

Counter is a tool for convenient and fast counting.

1. A simple example

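from collections import Counter
cnt = Counter()
# A missing key counts as 0, so += 1 works without initializing it first
for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    cnt[word] += 1
print(cnt)  # print() is a function in Python 3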

Output:

Counter({'blue': 3, 'red': 2, 'green': 1})
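
The same tally can also be built without an explicit loop. The sketch below, assuming Python 3, passes the list straight to the Counter constructor and also shows two behaviors the loop relies on or leads to: a missing key returns 0, and most_common(n) returns the n most frequent items.

```python
from collections import Counter

words = ['red', 'blue', 'red', 'green', 'blue', 'blue']

cnt = Counter(words)           # count every item of the iterable in one call
top_two = cnt.most_common(2)   # the two most frequent (word, count) pairs
missing = cnt['purple']        # a missing key returns 0 instead of raising KeyError

print(top_two)   # [('blue', 3), ('red', 2)]
print(missing)   # 0
```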

2. Counting high-frequency words in positive and negative movie reviews

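from collections import Counter

# Assumes reviews (a list of review strings) and labels (a parallel list of
# strings such as 'POSITIVE') have already been loaded elsewhere.
positive_counts = Counter()
negative_counts = Counter()
total_counts = Counter()
for review, label in zip(reviews, labels):
    words = review.split(" ")
    if label == 'POSITIVE':
        positive_counts.update(words)  # add 1 per occurrence of each word
    else:
        # every label other than 'POSITIVE' is treated as negative
        negative_counts.update(words)
    total_counts.update(words)

# In an interactive session these show (word, count) pairs, most frequent first
positive_counts.most_common()
negative_counts.most_common()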
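
The snippet above depends on reviews and labels being defined elsewhere. The following self-contained sketch runs the same idea on a tiny dataset that is entirely made up for illustration, and it assumes Python 3. As one possible variation, it builds the overall tally by adding the two Counters rather than updating a third one inside the loop.

```python
from collections import Counter

# Hypothetical sample data, invented purely for this illustration
reviews = ["great great movie", "boring movie", "great cast"]
labels = ["POSITIVE", "NEGATIVE", "POSITIVE"]

positive_counts = Counter()
negative_counts = Counter()
for review, label in zip(reviews, labels):
    # Pick the Counter that matches the label, then tally the words
    target = positive_counts if label == "POSITIVE" else negative_counts
    target.update(review.split(" "))

# Adding two Counters sums the counts word by word
total_counts = positive_counts + negative_counts

print(positive_counts.most_common(1))  # [('great', 3)]
print(total_counts["movie"])           # 2
```

Passing a number to most_common limits the result to that many top entries, which is usually more readable than the full list on a real corpus.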