How to count the frequency of elements in a sequence

Article link: https://blog.csdn.net/sinat_35930259/article/details/79522268

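Examples

1. Find the three most frequent elements in a random sequence and show how many times each one occurs.

2. Find the 10 most frequent words in an English article and show how many times each one occurs.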

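Finding the three most frequent elements in a sequence

The result of the count is naturally a dictionary that maps each element to the number of times it occurs.

General approach

First create a dictionary whose keys are the distinct elements of the sequence and whose values are all 0. Then iterate over the sequence and add 1 to the value stored under each element as it is encountered. Finally, sort the items by value and take the three largest. The code is as follows: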

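from random import randint

# 30 random integers between 0 and 20 (both ends inclusive)
data = [randint(0, 20) for _ in range(30)]

# Start every distinct element at a count of 0, then tally occurrences
c = dict.fromkeys(data, 0)
for x in data:
    c[x] += 1

# Sort by count in ascending order; the last three items are the most frequent
print(sorted(c.items(), key=lambda v: v[1])[-3:])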
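
Sorting the whole dictionary only to keep three entries does more work than necessary. As a minimal sketch (not from the original post), the standard library's heapq.nlargest can select the top entries directly and returns them in descending order of count. The helper name top_n and the fixed sample list are hypothetical, chosen only so the result is reproducible.

```python
import heapq


def top_n(seq, n=3):
    """Return the n most frequent elements of seq as (element, count) pairs."""
    counts = {}
    for x in seq:
        # dict.get supplies 0 the first time an element is seen
        counts[x] = counts.get(x, 0) + 1
    # nlargest picks the n biggest counts without sorting every item
    return heapq.nlargest(n, counts.items(), key=lambda kv: kv[1])


# Hypothetical fixed data instead of random numbers
sample = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print(top_n(sample))  # [(4, 4), (3, 3), (2, 2)]
```

Using counts.get(x, 0) also removes the need to pre-fill the dictionary with dict.fromkeys.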

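Using a collections.Counter object

Pass the sequence to the Counter constructor; the resulting Counter object is a dictionary of element frequencies. The Counter.most_common(n) method returns the n most frequent elements together with their counts, most frequent first. The code is as follows: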

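from random import randint
from collections import Counter

data = [randint(0, 20) for _ in range(30)]

# Counter tallies every element of the sequence in one pass
c2 = Counter(data)
# The three (element, count) pairs with the highest counts
res = c2.most_common(3)
print(res)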
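
Because the data above is random, the output changes on every run. The following sketch uses a fixed string (an assumption made purely for illustration) so the behaviour of Counter is easier to check: most_common returns pairs in descending order of count, a missing key reads as 0 rather than raising KeyError, and update adds further observations to an existing Counter.

```python
from collections import Counter

# Hypothetical fixed input: 'a' x4, 'b' x3, 'c' x2, 'd' x1
letters = Counter("aaaabbbccd")

top_three = letters.most_common(3)
print(top_three)        # [('a', 4), ('b', 3), ('c', 2)]

# A key that was never counted reads as 0 instead of raising KeyError
print(letters["z"])     # 0

# update() adds more observations to the existing counts
letters.update("dd")
print(letters["d"])     # 3
```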

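Word frequency in an English article

As above, collections.Counter does the counting; the only extra step is splitting the text into words. The code is as follows: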

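from collections import Counter
import re

# 'path/file' is a placeholder for the path to the text file
with open('path/file') as f:
    txt = f.read()

# Split on runs of non-word characters and skip the empty strings that
# re.split produces when the text starts or ends with a separator
c3 = Counter(w for w in re.split(r'\W+', txt) if w)
res = c3.most_common(10)
print(res)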
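
Splitting on \W+ treats "The" and "the" as different words. One possible refinement, sketched here under the assumption that case should be ignored and that a word is a run of ASCII letters, is to lowercase the text and extract words with re.findall. The function name top_words and the sample sentence are hypothetical and not part of the original post.

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words in text, ignoring case."""
    # Assumption: a word is a run of ASCII letters; digits and
    # punctuation (including apostrophes) act as separators
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(n)


# Hypothetical sample text standing in for a file on disk
sample_text = "The cat saw the dog. The dog saw the cat, and the cat ran."
print(top_words(sample_text, 2))  # [('the', 5), ('cat', 3)]
```

Unlike re.split, re.findall never yields empty strings, so no extra filtering is needed.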