Counting word frequencies and printing the most frequent words

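The data is a news report from one day's English edition of China Daily. The task is to write a Python program that counts the ten most frequent words in it and outputs each word with its frequency (presented as a dictionary).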

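import jieba

# Read the whole article as one string; "with" closes the file afterwards
with open("./dataset/englishgraph.txt", "r", encoding="utf-8") as file:
    txt = file.read()

# jieba.lcut splits the text into a list of tokens
words = jieba.lcut(txt)
counts = {}

for word in words:
    # Skip single-character tokens (spaces, punctuation, "a", "I")
    if len(word) >= 2:
        counts[word] = counts.get(word, 0) + 1

# Sort the (word, count) pairs by count, highest first.
# The name "items" avoids shadowing the built-in list().
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)
print(items)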

Output

[('you', 14), ('to', 10), ('want', 5), ('have', 5), ('the', 5), ('enough', 4), ('make', 4), ('of', 4), ('those', 4), ('who', 4), ('that', 3), ('and', 3), ('for', 3), ('in', 2), ('life', 2), ('just', 2), ('them', 2), ('what', 2), ('go', 2), ('be', 2), ('only', 2), ('one', 2), ('do', 2), ('it', 2), ('hurts', 2), ('people', 2), ('everything', 2), ('they', 2), ('There', 1), ('are', 1), ('moments', 1), ('when', 1), ('miss', 1), ('someone', 1), ('so', 1), ('much', 1), ('pick', 1), ('from', 1), ('your', 1), ('dreams', 1), ('hug', 1), ('real', 1), ('Dream', 1), ('dream', 1), ('where', 1), ('because', 1), ('chance', 1), ('all', 1), ('things', 1), ('May', 1), ('happiness', 1), ('sweet', 1), ('trials', 1), ('strong', 1), ('sorrow', 1), ('keep', 1), ('human', 1), ('hope', 1), ('happy', 1), ('Always', 1), ('put', 1), ('yourself', 1), ('others', 1), ('shoes', 1), ('If', 1), ('feel', 1), ('probably', 1), ('other', 1), ('person', 1), ('too', 1), ('The', 1), ('happiest', 1), ('don', 1), ('necessarily', 1), ('best', 1), ('most', 1), ('comes', 1), ('along', 1), ('their', 1), ('way', 1), ('Happiness', 1), ('lies', 1), ('cry', 1), ('hurt', 1), ('searched', 1), ('tried', 1), ('can', 1), ('appreciate', 1), ('importance', 1)]

Process finished with exit code 0
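
The listing above prints every word in frequency order, while the task asks for only the ten most frequent words in dictionary form. A minimal sketch of one way to get that form follows. It is not the author's method: it swaps jieba for the standard library's re and collections.Counter, and it lowercases the text so that "The" and "the" are counted together. The function name top_words, its parameters, and the sample sentence are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def top_words(text, n=10, min_len=2):
    """Return the n most frequent words in text as a {word: count} dict."""
    # Assumption: case is ignored, unlike the case-sensitive original code.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Drop apostrophes left at the edges of a token (for example, quotes).
    tokens = [t.strip("'") for t in tokens]
    # Same length filter as the original: keep tokens of 2+ characters.
    counts = Counter(t for t in tokens if len(t) >= min_len)
    # most_common returns (word, count) pairs, highest count first;
    # dict() keeps that order on Python 3.7 and later.
    return dict(counts.most_common(n))


if __name__ == "__main__":
    # Hypothetical sample text standing in for the article.
    sample = "You want what you want. Dream what you want to dream."
    print(top_words(sample, n=3))
```

To run it on the article itself, pass the string read from ./dataset/englishgraph.txt as text. The counts may differ slightly from the output above because of the lowercasing and the different tokenizer.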