Learning Python: Word Frequency Counting with zip, set, split, and len

zl202111

Category: Python; tags: python

Article link: https://blog.csdn.net/zl202111/article/details/121459541


Using the zip, set, split, and len functions and applying them

Word Frequency Counting

I. Working with strings

1. Pairing elements: zip()
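
A minimal sketch of how zip() pairs up elements that sit at the same position in several sequences. The lists `names` and `scores` are illustrative assumptions, not data from the original post.

```python
# Hypothetical sample data, chosen only for illustration
names = ['python', 'java', 'go']
scores = [3, 2, 1]

# zip() returns a lazy iterator of tuples; list() materializes it
pairs = list(zip(names, scores))
print(pairs)  # [('python', 3), ('java', 2), ('go', 1)]

# zip() stops as soon as the shortest input is exhausted
short = list(zip('abc', [1, 2]))
print(short)  # [('a', 1), ('b', 2)]
```

Because zip() is lazy, wrapping it in list() is only needed when the pairs must be printed or reused.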


2. Creating a collection of unique items: set()
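
A minimal sketch of deduplicating a list with set(). The word list here is an assumed example; note that a set has no defined order, so the sketch sorts it before printing.

```python
# Hypothetical word list containing repeats
words = ['I', 'love', 'python', 'I', 'love', 'java']

# set() keeps one copy of each distinct element
unique_words = set(words)

print(len(words))            # 6
print(len(unique_words))     # 4
# Sets are unordered, so sort to get a stable display order
print(sorted(unique_words))  # ['I', 'java', 'love', 'python']
```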


3. Splitting a string: split()
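
A minimal sketch of str.split() with and without an explicit separator. The sample strings are assumptions made for illustration; the double space is deliberate, to show how the two forms differ.

```python
# Hypothetical input with two spaces between 'love' and 'python'
text = 'I love  python'

# An explicit separator splits at every occurrence, so empty strings can appear
by_space = text.split(' ')
print(by_space)       # ['I', 'love', '', 'python']

# With no argument, runs of whitespace are treated as a single separator
by_whitespace = text.split()
print(by_whitespace)  # ['I', 'love', 'python']

# Any string can serve as the separator
fields = 'a,b,c'.split(',')
print(fields)         # ['a', 'b', 'c']
```

For word counting on free-form text, the no-argument form is usually the safer choice because it never yields empty words.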


4. Getting the length of a string: len()

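len() counts characters, not bytes: an English letter, a digit, and a Chinese character each count as one character.
Once the string is encoded as UTF-8, a common Chinese character takes 3 bytes.
Once the string is encoded as GBK, a Chinese character takes 2 bytes.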
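
The difference between character count and byte count can be checked with a short sketch. The sample string `s` is an assumption chosen to mix letters, Chinese characters, and a digit.

```python
# Hypothetical sample: 6 ASCII letters, 2 Chinese characters, 1 digit
s = 'Python编程3'

# len() on a str counts characters
print(len(s))                  # 9

# len() on the encoded bytes counts bytes
print(len(s.encode('utf-8')))  # 13 = 6 + 2*3 + 1
print(len(s.encode('gbk')))    # 11 = 6 + 2*2 + 1
```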


II. Word frequency counting

# -*- coding: utf-8 -*-
'''
Purpose: word frequency counting
Author: zwh
Date: 2021/11/21
'''

text = 'I love python I love java I learn python'
# Split the text into words
words = text.split(' ')
# Remove duplicates (a set is unordered, so the output order may vary between runs)
diff_words = list(set(words))

# One counter per distinct word, all starting at 0
counts = [0] * len(diff_words)

# Walk the word list and count the occurrences of each distinct word
for word in words:
    for j in range(len(diff_words)):
        if diff_words[j] == word:
            counts[j] += 1

# Print each (word, count) pair
for word_count in zip(diff_words, counts):
    print(word_count)
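
For comparison, the same counts can be obtained with the standard library's collections.Counter. This is an alternative technique, not the method used in the post above; it is a sketch that replaces the nested loops with a single pass over the words.

```python
from collections import Counter

text = 'I love python I love java I learn python'

# Counter tallies each distinct word in one pass
word_counts = Counter(text.split())

# most_common() lists (word, count) pairs from most to least frequent
for word, count in word_counts.most_common():
    print(word, count)
```

The nested-loop version compares every word against every distinct word, while Counter uses hashing, which matters once the text gets long.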

Study tip:

Details determine success or failure!
