Given a piece of text, for example "who have an apple apple is free free is money you know", count how many times each word occurs. (Hint: use a regular expression to remove punctuation and spaces.)


CC_changcheng


Article link: https://blog.csdn.net/weixin_45710713/article/details/121172468


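Lab exercise: word frequency count (optional). Given a piece of text, for example "who have an apple apple is free free is money you know", count how many times each word occurs. (Hint: use a regular expression to remove punctuation and spaces.)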

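import re

# Match any character that is not a digit, an ASCII letter, or a Chinese character
reg = "[^0-9A-Za-z\u4e00-\u9fa5]"
word = "hello world!I'am cc.hello cc!"
# Replace punctuation with spaces
x = re.sub(reg, ' ', word)
print(x)

# Extract the words and store them in a list
# (names chosen so the built-ins list, str and set are not shadowed)
words = []
current = ''
for ch in x:
    if ch != ' ':
        current = current + ch
    elif current:
        # A space ends the current word; runs of spaces add nothing
        words.append(current)
        current = ''
# Keep the last word when the text does not end with a separator
if current:
    words.append(current)

# Store the result in a dictionary: the key is the word, the value is its count
counts = {}
for w in words:
    if w not in counts:
        counts[w] = words.count(w)

print(counts)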
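
For comparison, here is a shorter sketch of the same exercise using only the standard library. It is an alternative, not the approach taken above: instead of replacing separators and splitting by hand, it matches runs of the allowed characters with `re.findall` and tallies them with `collections.Counter`. The function name `count_words` is made up for this illustration.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each word in text to its number of occurrences."""
    # Same character class as the pattern above, but not negated:
    # each match is one maximal run of digits, ASCII letters, or Chinese characters.
    words = re.findall(r"[0-9A-Za-z\u4e00-\u9fa5]+", text)
    return Counter(words)


counts = count_words("who have an apple apple is free free is money you know")
print(dict(counts))
# {'who': 1, 'have': 1, 'an': 1, 'apple': 2, 'is': 2, 'free': 2, 'money': 1, 'you': 1, 'know': 1}
```

Because `Counter` tallies in a single pass, this avoids calling `count` once per distinct word, which matters only for long texts.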
————————————————
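Copyright notice: this is an original article by CSDN blogger 「常丨CHENG」, licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/weixin_45710713/article/details/121172385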
