Stop Word Filtering

张一爻

Category column: python代码整合 (Python code collection)

Article link: https://blog.csdn.net/weixin_43069769/article/details/107687848

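import re
import numpy as np

# Path to the Chinese stop-word list (whitespace-separated entries).
stop_word_path = "InferenceSystem/src/I5_algorithm/NLP数据集合/停词库/stop_word_for_chinese.txt"

def del_element(strings, symbols):
    """Remove every literal occurrence of each item in `symbols` from `strings`."""
    # Escape items so regex metacharacters match literally; try longer items
    # first so a short stop word cannot pre-empt a longer one.
    escaped = [re.escape(s) for s in sorted(set(symbols), key=len, reverse=True) if s]
    if not escaped:
        return strings
    return re.sub("|".join(escaped), "", strings)

def filter_stop_word(strings, stop_word=None):
    """Strip stop words from `strings`; by default load them from stop_word_path."""
    if stop_word is None:
        # Load at call time rather than at definition time, so importing this
        # module does not require the file. comments=None keeps '#' as an entry.
        stop_word = np.loadtxt(stop_word_path, dtype=str, comments=None, encoding="utf-8", ndmin=1)
    return del_element(strings, stop_word)

src = '资源来源网络，侵删 很好的资料，赶快学起来'
filter_stop_word(src)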

# A string is iterated character by character, so this removes only '，' and spaces.
filter_stop_word(src, '， ')
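
The stop-word file referenced above is not included in this post, so the following is a minimal self-contained sketch of the same regex-based removal using a small in-memory list. The names `remove_terms` and `demo_stop_words` are hypothetical, and the list is made up for illustration; it is not taken from the real stop-word file.

```python
import re

def remove_terms(text, terms):
    """Delete every literal occurrence of each term from text."""
    # Longest terms first, so that '很好的' wins over '的' at the same position.
    ordered = sorted({t for t in terms if t}, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    return pattern.sub("", text)

# Hypothetical stop-word list, for illustration only.
demo_stop_words = ["，", " ", "的", "很好的"]

src = "资源来源网络，侵删 很好的资料，赶快学起来"
print(remove_terms(src, demo_stop_words))
# → 资源来源网络侵删资料赶快学起来
```

Note that this is substring removal rather than token-level filtering: a term is deleted even when it appears inside a longer word. Segmenting the text into words first and filtering the resulting tokens would avoid that.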
