NLP: A Sliding-Window Function

不二郭

Article link: https://blog.csdn.net/weixin_43957426/article/details/94775151

import re


def compute_ngrams(word):
    # BOW, EOW = ('<', '>')  # Used by FastText to attach to all words as prefix and suffix
    pattern = r'[a-zA-Z]+'
    re.findall(pattern, word)  # NOTE: the result is discarded, so this call has no effect

    # segword() is not defined in this listing. Judging from how its result is
    # used below, it returns the string to slide over plus a dict that maps
    # single ASCII letters to tag strings.
    extended_word, tag_dict = segword(word)
    min_n = 2                   # shortest window
    max_n = len(extended_word)  # longest window: the whole string

    ngrams = []
    # Slide a window of every length from min_n to max_n over extended_word.
    for ngram_length in range(min_n, max_n + 1):
        for i in range(len(extended_word) - ngram_length + 1):
            new_word = extended_word[i:i + ngram_length]
            new_word2 = new_word
            if not tag_dict:
                ngrams.append(new_word)
            else:
                # Replace each ASCII letter in the window with its tag.
                for c in new_word:
                    if c.encode('utf-8').isalpha():
                        new_word2 = new_word2.replace(c, tag_dict[c] + ' ')
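
The sliding-window part of the function above can be shown on its own. The following is only a minimal sketch, not the author's full method: it leaves out the `segword` / `tag_dict` substitution step, and the name `char_ngrams` and the parameters `min_n`, `max_n`, `bow`, and `eow` are hypothetical, introduced here for illustration. The optional `bow` / `eow` markers mimic the FastText-style `<` and `>` affixes mentioned in the commented-out line.

```python
def char_ngrams(word, min_n=2, max_n=None, bow="", eow=""):
    """Return every character n-gram of `word` with length in [min_n, max_n].

    Hypothetical helper for illustration only. `bow` and `eow` are optional
    begin/end-of-word markers that are attached to the word before windowing.
    """
    extended = f"{bow}{word}{eow}"
    # Default to the full length, and never ask for a window longer than the string.
    if max_n is None:
        max_n = len(extended)
    max_n = min(max_n, len(extended))

    ngrams = []
    for n in range(min_n, max_n + 1):              # window size
        for start in range(len(extended) - n + 1):  # window position
            ngrams.append(extended[start:start + n])
    return ngrams


if __name__ == "__main__":
    print(char_ngrams("abc"))
    print(char_ngrams("ab", min_n=3, max_n=3, bow="<", eow=">"))
```

With the defaults, `char_ngrams("abc")` returns `['ab', 'bc', 'abc']`: shorter windows come first, and within each size the window moves left to right. With the markers enabled, `char_ngrams("ab", min_n=3, max_n=3, bow="<", eow=">")` returns `['<ab', 'ab>']`.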
