《昇思25天学习打卡营第18天》-CSDN博客

本文链接：https://blog.csdn.net/NewtoPhone/article/details/140561861

今天进入了第18天的学习，本次我们要先从LSTM+CRF序列标注的CRF层开始学习。

让我们先来了解一下什么是CRF层

CRF（条件随机场）是一种常用于序列标注和命名实体识别的神经网络模型，PyTorch中的CRF层扮演重要的作用

注意：考虑到输入序列可能存在Padding的情况，CRF的输入需要考虑输入序列的真实长度，因此除发射矩阵和标签外，加入`seq_length`参数传入序列Padding前的长度，并实现生成mask矩阵的`sequence_mask`方法。

BiLSTM+CRF模型

在实现CRF后，我们设计一个双向LSTM+CRF的模型来进行命名实体识别任务的训练。

模型结构

nn.Embedding -> nn.LSTM -> nn.Dense -> CRF

其中LSTM提取序列特征，经过Dense层变换获得发射概率矩阵，最后送入CRF层。具体实现如下：

class BiLSTM_CRF(nn.Cell):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, num_tags, padding_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=padding_idx)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim // 2, bidirectional=True, batch_first=True)
        self.hidden2tag = nn.Dense(hidden_dim, num_tags, 'he_uniform')
        self.crf = CRF(num_tags, batch_first=True)

    def construct(self, inputs, seq_length, tags=None):
        embeds = self.embedding(inputs)
        outputs, _ = self.lstm(embeds, seq_length=seq_length)
        feats = self.hidden2tag(outputs)

        crf_outs = self.crf(feats, tags, seq_length)
        return crf_outs

完成模型设计后，我们生成两句例子和对应的标签，并构造词表和标签表。

embedding_dim = 16
hidden_dim = 32

training_data = [(
    "清 华 大 学 坐 落 于 首 都 北 京".split(),
    "B I I I O O O O O B I".split()
), (
    "重 庆 是 一 个 魔 幻 城 市".split(),
    "B I O O O O O O O".split()
)]

word_to_idx = {}
word_to_idx['<pad>'] = 0
for sentence, tags in training_data:
    for word in sentence:
        if word not in word_to_idx:
            word_to_idx[word] = len(word_to_idx)

tag_to_idx = {"B": 0, "I": 1, "O": 2}