Notes on tf.contrib.crf.crf_log_likelihood


王发北


Categories: Machine Learning, Deep Learning, tensorflow. Tags: NER, crf_log_likelihood, deep learning, sequence_lengths

Article link: https://blog.csdn.net/wwangfabei1989/article/details/88847156


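I have recently been working on an NER project that uses a BiLSTM+CRF architecture (the code is on GitHub, stars welcome).

Here are some notes on the parameter questions I ran into when using tf.contrib.crf.crf_log_likelihood.

Official source: https://www.tensorflow.org/code/stable/tensorflow/contrib/crf/python/ops/crf.py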

tf.contrib.crf.crf_log_likelihood(
    inputs,
    tag_indices,
    sequence_lengths,
    transition_params=None
)

Args:

inputs: A [batch_size, max_seq_len, num_tags] tensor of unary potentials to use as input to the CRF layer.
tag_indices: A [batch_size, max_seq_len] matrix of tag indices for which we compute the log-likelihood.
sequence_lengths: A [batch_size] vector of true sequence lengths.
transition_params: A [num_tags, num_tags] transition matrix, if available.

Returns:

log_likelihood: A [batch_size] Tensor containing the log-likelihood of each example, given the sequence of tag indices.
transition_params: A [num_tags, num_tags] transition matrix. This is either provided by the caller or created in this function.

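I will only cover the input arguments here:

inputs: the data after it has passed through the BiLSTM layer (projected to one score per tag), with shape [batch_size, max_seq_len, num_tags]

tag_indices: the ground-truth tag indices for each token, i.e. the labels the project supplies as input, with shape [batch_size, max_seq_len]

sequence_lengths: this is the argument these notes focus on. Translated literally from the docs, it is a vector of shape [batch_size] holding the true sequence lengths. Details follow below.

transition_params: the state transition matrix

Now for the details of sequence_lengths.

First, remember that sequence_lengths is a vector.

Here is an example:

Say batch_size=4 and max_seq_len=5.

Then the final sequence_lengths is [v1, v2, v3, v4], with v1<=5, v2<=5, v3<=5, v4<=5. So now we know the rough format and the range of the numbers. But how are these v values determined?

In NLP, many sentences are longer than max_seq_len and many are shorter. A sentence longer than max_seq_len is simply truncated to length max_seq_len, and every word in the truncated sentence is valid. A sentence shorter than max_seq_len has to be padded. The padded words carry no meaning; they exist only to give the input the fixed shape the network expects. So each v records the real length of the sentence before padding (capped at max_seq_len for truncated sentences).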
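The truncate-or-pad rule above can be sketched in a few lines of plain Python. This is only an illustration: the helper name `pad_and_measure`, the `pad_id=0` choice, and the toy token ids are assumptions made for this example, not part of the original project or of TensorFlow.

```python
def pad_and_measure(sentences, max_seq_len, pad_id=0):
    """Truncate or pad each sentence to max_seq_len and record its true length.

    `sentences` is a list of token-id lists. `pad_id` is a hypothetical
    padding id chosen for this sketch.
    """
    padded, sequence_lengths = [], []
    for token_ids in sentences:
        kept = token_ids[:max_seq_len]           # truncate long sentences
        sequence_lengths.append(len(kept))       # real length before padding
        padded.append(kept + [pad_id] * (max_seq_len - len(kept)))
    return padded, sequence_lengths


# batch_size=4, max_seq_len=5, as in the example above (token ids are made up).
sentences = [[11, 12, 13],
             [21, 22, 23, 24, 25, 26, 27],
             [31],
             [41, 42, 43, 44, 45]]
padded, sequence_lengths = pad_and_measure(sentences, max_seq_len=5)
print(sequence_lengths)
```

The resulting list has one entry per sentence in the batch, which is the [batch_size] vector that `sequence_lengths` expects.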

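At this point you can already use the API correctly. But if you still want to know why it takes this format, let's keep going and read the source.

Open the source link above.

Follow crf_log_likelihood -> crf_sequence_score -> crf_unary_score and look at line 293 (line numbers as of the time of writing):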

  masks = array_ops.sequence_mask(sequence_lengths,
                                  maxlen=array_ops.shape(tag_indices)[1],
                                  dtype=dtypes.float32)
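To see why a mask built from sequence_lengths matters here, below is a rough pure-Python sketch of a masked unary score: for every sentence it sums the score of the gold tag at each time step, multiplied by 1.0 for real positions and 0.0 for padded ones. It is a simplified illustration of the idea, not the TensorFlow source; the function name `masked_unary_score` and the toy numbers are assumptions.

```python
def masked_unary_score(inputs, tag_indices, sequence_lengths):
    """Sum the gold-tag score over the real (non-padded) time steps.

    inputs:           [batch_size][max_seq_len][num_tags] nested lists
    tag_indices:      [batch_size][max_seq_len] nested lists
    sequence_lengths: [batch_size] list of true lengths
    """
    scores = []
    for unary, tags, length in zip(inputs, tag_indices, sequence_lengths):
        total = 0.0
        for t, tag in enumerate(tags):
            mask = 1.0 if t < length else 0.0   # 0.0 switches off padding
            total += unary[t][tag] * mask
        scores.append(total)
    return scores


# One sentence, max_seq_len=3, num_tags=2; the last time step is padding.
inputs = [[[0.5, 1.5], [2.0, 0.25], [9.0, 9.0]]]
tag_indices = [[1, 0, 0]]
unary_scores = masked_unary_score(inputs, tag_indices, [2])
print(unary_scores)
```

With a true length of 2, the large scores at the padded third position do not contribute. Passing a wrong length (3 here) would let them leak into the score.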

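Next, find the source of sequence_mask: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/array_ops.py

sequence_mask is at line 3063 (again, as of the time of writing):

Using the example above, this function returns a matrix of shape [batch_size, max_seq_len], with one row per entry of sequence_lengths, i.e. a 4*5 mask matrix. The mask is used later, when the loss is computed, to drop the invalid (padded) words and tags. How are its values formed? For A[i][j] with i = 0,1,2,3 and j = 0,1,2,3,4, i is the index into sequence_lengths and j runs over range(max_seq_len):

A[i][j] = True if j < sequence_lengths[i] else False

Here is an example, which should make it clear: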

  tf.sequence_mask([1, 3, 2], 5)  # [[True, False, False, False, False],
                                  #  [True, True, True, False, False],
                                  #  [True, True, False, False, False]]
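The rule A[i][j] = (j < sequence_lengths[i]) is simple enough to reproduce in plain Python. The sketch below is a stand-in written for illustration (the local function `sequence_mask` is not the TensorFlow implementation), and it rebuilds the same mask as the tf.sequence_mask example above.

```python
def sequence_mask(lengths, maxlen):
    """Row i is True for the first lengths[i] positions and False afterwards."""
    return [[j < length for j in range(maxlen)] for length in lengths]


mask = sequence_mask([1, 3, 2], 5)
for row in mask:
    print(row)
```

For the earlier batch_size=4, max_seq_len=5 example, calling `sequence_mask` with four lengths and `maxlen=5` gives a 4*5 matrix in the same way.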

Zhihu: https://zhuanlan.zhihu.com/albertwang

WeChat official account: AI-Research-Studio
