A debugging note: the 'mask of the first timestep must all be on' error

zrz233

Last modified 2023-11-16 10:44:21

Category: Problem Solving. Tags: bug

First published 2023-11-15 19:14:36

Article link: https://blog.csdn.net/zrz233/article/details/134411303

While doing sequence labeling, the CRF layer raised this error:

'mask of the first timestep must all be on'

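Searching for this error mostly turns up the advice to set the batch_first argument to True. My code already had batch_first=True, so that was not the cause.
The error means that the mask is wrong at the first timestep: the first mask value of every sequence in the batch must be on (1), and here at least one of them was 0. I stepped through with the debugger to find where it fails: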

ipdb> n
> <ipython-input-60-103a41bbe73a>(46)crf_neg_log_likelihood()
     44             mask = mask.type(torch.uint8)
     45 
---> 46         crf_llh = self.crf(logits, tags, mask, reduction='mean') # Compute the conditional log likelihood of a sequence of tags given emission scores
     47         # crf_llh = self.crf(logits, tags, mask) # Compute the conditional log likelihood of a sequence of tags given emission scores
     48         return -crf_llh 
    
ipdb> n
ValueError: mask of the first timestep must all be on
> <ipython-input-60-103a41bbe73a>(46)crf_neg_log_likelihood()
     44             mask = mask.type(torch.uint8)
     45 
---> 46         crf_llh = self.crf(logits, tags, mask, reduction='mean') # Compute the conditional log likelihood of a sequence of tags given emission scores
     47         # crf_llh = self.crf(logits, tags, mask) # Compute the conditional log likelihood of a sequence of tags given emission scores
     48         return -crf_llh

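So the first position of the mask passed in here has the wrong value. I printed the input ids and the mask to check (the original screenshot of the output is not reproduced here).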

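In the printed mask, False marks the padding positions. In the first tensor, 0 stands for padding and the other numbers are the ids that the words were mapped to. The id of the first word is also 0, which collides with the 0 used for padding, so a real word at the start of a sentence is treated as padding. That is what caused the error.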
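The collision can be reproduced without any CRF library. The following is a minimal sketch in plain Python, under two assumptions that the post does not show explicitly: the original vocabulary started numbering words at 0, and the mask was derived by comparing each id with the padding id. The helper names (`build_vocab_without_pad`, `encode_and_pad`, `mask_from_ids`) are hypothetical and only serve the illustration.

```python
def build_vocab_without_pad(sentences):
    """Hypothetical buggy vocabulary: no id is reserved for padding."""
    word2id = {}
    for sentence in sentences:
        for word in sentence:
            if word not in word2id:
                word2id[word] = len(word2id)  # the first word seen gets id 0
    return word2id


def encode_and_pad(sentences, word2id, pad_id=0):
    """Map words to ids and right-pad every sentence to the longest length."""
    max_len = max(len(s) for s in sentences)
    return [
        [word2id[w] for w in s] + [pad_id] * (max_len - len(s))
        for s in sentences
    ]


def mask_from_ids(batch, pad_id=0):
    """Assumed mask construction: a position is on when its id differs from pad_id."""
    return [[token_id != pad_id for token_id in row] for row in batch]


sentences = [["the", "cat", "sat"], ["the", "dog"]]
word2id = build_vocab_without_pad(sentences)
batch = encode_and_pad(sentences, word2id)
mask = mask_from_ids(batch)

print(batch)  # "the" was assigned id 0, the same value as the padding id
print(mask)   # so the first position of each row is False
print(all(row[0] for row in mask))  # the condition the error message complains about
```

Because "the" shares id 0 with padding, the first column of the mask is off for both sentences, which matches the situation described above.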

After reserving id 0 for padding in the vocabulary, the error went away:

word2id = {'<pad>': 0, '<unk>': 1}
for sentence in sentences:  # build the word-to-index mapping
    for word in sentence:
        if word not in word2id:
            word2id[word] = len(word2id)
            
print(word2id)
print(len(word2id))
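As a complementary safeguard, the mask can be built from the sentence lengths rather than from the id values, so it no longer depends on which id happens to mean padding. This is not what the fix above does; it is an alternative sketch in plain Python, and the helpers `encode`, `pad_with_lengths`, `mask_from_lengths` and `first_timestep_all_on` are hypothetical names introduced only for this example.

```python
def build_vocab(sentences):
    """Vocabulary with reserved ids, following the fix above."""
    word2id = {"<pad>": 0, "<unk>": 1}
    for sentence in sentences:
        for word in sentence:
            if word not in word2id:
                word2id[word] = len(word2id)
    return word2id


def encode(sentence, word2id):
    """Map words to ids, falling back to '<unk>' for unseen words."""
    return [word2id.get(word, word2id["<unk>"]) for word in sentence]


def pad_with_lengths(encoded, pad_id):
    """Right-pad to the longest sequence and return the original lengths too."""
    lengths = [len(seq) for seq in encoded]
    max_len = max(lengths)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in encoded]
    return padded, lengths


def mask_from_lengths(lengths):
    """Position i of a row is on exactly when i is below that row's true length."""
    max_len = max(lengths)
    return [[i < n for i in range(max_len)] for n in lengths]


def first_timestep_all_on(mask):
    """Pre-flight check mirroring the condition named in the error message."""
    return all(row[0] for row in mask)


sentences = [["the", "cat", "sat"], ["the", "dog"]]
word2id = build_vocab(sentences)
encoded = [encode(s, word2id) for s in sentences]
padded, lengths = pad_with_lengths(encoded, word2id["<pad>"])
mask = mask_from_lengths(lengths)

print(padded)
print(mask)
print(first_timestep_all_on(mask))
```

A length-based mask still fails the check for an empty sentence, so empty sequences need to be filtered out before batching. The nested lists would have to be converted to tensors before being passed to the CRF layer.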
