Autoencoding/BERT: Strengths and Weaknesses for Word Representation

安達と島村

Category: Machine Learning

Article link: https://blog.csdn.net/weixin_43292547/article/details/109993790

Word representation: describe each word by a set of features so that words with similar meanings end up with similar word vectors.
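To make "similar word vectors" concrete, here is a minimal sketch that compares vectors by cosine similarity. The three-dimensional vectors are invented for this illustration only (they are not taken from any trained model), and the helper name `cosine_similarity` is likewise an assumption.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, made up for this example only.
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

sim_cat_dog = cosine_similarity(toy_vectors["cat"], toy_vectors["dog"])
sim_cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])

# With these toy values, "cat" is closer to "dog" than to "car".
print(sim_cat_dog > sim_cat_car)
```

A good word representation is one where this kind of comparison reflects meaning: related words score high and unrelated words score low.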

Advantages: BERT obtains bidirectional, context-sensitive feature representations (its biggest highlight), and it fits more naturally into a bidirectional language model.

Disadvantages: Introducing [MASK] tokens into the training input creates a mismatch between the pretraining stage and the fine-tuning stage, because [MASK] never appears in fine-tuning data. In addition, the dependencies among the predicted tokens are not considered: each masked token is predicted independently of the others.
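The second disadvantage can be written down explicitly. The notation below is an assumption introduced here, not part of the original post: $\hat{x}$ is the corrupted input, $\bar{x}$ is the set of masked tokens, and $m_t = 1$ when position $t$ is masked. An exact factorization would condition each masked token on the masked tokens already predicted, while the masked-LM objective drops that conditioning.

```latex
% Exact chain-rule factorization over the masked positions:
\log p_\theta(\bar{x} \mid \hat{x})
  = \sum_{t=1}^{T} m_t \, \log p_\theta\!\left(x_t \mid \hat{x},\, \bar{x}_{<t}\right)

% Masked-LM objective: masked tokens are treated as independent given \hat{x}:
\log p_\theta(\bar{x} \mid \hat{x})
  \approx \sum_{t=1}^{T} m_t \, \log p_\theta\!\left(x_t \mid \hat{x}\right)
```

For example, if both "New" and "York" are masked in "New York is a city", the objective scores $\log p(\text{New} \mid \text{is a city}) + \log p(\text{York} \mid \text{is a city})$, so the prediction of "York" cannot use the fact that the previous token is "New".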

BERT: BERT is built by stacking Transformer encoder blocks and is a typical bidirectional encoding model.

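Features of BERT: a Introduces Masked LM (language model training with masked tokens)

a.1 In the raw training text, 15% of the tokens are randomly selected as candidates for masking.
a.2 The data generator does not turn all of the selected tokens into [MASK]; instead, each one is handled in one of the following 3 ways:
a.2.1 With 80% probability, replace the token with the [MASK] token, e.g. my dog is hairy -> my dog is [MASK]
a.2.2 With 10% probability, replace the token with a random word, e.g. my dog is hairy -> my dog is apple
a.2.3 With 10% probability, keep the token unchanged, e.g. my dog is hairy -> my dog is hairy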
…
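The 15% selection and the 80/10/10 rule from a.1 and a.2 can be sketched roughly as follows. This is a simplified illustration, not BERT's actual data generator: the function name `mask_tokens`, the toy vocabulary, and whitespace tokenization are assumptions, and each token is selected independently with probability 0.15 rather than by drawing exactly 15% of the positions.

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, vocab, rng, select_prob=0.15):
    """Return (corrupted, labels).

    labels[i] holds the original token at positions chosen for prediction
    and None everywhere else.
    """
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        # a.1: choose roughly 15% of the positions as prediction targets.
        if rng.random() >= select_prob:
            continue
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            # a.2.1: 80% of the selected tokens become [MASK].
            corrupted[i] = MASK_TOKEN
        elif roll < 0.9:
            # a.2.2: 10% are replaced by a random vocabulary word
            # (which may, by chance, equal the original token).
            corrupted[i] = rng.choice(vocab)
        # a.2.3: the remaining 10% are left unchanged.
    return corrupted, labels

# Hypothetical toy vocabulary and sentence, for illustration only.
toy_vocab = ["my", "dog", "is", "hairy", "apple"]
rng = random.Random(0)
corrupted, labels = mask_tokens("my dog is hairy".split(), toy_vocab, rng)
print(corrupted, labels)
```

Because a selected token is sometimes kept or replaced by a real word, the model cannot assume that only [MASK] positions need predicting, which softens the pretraining/fine-tuning mismatch without removing it.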
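b Introduces Next Sentence Prediction (predicting whether one sentence follows another)
b.1 The goal is to support NLP tasks such as question answering, inference, and sentence-topic relationships.
b.2 All sentences in the training data are selected to take part in this task. Each training example is a sentence pair (A, B):
·50% of the time, B is the sentence that actually follows A in the original text. (Labeled IsNext, a positive sample)
·50% of the time, B is a sentence drawn at random from the original text. (Labeled NotNext, a negative sample)
b.3 On this task, the BERT model reaches 97-98% accuracy on the test set.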
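The pair construction described in b.2 can be sketched as below. This is a minimal illustration under stated assumptions: the function name `make_nsp_pairs` is hypothetical, the input is a single list of sentences, and the negative sentence is drawn from that same list while excluding A itself and the true next sentence.

```python
import random

def make_nsp_pairs(sentences, rng):
    """Build one (A, B, label) example for each sentence that has a successor."""
    if len(sentences) < 3:
        raise ValueError("need at least 3 sentences to draw a negative sample")
    pairs = []
    for i in range(len(sentences) - 1):
        sentence_a = sentences[i]
        if rng.random() < 0.5:
            # Positive sample: B really follows A in the text.
            pairs.append((sentence_a, sentences[i + 1], "IsNext"))
        else:
            # Negative sample: B is a random sentence that is neither A
            # nor the true next sentence.
            candidates = [j for j in range(len(sentences)) if j not in (i, i + 1)]
            j = rng.choice(candidates)
            pairs.append((sentence_a, sentences[j], "NotNext"))
    return pairs

# Hypothetical toy document, for illustration only.
toy_sentences = ["s0", "s1", "s2", "s3", "s4", "s5"]
nsp_pairs = make_nsp_pairs(toy_sentences, random.Random(0))
for sentence_a, sentence_b, label in nsp_pairs:
    print(sentence_a, sentence_b, label)
```

Each resulting pair would then be fed to the model as a binary classification example, with IsNext and NotNext as the two classes.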
