BERT
- Published by Google in 2018
- Bidirectional Encoder Representations from Transformers
- Two phases: pre-training and fine-tuning (a fine-tuning sketch follows this list)
- Uses the Transformer, proposed in Attention Is All You Need (Google, 2017), in place of RNNs
- BERT combines ideas from several earlier models: (1) predicting a word from its context - Word2Vec CBOW; (2) bidirectional language modeling - ELMo; (3) the Transformer instead of an RNN - GPT (Generative Pre-Training)
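As an illustration of the fine-tuning phase, here is a minimal sketch using the Hugging Face transformers library (the library, checkpoint name, and the two-class task are assumptions of this example, not part of the original notes): a small classification head is placed on top of the pre-trained encoder and everything is trained end-to-end on the downstream labels.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load the pre-trained encoder and attach a fresh 2-class classification head
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# One labeled example for the downstream task (toy data, illustrative only)
inputs = tokenizer("the man went to the store", return_tensors="pt")
labels = torch.tensor([1])

# Forward pass returns the task loss; backprop updates BERT weights plus the new head
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
```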
BERT - Masked Language Model
BERT is pre-trained with a masked language model objective, inspired by the Cloze task.
Randomly select 15% of the tokens in each sequence for prediction (acting like dropout, so the model cannot over-rely on particular tokens); each selected token is then handled as follows (a sketch follows this list):
- 80% of the time, replace it with "[MASK]"
- 10% of the time, keep the original word
- 10% of the time, replace it with a random word
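A minimal sketch of this masking procedure, assuming whitespace-tokenized input and a toy vocabulary for the random replacements (both are illustrative assumptions, not BERT's actual WordPiece pipeline):

```python
import random

MASK_TOKEN = "[MASK]"
TOY_VOCAB = ["the", "man", "store", "milk", "penguin", "birds"]  # stand-in for the real vocabulary

def mask_tokens(tokens, mask_prob=0.15, rng=random.Random(0)):
    """Select ~15% of positions; of those, 80% -> [MASK], 10% -> random word, 10% -> unchanged."""
    out = list(tokens)
    labels = [None] * len(tokens)            # None means "not selected, nothing to predict"
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok                  # the model is trained to predict the original token here
            r = rng.random()
            if r < 0.8:
                out[i] = MASK_TOKEN          # 80%: replace with [MASK]
            elif r < 0.9:
                out[i] = rng.choice(TOY_VOCAB)  # 10%: replace with a random word
            # remaining 10%: keep the original token unchanged
    return out, labels

masked, labels = mask_tokens("the man went to the store to buy a gallon of milk".split())
```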
BERT - Next Sentence Prediction
To learn sentence relationships, BERT is also trained to predict whether the second sentence actually follows the first (IsNext) or is a random sentence from the corpus (NotNext):
Input = [CLS] the man went to [MASK] store [SEP] he bought a gallon [MASK] milk [SEP]
Label = IsNext
Input = [CLS] the man [MASK] to the store [SEP] penguin [MASK] are flight ##less birds [SEP]
Label = NotNext
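A minimal sketch of how such sentence pairs could be assembled (the toy corpus and whitespace splitting are assumptions of this example; the real pipeline samples from a large document corpus and applies WordPiece tokenization):

```python
import random

def make_nsp_example(sent_a, next_sent, corpus_sentences, rng=random.Random(0)):
    """Build one next-sentence-prediction example:
    50% of the time keep the true next sentence (IsNext),
    otherwise substitute a random corpus sentence (NotNext)."""
    if rng.random() < 0.5:
        sent_b, label = next_sent, "IsNext"
    else:
        sent_b, label = rng.choice(corpus_sentences), "NotNext"
    tokens = ["[CLS]"] + sent_a.split() + ["[SEP]"] + sent_b.split() + ["[SEP]"]
    return tokens, label

corpus = ["penguin are flightless birds", "it was raining outside"]
tokens, label = make_nsp_example("the man went to the store",
                                 "he bought a gallon of milk",
                                 corpus)
```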