Paper title: Adversarial Learning with Contextual Embeddings for Zero-resource Cross-lingual Classification and NER
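Because the task is cross-lingual, the goal is for Multilingual BERT to produce the corresponding embedding when it is given input in another language: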
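Adding adversarial learning: as the figure above shows, the BERT parameters are updated to maximize $p_L^D$ (the probability of Non-English) and to minimize $p_{En}^D$ (the probability of English), while the Discriminator parameters are updated in the opposite direction, minimizing $p_L^D$ and maximizing $p_{En}^D$.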
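The opposing updates can be illustrated with a minimal sketch. It rests on assumptions that are not stated in these notes: that the Discriminator is a binary classifier, so $p_L^D = 1 - p_{En}^D$, and that the adversarial signal for BERT is the usual label-flipped cross-entropy. The function names `discriminator_loss` and `encoder_loss` are hypothetical and are not taken from the paper. Under these assumptions, the description above corresponds to the case of an English input.

```python
import math


def discriminator_loss(p_en: float, is_english: bool) -> float:
    """Cross-entropy for the Discriminator.

    p_en is the Discriminator's predicted probability of English (0 < p_en < 1).
    """
    p_l = 1.0 - p_en  # assumption: p_L^D = 1 - p_En^D
    return -math.log(p_en if is_english else p_l)


def encoder_loss(p_en: float, is_english: bool) -> float:
    """Adversarial loss for BERT: the same cross-entropy with the label flipped."""
    return discriminator_loss(p_en, not is_english)


if __name__ == "__main__":
    # For an English input, raising p_en lowers the Discriminator loss
    # and raises the BERT (encoder) loss, so the two updates pull in
    # opposite directions.
    for p_en in (0.1, 0.5, 0.9):
        d = discriminator_loss(p_en, is_english=True)
        g = encoder_loss(p_en, is_english=True)
        print(f"p_en={p_en:.1f}  discriminator={d:.3f}  encoder={g:.3f}")
```

For an English input the encoder loss is $-\log p_L^D$, so minimizing it maximizes $p_L^D$ and minimizes $p_{En}^D$, while the Discriminator minimizes $-\log p_{En}^D$. In a real training loop the two losses would be applied alternately to the two sets of parameters; that part is omitted here.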
The authors' conclusion:
We observed that adversarial training moves the embeddings of English text and their non-English translations closer together.