Contrastive learning has been getting a lot of attention lately, so I'm doing a focused read of the related papers. I'm opening a big placeholder post here and will fill it in over time.
The core idea is to pull two positive samples closer together and push negative samples farther apart.
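To make "pull positives closer, push negatives apart" concrete, here is a minimal sketch of an in-batch InfoNCE-style contrastive loss, roughly the form that SimCSE and ConSERT build on; the function name and the temperature value are my own illustrative choices.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor_emb, positive_emb, temperature=0.05):
    """In-batch InfoNCE-style contrastive loss.

    anchor_emb, positive_emb: [batch_size, dim] tensors where anchor_emb[i]
    and positive_emb[i] form a positive pair; every other sentence in the
    batch serves as a negative for anchor i.
    """
    anchor = F.normalize(anchor_emb, dim=-1)
    positive = F.normalize(positive_emb, dim=-1)
    # Cosine similarity between every anchor and every candidate, scaled by temperature.
    sim = anchor @ positive.t() / temperature  # [batch_size, batch_size]
    # Diagonal entries are the positive pairs; cross entropy pulls them up
    # and pushes the off-diagonal (negative) similarities down.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)
```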
1. SimCSE
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Source: EMNLP 2021
Link: https://aclanthology.org/2021.emnlp-main.552/
A new paper from Danqi Chen's group at EMNLP 2021. It is genuinely clever and a case of great simplicity; what a bright mind it takes to come up with such an elegant idea. The method is simple yet robust, and the experiments and analysis are very thorough.
- Clever point 1: use dropout to generate positive examples of a text (a minimal sketch follows this subsection)
- Clever point 2: reframe the three-way natural language inference task (entailment, neutral, contradiction): two sentences with an entailment relation become positives for each other, and two sentences with a contradiction relation become negatives.
Details are written up in: [Paper Reading - Contrastive Learning] SimCSE: Simple Contrastive Learning of Sentence Embeddings
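A minimal sketch of the dropout-as-augmentation idea using Hugging Face Transformers; the model choice, [CLS] pooling, and the reuse of the info_nce_loss sketch above are my own illustrative assumptions, not the official SimCSE implementation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.train()  # keep dropout active so the two forward passes differ

sentences = ["A man is playing guitar.", "The weather is nice today."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

# Encode the same batch twice: different dropout masks give two slightly
# different "views" of each sentence, which act as a positive pair.
emb1 = encoder(**batch).last_hidden_state[:, 0]  # [CLS] token embedding
emb2 = encoder(**batch).last_hidden_state[:, 0]

loss = info_nce_loss(emb1, emb2)  # loss sketch from the introduction above
loss.backward()
```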
2. ConSERT
ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
Source: ACL 2021
Link: https://aclanthology.org/2021.acl-long.393/
- Fine-tunes the pre-trained model on unlabeled data, so that the embeddings transfer and adapt to downstream tasks;
- Studies several data augmentation strategies within the contrastive learning framework: adversarial attack, token shuffling, cutoff, and dropout. All of them operate at the embedding layer, which is a big step forward compared with earlier work (see the sketch after this subsection).
Details are written up in: [Paper Reading - Contrastive Learning] ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
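A minimal sketch of two of these embedding-level augmentations, token shuffling and token cutoff; the function names, signatures, and the cutoff rate are illustrative assumptions rather than the authors' implementation. Two differently augmented views of the same batch would then form a positive pair for the contrastive loss.

```python
import torch

def token_shuffle(token_emb, attention_mask):
    """Randomly permute the order of the non-padding token embeddings in each sentence."""
    out = token_emb.clone()
    for i in range(token_emb.size(0)):
        length = int(attention_mask[i].sum())
        perm = torch.randperm(length)
        out[i, :length] = token_emb[i, perm]
    return out

def token_cutoff(token_emb, attention_mask, cutoff_rate=0.15):
    """Zero out a random subset of token embeddings (token-level cutoff)."""
    keep = (torch.rand(token_emb.shape[:2], device=token_emb.device) > cutoff_rate).float()
    keep = keep * attention_mask  # never "keep" padding positions
    return token_emb * keep.unsqueeze(-1)
```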
3. Mixsum
Constructing Contrastive Samples via Summarization for Text Classification with Limited Annotations
Source: Findings of EMNLP 2021
Link: https://aclanthology.org/2021.findings-emnlp.118/
The setting is text classification with limited labeled data, where contrastive learning is used to improve the task.
- Uses text summarization as a data augmentation method to generate positive and negative samples (see the sketch after this subsection)
- Proposes Mixsum, which constructs new samples by mixing original samples
Details are written up in: [Paper Reading - Contrastive Learning] Constructing Contrastive Samples via Summarization for Text Classification
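A minimal sketch of the summarization-as-augmentation step; the summarizer checkpoint and the placeholder documents are my own illustrative assumptions, and the Mixsum mixing itself is not reproduced here.

```python
from transformers import pipeline

# Illustrative off-the-shelf summarizer; not necessarily the one used in the paper.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

documents = [
    "A long news article about the stock market ... (placeholder text)",
    "A long review of a newly released smartphone ... (placeholder text)",
]

# Each document and its own summary form a positive pair; summaries of the
# other documents in the batch can serve as negatives for contrastive training.
summaries = [out["summary_text"]
             for out in summarizer(documents, max_length=60, min_length=10)]
positive_pairs = list(zip(documents, summaries))
```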
- To-read list:
  - Self-Guided Contrastive Learning for BERT Sentence Representations
    Source: ACL 2021
    Link: https://aclanthology.org/2021.acl-long.197/
  - CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding
    Source: ACL 2021
    Link: https://aclanthology.org/2021.acl-long.181/
  - DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
    Source: ACL 2021
    Link: https://aclanthology.org/2021.acl-long.72/
  - CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction
    Source: ACL 2021
    Link: https://aclanthology.org/2021.acl-long.483/
  - xMoCo: Cross Momentum Contrastive Learning for Open-Domain Question Answering
    Source: ACL 2021
    Link: https://aclanthology.org/2021.acl-long.477/
  - Modeling Discriminative Representations for Out-of-Domain Detection with Supervised Contrastive Learning
    Source: ACL 2021
    Link: https://aclanthology.org/2021.acl-short.110/