A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification
PDF:https://aclanthology.org/D19-6101.pdf
2019 EMNLP
First work to combine feature-space data augmentation (upsampling, random perturbation, extrapolation, linear delta, delta-encoder, CVAE) with fine-tuning.
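The three simplest techniques the paper names operate directly on feature vectors rather than raw text. A minimal sketch (the formulas below follow the common definitions of perturbation, extrapolation, and linear delta; the noise scale and interpolation factor are illustrative assumptions, not the paper's tuned values):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(x, sigma=0.1):
    """Random perturbation: add Gaussian noise to a feature vector."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def extrapolate(x_j, x_k, lam=0.5):
    """Extrapolation: push x_j away from a same-class sample x_k."""
    return (x_j - x_k) * lam + x_j

def linear_delta(x_i, x_j, x_k):
    """Linear delta: apply the difference of two same-class samples
    (x_j - x_k) to a third sample x_i."""
    return (x_j - x_k) + x_i

# Toy example: three same-class feature vectors of dimension 4.
x_i, x_j, x_k = rng.normal(size=(3, 4))
augmented = [perturb(x_i), extrapolate(x_j, x_k), linear_delta(x_i, x_j, x_k)]
```

The delta-encoder and CVAE variants replace these fixed formulas with learned generators, but the augmented features are consumed by the downstream classifier in the same way.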
Diversity Features Enhanced Prototypical Network for Few-Shot Intent Detection
PDF:https://www.ijcai.org/proceedings/2022/0617.pdf
2022 IJCAI
DFEPN uses a diversity-generator module to produce diverse features for each sample in the hidden space, then mixes these generated features with the original support-set vectors to obtain a more suitable prototype vector for each class.
MEDA: Meta-Learning with Data Augmentation for Few-Shot Text Classification
PDF:https://www.ijcai.org/proceedings/2021/541
2021 IJCAI
MEDA consists of two modules, a ball generator and a meta-learner, which are trained jointly. The ball generator increases the number of samples per class by generating additional ones, so that the meta-learner can be trained on both the original and the augmented samples.
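A minimal sketch of a ball generator, assuming the simplest reading of the idea: draw new points uniformly inside a ball around the class mean in embedding space (the radius and the uniform-in-ball sampling scheme are illustrative assumptions; the paper's generator is trained jointly with the meta-learner):

```python
import numpy as np

rng = np.random.default_rng(0)

def ball_generator(class_samples, n_new, radius=0.5):
    """Sample n_new points uniformly inside a ball of the given radius
    centered at the class mean (hypothetical sketch)."""
    center = class_samples.mean(axis=0)
    d = class_samples.shape[1]
    # Random unit directions.
    dirs = rng.normal(size=(n_new, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # Radii drawn so that points are uniform in the d-ball.
    r = radius * rng.random(n_new) ** (1.0 / d)
    return center + dirs * r[:, None]

# Toy 5-shot class in a 3-d embedding space, augmented with 10 new points.
shots = rng.normal(size=(5, 3))
new_points = ball_generator(shots, n_new=10, radius=0.5)
```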
PROTAUGMENT: Unsupervised diverse short-texts paraphrasing for intent detection meta-learning
PDF:https://arxiv.org/pdf/2105.12995.pdf
2021 ACL
PROTAUGMENT is a semi-supervised method. It first applies data augmentation to the unlabeled data, producing M paraphrases of each unlabeled sentence; the m-th paraphrase of x is denoted \tilde{x}^m. Then, given the unlabeled sentences and their paraphrases, a fully unsupervised loss is computed. Finally, the supervised loss \bar{L} (a prototypical-network loss on the labeled data) is combined with the unsupervised loss \tilde{L}, and backpropagation updates the model parameters.
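The combination of the two losses can be sketched as below. Note the unsupervised term here is a simplified consistency loss that pulls an unlabeled sentence's embedding toward its paraphrase embeddings; PROTAUGMENT's actual unsupervised objective differs in detail, and the weighting factor `alpha` is an illustrative assumption:

```python
import numpy as np

def supervised_proto_loss(query_emb, prototypes, label):
    """Prototypical-network loss: cross-entropy over softmax of
    negative distances to the class prototypes."""
    dists = np.linalg.norm(prototypes - query_emb, axis=1)
    logits = -dists
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

def consistency_loss(emb, paraphrase_embs):
    """Simplified unsupervised term: mean squared distance between an
    unlabeled sentence embedding and its M paraphrase embeddings."""
    return float(np.mean(np.linalg.norm(paraphrase_embs - emb, axis=1) ** 2))

def total_loss(sup_loss, unsup_loss, alpha=1.0):
    """Combine supervised (bar-L) and unsupervised (tilde-L) losses."""
    return sup_loss + alpha * unsup_loss

# Toy example: 2 prototypes equidistant from the query in 2-d space.
prototypes = np.array([[1.0, 0.0], [-1.0, 0.0]])
sup = supervised_proto_loss(np.zeros(2), prototypes, label=0)
unsup = consistency_loss(np.ones(3), np.ones((4, 3)))
loss = total_loss(sup, unsup, alpha=0.5)
```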
Other data augmentation papers:
STraTA: Self-Training with Task Augmentation for Better Few-shot Learning
2021 EMNLP
Dynamic Augmentation Data Selection for Few-shot Text Classification
2022 EMNLP
Unsupervised Data Augmentation with Naive Augmentation and without Unlabeled Data
2021 EMNLP
MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER
2022 ACL
PromDA: Prompt-based Data Augmentation for Low-Resource NLU Tasks
2022 ACL