22 - Data Augmentation | Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition

Category: data augmentation

Somewhat similar to the paper "Open Relation and Event Type Discovery with Type Abstraction".


Background

In few-shot scenarios, the goal is to increase the size and diversity of the training data.

Recent studies: (1) rule-based methods; (2) leveraging data from high-resource tasks.

Core idea: change the style-related attributes of the text while preserving its semantics.
The paper formulates the task as a paraphrase-generation problem.

1. Model

Depending on whether parallel data is available, the paper proposes two ways to solve the problem.
For parallel data, a paraphrase-generation model is used to bridge the gap between the source and target styles.
For non-parallel data, a cycle-consistent reconstruction re-paraphrases the paraphrased sentences back into their original style (a bit convoluted).

Model structure

Paraphrase generation

Two loss functions:
L_pg: a loss over the NER labels, checking whether the model's predicted BIO tags are correct.
L_adv: an adversarial loss measuring the similarity between the input and its paraphrase, as judged by a discriminator.
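As a rough illustration (not the authors' code: the shapes, the three-tag BIO set, and the weight lambda are all made up here), the two losses could be combined like this:

```python
# Minimal sketch of the two training losses: a token-level cross-entropy
# over BIO tags (L_pg) and a binary adversarial loss that pushes the
# discriminator's score for the paraphrase toward 1 (L_adv).
import numpy as np

def tag_cross_entropy(tag_probs, gold_tags):
    """L_pg: negative log-likelihood of the gold BIO tag at each token."""
    return -np.mean([np.log(tag_probs[i, t]) for i, t in enumerate(gold_tags)])

def adversarial_loss(d_score_paraphrase):
    """L_adv: the generator wants the discriminator to judge the
    paraphrase as matching the input (score close to 1)."""
    return -np.log(d_score_paraphrase)

# Toy example: 3 tokens, 3 tags (O=0, B=1, I=2); lam is an assumed weight.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.7, 0.2],
                  [0.2, 0.2, 0.6]])
gold = [0, 1, 2]
lam = 0.5
total = tag_cross_entropy(probs, gold) + lam * adversarial_loss(0.9)
print(round(float(total), 4))  # -> 0.4162
```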


Cycle-consistent reconstruction

Process:

First, the generator Gθ generates the paraphrase ỹ_cycle of the input sentence x_cycle concatenated with a prefix.
Second, the paraphrase ỹ_cycle is concatenated with a different prefix as input to the generator Gθ, which transfers the paraphrase back to the original sentence ŷ_cycle.
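A toy sketch of this cycle, with a made-up word-substitution table standing in for the generator Gθ and the prefix selecting the transfer direction (the vocabulary and prefixes are invented for illustration):

```python
# Transfer a sentence with one prefix, transfer the paraphrase back with
# the opposite prefix, and compare the reconstruction to the original.
FORMAL_TO_INFORMAL = {"photograph": "pic", "television": "tv", "advertisement": "ad"}
INFORMAL_TO_FORMAL = {v: k for k, v in FORMAL_TO_INFORMAL.items()}

def generate(prefix, sentence):
    """Stand-in for the generator: the prefix selects the style direction."""
    table = FORMAL_TO_INFORMAL if prefix == "informal:" else INFORMAL_TO_FORMAL
    return " ".join(table.get(w, w) for w in sentence.split())

x = "the photograph aired on television"
y_tilde = generate("informal:", x)    # forward transfer
y_hat = generate("formal:", y_tilde)  # transfer back to the original style
# Word-level mismatch count as a stand-in for the reconstruction loss:
cycle_loss = sum(a != b for a, b in zip(x.split(), y_hat.split()))
print(y_tilde)     # the pic aired on tv
print(cycle_loss)  # 0: perfect reconstruction
```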

The loss contains two parts.

2. Data Selection

Even with an effective architecture, the generated sentences may still be unreliable, suffering from degenerate repetition and incoherent gibberish (Holtzman et al., 2020; Welleck et al., 2020). To mitigate this, the paper further performs data selection with the following metrics.

  • Consistency: the confidence score of a pretrained style classifier, measuring how well the generated sentence matches the target style.
  • Adequacy: the confidence score of a pretrained NLU model for how much of the original semantics the generated sentence preserves.
  • Fluency: the confidence score of a pretrained NLU model indicating how fluent the generated sentence is.
  • Diversity: the character-level edit distance between the original and the generated sentence.
    For each sentence, k=10 candidates are over-generated. The metrics above are computed (see Appendix C for details), and a weighted score over them is assigned to each candidate. All candidates are then ranked by this score, and the best one is selected to train the NER system.
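The selection step could be sketched as follows. The weights and the candidates' classifier scores are hypothetical; only the diversity metric (character-level edit distance) is actually computed from the text:

```python
# Each candidate gets a weighted sum of consistency, adequacy, fluency
# (assumed scores from pretrained models) and diversity (normalized
# character-level edit distance); the top-ranked candidate is kept.
def edit_distance(a, b):
    """Character-level Levenshtein distance (the diversity metric)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

def select_best(original, candidates, weights=(0.4, 0.3, 0.2, 0.1)):
    def score(cand):
        text, consistency, adequacy, fluency = cand
        diversity = edit_distance(original, text) / max(len(original), len(text))
        parts = (consistency, adequacy, fluency, diversity)
        return sum(w * p for w, p in zip(weights, parts))
    return max(candidates, key=score)[0]

# Hypothetical candidates as (text, consistency, adequacy, fluency):
original = "she did not attend the meeting"
cands = [
    ("she didn't go to the meeting", 0.9, 0.9, 0.9),
    ("she she she meeting meeting", 0.3, 0.2, 0.1),  # degenerate repetition
]
print(select_best(original, cands))  # -> she didn't go to the meeting
```

The degenerate candidate loses because its consistency, adequacy, and fluency scores are low; diversity alone cannot rescue it under these weights.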

Experiments

  • Effect of different data-augmentation strategies.

The source data covers five domains in the formal style: broadcast conversation (BC), broadcast news (BN), magazine (MZ), newswire (NW), and web data (WB), while the target data covers only the social media (SM) domain in the informal style.


  • Impact of different factors on performance (ablation study).

Summary

Doesn't this feel overly complicated? Compared with doing data augmentation via question generation, this method additionally has to distinguish parallel from non-parallel data and build a separate model for each case.
