Paper: Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition

YingJingh

Article link: https://blog.csdn.net/Hekena/article/details/126713563

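An NER dataset built by crowdsourcing can be viewed as a domain adaptation problem: each annotator is treated as one domain, so the task becomes adaptation across multiple domains.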

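My take: the idea is fairly novel, since it connects crowdsourcing with the domain adaptation problem.
One caveat is that it is somewhat removed from practice: in reality, not every sentence can be labeled by many people, whereas in the paper's experiments on the CoNLL03 dataset roughly 47 people annotate a sentence.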

Introduction

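In crowdsourcing learning models, there are two ways to reduce the noise in crowd annotations:
(1) majority voting;
(2) reducing the distance between the crowdsourced data and the gold-standard annotation (usually the expert-annotated dataset).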
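
To make option (1) concrete, here is a minimal sketch of token-level majority voting over several annotators' label sequences. The function name `majority_vote`, the tie-breaking rule, and the toy labels are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def majority_vote(annotations):
    """Aggregate token-level labels from several annotators by majority vote.

    annotations: list of label sequences, one per annotator, all the same length.
    Ties go to the label that appears first among the tied ones (an assumption).
    """
    if not annotations:
        return []
    length = len(annotations[0])
    if any(len(seq) != length for seq in annotations):
        raise ValueError("all label sequences must have the same length")
    voted = []
    for token_labels in zip(*annotations):
        # most_common keeps first-seen order among labels with equal counts.
        voted.append(Counter(token_labels).most_common(1)[0][0])
    return voted


# Hypothetical crowd labels for one four-token sentence.
crowd = [
    ["B-PER", "I-PER", "O", "B-LOC"],
    ["B-PER", "O", "O", "B-LOC"],
    ["B-PER", "I-PER", "O", "O"],
]
print(majority_vote(crowd))  # ['B-PER', 'I-PER', 'O', 'B-LOC']
```

Voting per token ignores label-sequence consistency (it can produce an `I-` tag without a preceding `B-` tag), which is one reason simple voting is usually only a baseline.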

Main idea

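Each annotator is treated as one domain, so crowdsourcing learning is essentially almost a multi-source domain adaptation problem ("We treat each annotator as one domain specifically, and then crowdsourcing learning is essentially almost a multi-source domain adaptation problem.").
Two settings are considered: (1) supervised crowdsourcing learning as domain adaptation, when expert annotations are available; (2) unsupervised crowdsourcing learning, when no expert annotations are available.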

Model

The PGN generates the adapter parameters for each domain (i.e., each annotator).

A Parameter Generation Network (PGN) (Platanios et al., 2018; Jia et al., 2019) is used to produce adapter parameters dynamically from the input annotator.

The adapter parameters are learnable.
The transformer parameters are frozen.

Model components

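- Word Representation: Adapter◦BERT (Houlsby et al., 2019), where two extra adapter modules are inside each transformer layer. The input is a sentence $x = w_1 \cdots w_n$.
- Annotator Switcher: the key idea is to use a Parameter Generation Network (PGN) (Platanios et al., 2018; Jia et al., 2019) to produce adapter parameters dynamically from the input annotator. The PGN module dynamically generates V for the adapters according to the annotator input.
- After the representation of x is obtained, it is further encoded by a BiLSTM.
- A CRF layer then produces the label sequence.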
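
The PGN idea described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, under stated assumptions: the generated parameters are taken to be a linear contraction of a shared learnable tensor `theta` with the annotator embedding, the adapter is a bias-free bottleneck with a `tanh` nonlinearity and a residual connection, and every name and dimension (`generate_adapter_params`, `adapter_forward`, `hidden`, `bottleneck`) is hypothetical rather than the paper's implementation.

```python
import math
import random


def generate_adapter_params(theta, annotator_emb, hidden, bottleneck):
    """Contract a shared tensor with an annotator embedding to get adapter weights.

    theta: list of P rows of length d, with P = 2 * hidden * bottleneck.
    Returns (down, up): down is bottleneck x hidden, up is hidden x bottleneck.
    """
    # Each generated parameter is a dot product of one theta row with the embedding.
    flat = [sum(t * e for t, e in zip(row, annotator_emb)) for row in theta]
    split = hidden * bottleneck
    down = [flat[i * hidden:(i + 1) * hidden] for i in range(bottleneck)]
    up_flat = flat[split:]
    up = [up_flat[i * bottleneck:(i + 1) * bottleneck] for i in range(hidden)]
    return down, up


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def adapter_forward(h, down, up):
    """Bottleneck adapter with a residual connection: h + up(tanh(down(h)))."""
    z = [math.tanh(v) for v in matvec(down, h)]
    return [hi + ui for hi, ui in zip(h, matvec(up, z))]


random.seed(0)
hidden, bottleneck, emb_dim = 4, 2, 3
num_params = 2 * hidden * bottleneck
# Shared tensor: learned in a real model, random here for illustration.
theta = [[random.uniform(-0.1, 0.1) for _ in range(emb_dim)] for _ in range(num_params)]
# Hypothetical annotator embeddings.
annotator_embs = {"annotator_1": [1.0, 0.0, 0.0], "annotator_2": [0.0, 1.0, 0.0]}
h = [0.5, -0.2, 0.1, 0.3]  # one hidden state from a frozen transformer layer

outputs = {}
for name, emb in annotator_embs.items():
    down, up = generate_adapter_params(theta, emb, hidden, bottleneck)
    outputs[name] = adapter_forward(h, down, up)
print(outputs)
```

The point of the design is that the same hidden state passes through different adapter weights depending on which annotator embedding is switched in, while the transformer weights stay untouched.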

Expert representation $e^{(expert)}$

(1) In the supervised setting, it is learned directly by the model.
(2) In the unsupervised setting, it is estimated as the centroid of the annotators' embeddings.
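
For case (2), the centroid estimate can be illustrated with a short sketch. It assumes an unweighted element-wise mean over all annotator embeddings; the function name `centroid` and the toy vectors are hypothetical, and the paper may define the centroid differently.

```python
def centroid(embeddings):
    """Element-wise mean of a list of equal-length vectors."""
    if not embeddings:
        raise ValueError("need at least one annotator embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


# Hypothetical learned embeddings for three annotators.
annotator_embs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
expert_emb = centroid(annotator_embs)
print(expert_emb)  # [1.0, 1.0]
```

At test time, the estimated expert embedding would be fed to the PGN in place of an individual annotator's embedding.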