[Paper Reading] Universal Domain Adaptation

Universal Domain Adaptation

SUMMARY@2020/3/27


Motivation

This paper focuses on the special setting of universal domain adaptation, where

  • no prior information about the target label set is provided;
  • only the source domain comes with labeled data.

The following figure illustrates the motivation of this setting.

The following figure summarizes existing adaptation settings and how universal domain adaptation relates to them:

Related Work

This work partly builds on earlier work on partial domain adaptation from Mingsheng Long's group, such as:

  • SAN (Partial Transfer Learning with Selective Adversarial Networks)
    • utilizes multiple domain discriminators with a class-level and instance-level weighting mechanism to achieve per-class adversarial distribution matching.
  • PADA (Partial adversarial domain adaptation)
    • uses only one adversarial network and jointly applies class-level weighting on the source classifier
    • haven’t yet read

and some related work from other groups:

  • IWAN (Importance weighted adversarial nets for partial domain adaptation)
    • constructs an auxiliary domain discriminator to quantify the probability of a source sample being similar to the target domain.
    • haven’t yet read

All of these works build in part on the idea of adversarial networks (GAN) and their domain adaptation counterpart:

  • GAN (Generative Adversarial Nets)
  • DANN (Domain-Adversarial Training of Neural Networks)
    • an adversarial, deep-learning-based domain adaptation method

Challenges / Aims / Contribution

Under the universal domain adaptation setting, the goal is to match the common categories between the source and target domains. The main challenges of this universal problem are:

  • how to deal with the $\bar{C_s}$ part of the source domain (the source-private classes unrelated to the target) so as to circumvent negative transfer to the target domain

  • how to achieve effective domain adaptation between the related part of the source domain and the target domain

  • how to learn a model (feature extractor & classifier) that minimizes the target risk on the common label set $C$

Method Proposed

UAN (Universal Adaptation Network) consists of four parts in the training phase, as the following figure shows.

Feature extractor $F$
  • learns features that align the source and target domains
  • provides good features for the classifier
Label classifier $G$
  • computes the predicted label $\hat y = G(F(\mathrm x)) \in C_s$ (the source domain label set)

  • the classification loss is minimized over the parameters of $F$ and $G$ (a minimal sketch follows this block):
    $E_G = \mathbb E_{(\mathrm{x,y})\sim p}\, L(\mathrm{y}, G(F(\mathrm x)))$
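A minimal sketch of $E_G$ in PyTorch (an assumed framework; `F_net` and `G_net` are hypothetical module names standing in for $F$ and $G$):

```python
import torch.nn.functional as TF

def classification_loss(F_net, G_net, x_s, y_s):
    """E_G: cross-entropy on a labeled source mini-batch (x_s, y_s)."""
    logits = G_net(F_net(x_s))            # \hat y = G(F(x)) over the source label set C_s
    return TF.cross_entropy(logits, y_s)  # E_G = E_{(x,y)~p} L(y, G(F(x)))
```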

Non-adversarial domain discriminator $D'$
  • computes the similarity of each sample $\mathrm x$ to the source domain

    • $\hat d' = D'(\mathrm z) \in [0, 1]$
    • $\hat d' \rightarrow 1$ if $\mathrm x$ is more similar to the source domain
  • the domain classification loss is minimized so that $\hat d'$ becomes a reliable similarity score for every sample from both domains (see the sketch after this block):
    $E_{D'} = -\mathbb E_{\mathrm x\sim p}\log(D'(F(\mathrm x))) - \mathbb E_{\mathrm x\sim q}\log(1 - D'(F(\mathrm x)))$

  • hypothesis: the expected similarity differs across the label-set partitions, and this ordering will be used to weight the adversarial domain discriminator $D$:
    $\mathbb E_{\mathrm x\sim p_{\bar{C_s}}} \hat d' > \mathbb E_{\mathrm x\sim p_{C}} \hat d' > \mathbb E_{\mathrm x\sim q_{C}} \hat d' > \mathbb E_{\mathrm x\sim q_{\bar{C_t}}} \hat d'$

  • $D'$ is not trained adversarially: an adversarial $D'$ would be the same as in DANN, which matches the exact same source and target label spaces and may cause negative transfer in the universal setting.
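A minimal sketch of $E_{D'}$, assuming `D_prime` is a small network ending in a sigmoid so its output $\hat d'$ lies in $[0, 1]$ (all names are hypothetical):

```python
import torch

def non_adversarial_domain_loss(F_net, D_prime, x_s, x_t, eps=1e-6):
    """E_{D'}: D' is trained to output 1 on source features and 0 on target features."""
    d_s = D_prime(F_net(x_s)).clamp(eps, 1 - eps)  # \hat d' for source samples
    d_t = D_prime(F_net(x_t)).clamp(eps, 1 - eps)  # \hat d' for target samples
    return -(torch.log(d_s).mean() + torch.log(1 - d_t).mean())
```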

Adversarial domain discriminator $D$
  • aims to discriminate source from target within the common label set $C$

  • the weighted domain discrimination loss: minimized over $D$ for a good discriminator, and maximized over $F$ so that the feature extractor learns domain-invariant representations:
    $E_{D} = -\mathbb E_{\mathrm x\sim p}\, w^s(\mathrm x)\log(D(F(\mathrm x))) - \mathbb E_{\mathrm x\sim q}\, w^t(\mathrm x)\log(1 - D(F(\mathrm x)))$

  • large weights are assigned to samples from the common label set in both domains, so that the source and target distributions are matched mainly within the common label set.

  • the weights (the "sample-level transferability criterion") should satisfy:
    $\mathbb E_{\mathrm x\sim p_{C}} w^s(\mathrm x) > \mathbb E_{\mathrm x\sim p_{\bar{C_s}}} w^s(\mathrm x)$
    $\mathbb E_{\mathrm x\sim q_{C}} w^t(\mathrm x) > \mathbb E_{\mathrm x\sim q_{\bar{C_t}}} w^t(\mathrm x)$

  • the entropy of the predicted label vector measures the uncertainty of the prediction:
    $\mathbb E_{\mathrm x\sim q_{\bar{C_t}}} H(\hat{\mathrm y}) > \mathbb E_{\mathrm x\sim q_{C}} H(\hat{\mathrm y}) > \mathbb E_{\mathrm x\sim p_{C}} H(\hat{\mathrm y}) > \mathbb E_{\mathrm x\sim p_{\bar{C_s}}} H(\hat{\mathrm y})$

  • combining the domain similarity and the prediction uncertainty of each sample yields a weighting mechanism that discovers the label set shared by both domains and promotes common-class adaptation (a code sketch follows this block):
    $w^s(\mathrm x) = \dfrac{H(\hat{\mathrm y})}{\log|C_s|} - \hat d'(\mathrm x)$
    $w^t(\mathrm x) = \hat d'(\mathrm x) - \dfrac{H(\hat{\mathrm y})}{\log|C_s|}$

    • $H(\hat{\mathrm y})$ is normalized by its maximum value $\log|C_s|$
    • the weights are normalized within each mini-batch during training
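A minimal sketch of the sample-level transferability criterion, assuming `probs` is the softmax output of $G$ and `d_prime` is the $\hat d'$ output of $D'$ for the same batch (helper names are hypothetical):

```python
import math
import torch

def normalized_entropy(probs, eps=1e-8):
    """H(y_hat) / log|C_s|: prediction uncertainty normalized to [0, 1]."""
    H = -(probs * torch.log(probs + eps)).sum(dim=1)
    return H / math.log(probs.size(1))

def source_weight(probs_s, d_prime_s):
    """w^s(x) = H(y_hat)/log|C_s| - d'(x): larger for shared-class source samples."""
    return normalized_entropy(probs_s) - d_prime_s.view(-1)

def target_weight(probs_t, d_prime_t):
    """w^t(x) = d'(x) - H(y_hat)/log|C_s|: larger for shared-class target samples."""
    return d_prime_t.view(-1) - normalized_entropy(probs_t)
```

In practice the weights would also be normalized within each mini-batch before entering $E_D$, as noted above.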

Training

  • the objective can be written as a GAN-style two-player (minimax) problem, but it is implemented end-to-end in the network using the gradient reversal layer from DANN:

    $\max_{D}\min_{F,G}\; E_G - \lambda E_D, \qquad \min_{D'} E_{D'}$
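A hedged end-to-end training sketch, assuming PyTorch, sigmoid outputs for both discriminators, a DANN-style gradient reversal layer, and the loss/weight helpers from the sketches above (all module and function names are hypothetical):

```python
import torch
import torch.nn.functional as TF

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; gradient scaled by -lam in the backward pass.
    This turns the max over D / min over F, G into one end-to-end minimization."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

def train_step(F_net, G_net, D_net, D_prime, opt, x_s, y_s, x_t, lam=1.0, eps=1e-6):
    z_s, z_t = F_net(x_s), F_net(x_t)
    logits_s, logits_t = G_net(z_s), G_net(z_t)

    # E_G: source classification loss
    loss_cls = TF.cross_entropy(logits_s, y_s)

    # E_{D'}: non-adversarial similarity; features are detached so D' trains on its own
    d_s, d_t = D_prime(z_s.detach()), D_prime(z_t.detach())
    loss_dp = TF.binary_cross_entropy(d_s, torch.ones_like(d_s)) + \
              TF.binary_cross_entropy(d_t, torch.zeros_like(d_t))

    # sample-level transferability weights, treated as constants (no gradient through them)
    w_s = source_weight(TF.softmax(logits_s, dim=1).detach(), d_s.detach())
    w_t = target_weight(TF.softmax(logits_t, dim=1).detach(), d_t.detach())

    # weighted adversarial loss E_D: D minimizes it, while the reversal layer makes
    # F maximize it with strength lam (i.e. max_D min_{F,G} E_G - lam * E_D)
    dd_s = D_net(GradReverse.apply(z_s, lam)).view(-1)
    dd_t = D_net(GradReverse.apply(z_t, lam)).view(-1)
    loss_adv = -(w_s * torch.log(dd_s + eps)).mean() \
               - (w_t * torch.log(1 - dd_t + eps)).mean()

    opt.zero_grad()
    (loss_cls + loss_dp + loss_adv).backward()
    opt.step()
```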

Testing

see the figure below:

  • the adversarial discriminator $D$ is no longer used
  • calculate the weight $w^t(\mathrm x)$ for each target sample $\mathrm x$
  • compare $w^t(\mathrm x)$ with a validated threshold to decide whether $\mathrm x$ belongs to the common label set; samples below the threshold are rejected as "unknown" (a minimal sketch follows this list)
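A minimal inference sketch under the same assumptions (`w0` stands for the validated threshold, `target_weight` comes from the earlier sketch, and `-1` is just a placeholder index for the "unknown" class):

```python
import torch

@torch.no_grad()
def predict(F_net, G_net, D_prime, x, w0):
    """Assign a source class to each target sample, or mark it as 'unknown'."""
    z = F_net(x)
    probs = torch.softmax(G_net(z), dim=1)
    w_t = target_weight(probs, D_prime(z))  # sample-level transferability weight
    labels = probs.argmax(dim=1)
    labels[w_t < w0] = -1                   # below threshold -> rejected as "unknown"
    return labels
```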

Experiment

  • $F$ is a pretrained ResNet-50
  • all target samples outside the source label set are grouped into one big "unknown" class
  • UAN performs better than methods designed for the earlier, more restrictive settings