abstract:
This paper proposes a new learning method, Transfer Component Analysis (TCA), to find a good feature representation across domains for domain adaptation. TCA learns transfer components shared by all domains (i.e., components that do not cause distribution change across domains while preserving the intrinsic structure of the original data), so that the distribution difference between domains is reduced in the projected subspace.
I. INTRODUCTION:
Notes:
Our main contribution is on proposing a novel dimensionality reduction method to reduce the distance between domains via projecting data onto a learned transfer subspace.
TCA and its semisupervised extension SSTCA are much more efficient than MMDE and can handle the out-of-sample extension problem.
Excerpts:
This is an important learning problem because labeled data are often difficult to come by, making it desirable to make the best use of any related data available. For example.....
A major computational problem in domain adaptation is how to reduce the difference between the distributions of the source and target domain data. Intuitively, discovering a good feature representation across domains is crucial...
In this paper, we propose a new feature extraction approach, called transfer component analysis (TCA), for domain adaptation. (After reviewing prior work and its shortcomings, the paper presents its own method.)
More specifically, if two domains are related to each other, there may exist several common components (or latent variables) underlying them.
II. PREVIOUS WORKS AND PRELIMINARIES
Notes: In Section II, we first introduce the domain adaptation problem and traditional dimensionality reduction methods, and describe the Hilbert space embedding for distances and dependence measures between distributions.
A. Domain Adaptation
The main difference between these methods and our proposed method is that we aim to match data distributions between domains in a latent space, where data properties can be preserved, instead of matching them in the original feature space.
Excerpt: The key assumption in most domain adaptation methods is that P ≠ Q, but P(Ys|Xs) = P(Yt|Xt).
B. Hilbert Space Embedding of Distributions
In 2006, Borgwardt et al. [1] proposed a distribution distance criterion based on the Reproducing Kernel Hilbert Space (RKHS): Maximum Mean Discrepancy (MMD).
The empirical MMD between source samples Xs = {x_s1, ..., x_sn1} and target samples Xt = {x_t1, ..., x_tn2} is

Dist(Xs, Xt) = ‖ (1/n1) Σ_{i=1}^{n1} φ(x_si) − (1/n2) Σ_{i=1}^{n2} φ(x_ti) ‖_H,

where ‖·‖_H is the RKHS norm and φ(·) is the kernel-induced feature map. By minimizing this quantity, a mapping that brings the two domain distributions close together can be obtained.
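As a sketch, the empirical (squared) MMD can be computed purely from kernel evaluations via the kernel trick. The RBF kernel and the bandwidth `sigma` below are illustrative choices for this note, not prescribed by the paper:

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Pairwise RBF kernel: k(a, b) = exp(-||a - b||^2 / (2 sigma^2))
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2(Xs, Xt, sigma=1.0):
    # Squared MMD between the empirical distributions of Xs and Xt in the RKHS:
    # mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    Kss = rbf_kernel(Xs, Xs, sigma)
    Ktt = rbf_kernel(Xt, Xt, sigma)
    Kst = rbf_kernel(Xs, Xt, sigma)
    return Kss.mean() + Ktt.mean() - 2 * Kst.mean()
```

Identical sample sets give an MMD of zero, and the value grows as the two empirical distributions drift apart.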
C. Embedding Using HSIC
III. TCA
Excerpt:
As mentioned in Section II-A, most domain adaptation methods assume that P ≠ Q, but P(Ys|Xs) = P(Yt|Xt). However, in many real-world applications, the conditional probability P(Y|X) may also change across domains due to noisy or dynamic factors underlying the observed data.
A. Minimizing Distance Between P(φ(Xs)) and P(φ(Xt))
Instead of finding the nonlinear transformation φ explicitly, we first revisit a dimensionality reduction-based domain adaptation method called MMDE.
The inputs are the two feature matrices. First compute the L and H matrices; then choose a common kernel (e.g., a linear or Gaussian kernel) to compute K; next, take the leading m eigenvectors of ((KLK + μI)^-1)KHK — these eigenvectors form the solution W. The projection W^T K then gives the dimensionality-reduced source and target data, on which conventional machine-learning methods can be applied.
Summary:
The final optimization objective of TCA is:

min_W  tr(W^T K L K W) + μ tr(W^T W)   s.t.  W^T K H K W = I_m,

where H = I − (1/(n1+n2)) 1 1^T is the centering matrix, L encodes the MMD weights, μ > 0 is a trade-off parameter, and W is the matrix being solved for.
Once the transfer-component matrix W has been obtained from the TCA procedure, it can be used to project data from both the source and target domains into a common latent space. Standard machine-learning methods can then be applied to the transformed data for tasks such as classification or regression: the transformed data serve as input to any learning algorithm, so a classifier or regression model trained on the source domain can be applied to the target domain.
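For instance, given projected source data `Zs` with labels `ys` and projected target data `Zt` (all arrays below are hypothetical), even a plain 1-nearest-neighbour rule in the shared subspace suffices to transfer labels:

```python
import numpy as np

def nn_classify(Zs, ys, Zt):
    # 1-NN in the latent space: each target point takes the label
    # of its closest source point
    d = np.sum((Zt[:, None, :] - Zs[None, :, :]) ** 2, axis=2)
    return ys[np.argmin(d, axis=1)]

# hypothetical 1-D projections of two source points and two target points
Zs = np.array([[0.0], [1.0]])
ys = np.array([0, 1])
Zt = np.array([[0.1], [0.9]])
yt_pred = nn_classify(Zs, ys, Zt)
```

Because source and target now share one coordinate system, distance comparisons between domains are meaningful, which is exactly what training in the original feature spaces could not guarantee.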