A survey on domain analysis theory
Domain Adaptation: different distributions, same task.
The earliest work is Ben-David et al. (2007), where the task is binary classification.
Preliminary knowledge
A rigorous model of domain adaptation
domain
Here, "domain" does not mean the domain of a function; it refers to a specific pair of a distribution and a labeling function.
Take binary classification as an example (the binary setting is reflected in the hypothesis $h: X \to \{0, 1\}$).
model
A bound relating the source and target error
Theorem 1
The $L_1$ divergence is not a good choice, so the $\mathcal{H}$-divergence below is constructed instead.
The $\mathcal{H}$-divergence
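For reference, the $\mathcal{H}$-divergence between two distributions $\mathcal{D}$ and $\mathcal{D}'$ can be written as follows (a sketch of the standard definition; the notation may differ slightly from the paper's):

$$ d_{\mathcal{H}}(\mathcal{D}, \mathcal{D}') = 2 \sup_{h \in \mathcal{H}} \left| \Pr_{x \sim \mathcal{D}}[h(x) = 1] - \Pr_{x \sim \mathcal{D}'}[h(x) = 1] \right| $$

Intuitively, it measures how well the best hypothesis in $\mathcal{H}$ can tell the two distributions apart.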
The empirical $\mathcal{H}$-divergence between two finite samples provides a statistical upper bound on the true $\mathcal{H}$-divergence.
This shows that the empirical $\mathcal{H}$-divergence between two samples $D$ and $D'$ converges uniformly to the true $\mathcal{H}$-divergence when the hypothesis class $\mathcal{H}$ has finite VC dimension.
Lemma 2
This lemma directly gives a formula for computing the $\mathcal{H}$-divergence: we need to find the hypothesis that attains the minimum error on the task of distinguishing source instances from target instances.
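That recipe, train a classifier to separate the two samples and convert its best error into a divergence, can be sketched for a toy 1-D case. This is an illustration only: the hypothesis class of threshold classifiers, the sample names `S`/`T`, and the `2 * (1 - 2 * err)` conversion follow the common "proxy distance" formulation, not necessarily the paper's exact constants.

```python
# Sketch: empirical H-divergence between two 1-D samples, using a tiny
# hypothesis class of threshold classifiers h_t(x) = 1[x >= t] and their
# complements. All names here are illustrative, not from the paper.

def empirical_h_divergence(S, T, thresholds):
    best_err = 1.0
    for t in thresholds:
        for sign in (1, -1):  # a hypothesis and its complement both lie in H
            h = (lambda x, t=t, s=sign: (x >= t) if s == 1 else (x < t))
            # balanced error of h at telling S (label 0) apart from T (label 1)
            err = (sum(h(x) for x in S) / len(S)
                   + sum(not h(x) for x in T) / len(T)) / 2
            best_err = min(best_err, err)
    # perfect separation (err = 0) gives 2; indistinguishable (err = 0.5) gives 0
    return 2 * (1 - 2 * best_err)

# Well-separated samples: the threshold 0.5 separates them perfectly.
S = [0.1, 0.2, 0.3]
T = [0.8, 0.9, 1.0]
print(empirical_h_divergence(S, T, thresholds=[0.0, 0.5, 1.0]))  # → 2.0
```

When the two samples are identical, no threshold does better than 0.5 error and the divergence collapses to 0.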
Bounding the difference in error using the $\mathcal{H}$-divergence
This gives an upper bound on the difference in error between hypotheses in terms of the $\mathcal{H}$-divergence.
Theorem 2
Using the newly defined divergence, a formal upper bound on the target-domain error is given.
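Up to sampling and constant terms, the bound has the following familiar shape (a sketch in the spirit of Ben-David et al.; the exact statement in the paper carries additional finite-sample terms):

$$ \epsilon_T(h) \le \epsilon_S(h) + d_{\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T) + \lambda, \qquad \lambda = \min_{h^* \in \mathcal{H}} \big[ \epsilon_S(h^*) + \epsilon_T(h^*) \big] $$

The target error is controlled by the source error, the divergence between the two domains, and the error $\lambda$ of the best single hypothesis on both domains jointly.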
Reading Notes on "Analysis of Representations for Domain Adaptation"
Overview
Abstract
We have labeled training data for a source domain, and we wish to learn a classifier which performs well on a target domain with a different distribution.
A crucial factor in domain adaptation: a good feature representation.
Task:
formalize this intuition theoretically with a generalization bound for domain adaptation.
- Their theory illustrates the tradeoffs inherent in designing a representation for domain adaptation and gives a new justification for a recently proposed model.
- Also, it suggests a new model for domain adaptation: one which explicitly minimizes the difference between the source and target domains, while at the same time maximizing the margin of the training set.
Introduction
Challenge: the difference in instance distributions between the source and target domains.
An intuition: mapping the two domains into a common representation can make them appear to have similar distributions.
Formalization: a bound.
The bound is stated in terms of a representation function, and it shows that a representation function should be designed to minimize domain divergence as well as classifier error.
A noteworthy point about the experimental setting:
Their theory applies to the setting in which plentiful unlabeled data exist for both the source and target domains.
Supplementary background
Empirical distribution function
The empirical distribution function is the distribution function associated with the empirical measure of a sample. It is a step function that jumps by $\frac{1}{n}$ at each of the $n$ data points. Its value at any specified value of the measured variable is the fraction of observations that are less than or equal to that value.
The empirical distribution function is an estimate of the cumulative distribution function that generated the sample points. By the Glivenko-Cantelli theorem, it converges with probability 1 to that underlying distribution.
Idea: estimate the population distribution function from the sample distribution function.
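The step-function description above is short enough to write out directly. A minimal sketch (`ecdf` is an illustrative helper, not from the paper):

```python
def ecdf(sample, x):
    """Empirical CDF: the fraction of observations <= x.
    As a function of x, this is a step function that jumps by 1/n
    at each of the n data points."""
    return sum(v <= x for v in sample) / len(sample)

data = [3.0, 1.0, 4.0, 1.0, 5.0]
print(ecdf(data, 1.0))  # → 0.4  (2 of the 5 observations are <= 1.0)
print(ecdf(data, 5.0))  # → 1.0  (all observations are <= 5.0)
```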
Glivenko-Cantelli Theorem
Draw one sample at a time from the population; after many draws, the distribution of the samples tends to the population distribution. Equivalently: draw a sample of size $n$ from the population; the larger $n$ is, the closer the sample distribution is to the population distribution.
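The "larger $n$, closer to the population" claim can be checked numerically. A Monte-Carlo sketch under an assumed Uniform(0, 1) population, whose true CDF is $F(x) = x$, so the sup-deviation $\sup_x |F_n(x) - F(x)|$ can be measured exactly at the sample's jump points:

```python
import random

def sup_deviation(n, seed=0):
    """sup_x |F_n(x) - F(x)| for an n-point sample from Uniform(0, 1),
    where the true CDF is F(x) = x."""
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(n))
    # The ECDF is a step function, so the supremum is attained at the
    # jump points: just before and just after each sorted sample value.
    return max(max(abs((i + 1) / n - x), abs(i / n - x))
               for i, x in enumerate(xs))

print(sup_deviation(100))      # noticeable deviation at small n
print(sup_deviation(100_000))  # much smaller deviation at large n
```

The deviation shrinks on the order of $1/\sqrt{n}$, consistent with the theorem's almost-sure convergence.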
Here "domain" denotes the domain (set of inputs) of a function, and "range" denotes its set of values.
Indicator function (also called characteristic function)
VC dimension
arg
Divergence
Notation