Understanding the Hilbert-Schmidt Independence Criterion (HSIC)

Latest recommended article published on 2024-11-15 13:56:13

陈煜嵘Yurong


Article link: https://blog.csdn.net/weixin_43120238/article/details/118066356


Problem
Given samples $(x_1, y_1), \dots, (x_n, y_n) \sim P(x, y),$
determine whether $P(x, y) = P(x) \times P(y),$
or measure the degree of dependence between $x$ and $y$.
Below is the formal problem setting:
Let $P_{xy}$ be a Borel probability measure defined on a domain $\mathcal{X} \times \mathcal{Y}$, and let $P_x$ and $P_y$ be the respective marginal distributions on $\mathcal{X}$ and $\mathcal{Y}$. Given a sample $Z := (X, Y) = \{(x_1, y_1), \dots, (x_m, y_m)\}$ of size $m$ drawn independently and identically distributed (i.i.d.) according to $P_{xy}$, does $P_{xy}$ factorize as $P_x P_y$?
Applications
- Independent component analysis;
- Dimensionality reduction and feature extraction;
- Statistical modeling.
Indirect Approach
- Estimate the joint density P(x, y)
- Check whether the estimate approximately factorizes
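The indirect approach can be sketched with a histogram density estimate. This is only an illustration under stated assumptions: the function name `factorization_gap`, the choice of 8 bins, and the use of total variation distance as the measure of "approximately factorizes" are not from the original text.

```python
import numpy as np

def factorization_gap(x, y, bins=8):
    """Total variation distance between a histogram estimate of P(x, y)
    and the product of its marginals (hypothetical helper)."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = counts / counts.sum()            # estimated joint cell probabilities
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal over y, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal over x, shape (1, bins)
    return 0.5 * np.abs(p_xy - p_x * p_y).sum()

rng = np.random.default_rng(0)
x = rng.uniform(size=20000)

gap_indep = factorization_gap(x, rng.uniform(size=20000))  # independent pair
gap_dep = factorization_gap(x, x ** 2)                     # y is a function of x
print(gap_indep, gap_dep)
```

The gap is never exactly zero on finite samples, so in practice it has to be compared against a threshold, and the number of histogram cells grows quickly with dimension. That is one reason the direct approach is attractive.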
Direct Approach
- Check properties of factorizing distributions, e.g. kurtosis or covariance operators.
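A small example shows why checking a single covariance is not enough and why a covariance over richer function classes is needed. The data and the helper `sample_cov` below are assumptions chosen for illustration: y is a deterministic function of x, yet the ordinary (linear) covariance vanishes.

```python
import numpy as np

def sample_cov(a, b):
    """Sample covariance E[(a - E a)(b - E b)] of two 1-D arrays."""
    return np.mean((a - a.mean()) * (b - b.mean()))

# Symmetric grid on [-1, 1]; y depends on x deterministically.
x = np.linspace(-1.0, 1.0, 2001)
y = x ** 2

cov_linear = sample_cov(x, y)          # features f(x) = x, g(y) = y
cov_nonlinear = sample_cov(x ** 2, y)  # nonlinear feature f(x) = x^2
print(cov_linear, cov_nonlinear)
```

The linear covariance is zero up to rounding error, while the covariance between the nonlinear feature of x and y is clearly positive. Covariance operators on function spaces generalize this idea by considering many such features at once.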

HSIC is defined as the squared Hilbert-Schmidt (HS) norm of the associated cross-covariance operator Cxy:
${\rm HSIC}(P_{xy}, F, G) := \| C_{xy} \|^2_{\rm HS}.$
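The above definition of HSIC involves two further definitions. The first is the cross-covariance operator $C_{xy}: G \to F$, where $F$ and $G$ are reproducing kernel Hilbert spaces of functions on the two domains. It is defined as the operator satisfying, for all $f \in F$ and $g \in G$,
$\langle f, C_{xy} g \rangle_F = {\rm E}_{xy}([f(x) - {\rm E}_{x}(f(x))] [g(y) - {\rm E}_{y}(g(y))]).$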
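$C_{xy}$ is therefore a linear operator that maps from $G$ to $F$. The operator itself can be written as
$C_{xy} := {\rm E}_{xy}[(\phi(x) - \mu_x) \otimes (\psi(y) - \mu_y)],$
where $\phi$ and $\psi$ are the feature maps associated with $F$ and $G$, $\mu_x = {\rm E}_x \phi(x)$ and $\mu_y = {\rm E}_y \psi(y)$ are the corresponding mean elements, and $\otimes$ denotes the tensor product.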
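To make the operator concrete, consider the finite-dimensional case where both feature maps are the identity (a linear kernel, assumed here purely for illustration). The cross-covariance operator is then the ordinary cross-covariance matrix and its HS norm is the Frobenius norm. The sketch below also checks that the same number can be computed from Gram matrices alone, using the trace form $\frac{1}{m^2} {\rm tr}(KHLH)$ with centering matrix $H$. The function names and the $1/m^2$ normalization are choices made for this example.

```python
import numpy as np

def cross_covariance(X, Y):
    """Empirical cross-covariance matrix for identity feature maps.
    X has shape (m, dx), Y has shape (m, dy); the result is (dx, dy)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return Xc.T @ Yc / X.shape[0]

def hsic_trace_linear(X, Y):
    """Same quantity computed from linear-kernel Gram matrices only."""
    m = X.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m   # centering matrix
    K = X @ X.T                           # Gram matrix of x under a linear kernel
    L = Y @ Y.T                           # Gram matrix of y under a linear kernel
    return np.trace(K @ H @ L @ H) / m ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X[:, :2] + 0.5 * rng.normal(size=(200, 2))  # y depends on x

hs_norm_sq = np.linalg.norm(cross_covariance(X, Y), "fro") ** 2
trace_form = hsic_trace_linear(X, Y)
print(hs_norm_sq, trace_form)
```

The two values agree up to floating-point error. The trace form matters because it only needs inner products between samples, so the linear kernel can be swapped for any other kernel without ever forming the feature maps explicitly.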