Partial Distance Correlation

Characteristic Function

The characteristic function of a random variable $X$ is defined as

$$\phi_X(t) \triangleq E[e^{itX}] = \int_{-\infty}^{+\infty} e^{itx} f_X(x)\,dx$$

where $f_X(x)$ is the probability density function of the continuous random variable $X$.

Some properties of the characteristic function:

  • If $Y = X_1 + X_2$ with $X_1, X_2$ independent, then $\phi_Y(t) = \phi_{X_1}(t)\cdot\phi_{X_2}(t)$.
  • If $Y = aX + b$, then $\phi_Y(t) = e^{itb}\cdot\phi_X(at)$.
  • If $X \sim \mathcal{N}(0,1)$, then $\phi_X(t) = e^{-t^2/2}$.
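The affine and Gaussian properties above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `empirical_cf`, the sample size, and the seed are illustrative assumptions, not part of the original text. It replaces the expectation by a sample average.

```python
import cmath
import math
import random


def empirical_cf(samples, t):
    """Empirical characteristic function: the average of exp(i*t*x) over the samples."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(20000)]

# Gaussian case: the estimate should be close to exp(-t^2 / 2).
for t in (0.5, 1.0, 2.0):
    print(t, abs(empirical_cf(x, t) - math.exp(-t * t / 2)))

# Affine rule phi_{aX+b}(t) = exp(i*t*b) * phi_X(a*t).
# It also holds exactly (up to rounding) for the empirical version.
a, b, t = 2.0, -1.0, 0.7
y = [a * xi + b for xi in x]
lhs = empirical_cf(y, t)
rhs = cmath.exp(1j * t * b) * empirical_cf(x, a * t)
print(abs(lhs - rhs))
```

The Gaussian comparison is only approximate because it depends on the random sample, while the affine identity is algebraic and holds for any sample.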
Joint Characteristic Function

The joint characteristic function of continuous random variables $X, Y$ is defined as

$$\phi_{X,Y}(v,w) \triangleq E[e^{i(vX + wY)}] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{i(vx + wy)} f_{XY}(x,y)\,dx\,dy$$

Properties:

  • $\phi_{X,Y}(0,0) = 1$
  • $\phi_{X,Y}(v,0) = \phi_X(v)$ and $\phi_{X,Y}(0,w) = \phi_Y(w)$
  • $|\phi_{X,Y}(v,w)| \le 1$
  • If $X$ and $Y$ are independent, then $\phi_{X,Y}(v,w) = \phi_X(v)\phi_Y(w)$.
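The factorization property is what distance covariance later measures. As a rough illustration (a sketch with hypothetical helper names, evaluated at the single point $(v,w) = (1,1)$, standard library only), the gap $|\phi_{X,Y}(v,w) - \phi_X(v)\phi_Y(w)|$ estimated from samples stays near zero for independent data and is clearly positive for $Y = X^2$, even though $X$ and $X^2$ are uncorrelated:

```python
import cmath
import random


def ecf(samples, t):
    """Empirical marginal characteristic function."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)


def joint_ecf(xs, ys, v, w):
    """Empirical joint characteristic function of paired samples."""
    return sum(cmath.exp(1j * (v * x + w * y)) for x, y in zip(xs, ys)) / len(xs)


def factorization_gap(xs, ys, v, w):
    """|phi_{X,Y}(v,w) - phi_X(v) * phi_Y(w)| estimated from the sample."""
    return abs(joint_ecf(xs, ys, v, w) - ecf(xs, v) * ecf(ys, w))


random.seed(1)
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y_indep = [random.gauss(0.0, 1.0) for _ in range(n)]
y_dep = [xi * xi for xi in x]  # uncorrelated with x, but a function of x

gap_indep = factorization_gap(x, y_indep, 1.0, 1.0)
gap_dep = factorization_gap(x, y_dep, 1.0, 1.0)
print(gap_indep, gap_dep)
```

A single evaluation point cannot prove independence; the gap has to be aggregated over all $(v,w)$, which is the role of a weighted norm.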
The $\|\cdot\|_w$ Norm

For a complex-valued function $\gamma(t,s)$ on $R^p \times R^q$, the $\|\cdot\|_w$ norm is defined by

$$\|\gamma(t,s)\|_w^2 = \int_{R^{p+q}} |\gamma(t,s)|^2\, w(t,s)\,dt\,ds$$

where the weight function $w(t,s)$ is positive and chosen so that the integral exists, and $|\gamma(t,s)|^2 = \gamma(t,s)\overline{\gamma(t,s)}$.

Distance Correlation


For random vectors $X \in R^p$, $Y \in R^q$ of arbitrary dimension, the distance covariance is defined as

$$\mathcal{V}^2(X,Y) := \|\phi_{X,Y}(t,s) - \phi_X(t)\phi_Y(s)\|_w^2 = \int_{R^{p+q}} |\phi_{X,Y}(t,s) - \phi_X(t)\phi_Y(s)|^2\, w(t,s)\,dt\,ds, \qquad w(t,s) = \left(c_p c_q\, |t|_p^{1+p}\, |s|_q^{1+q}\right)^{-1}$$

where $|\cdot|_p$ is the Euclidean norm on $R^p$ and $c_d = \pi^{(1+d)/2} / \Gamma((1+d)/2)$ is a normalizing constant.

The distance variance of a random vector $X \in R^p$ is defined as

$$\mathcal{V}^2(X) := \mathcal{V}^2(X,X) = \|\phi_{X,X}(t,s) - \phi_X(t)\phi_X(s)\|_w^2$$

The distance correlation of random vectors $X \in R^p$, $Y \in R^q$ is defined as

$$\mathcal{R}^2(X,Y) = \begin{cases} \dfrac{\mathcal{V}^2(X,Y)}{\sqrt{\mathcal{V}^2(X)\,\mathcal{V}^2(Y)}}, & \mathcal{V}^2(X)\,\mathcal{V}^2(Y) > 0 \\ 0, & \mathcal{V}^2(X)\,\mathcal{V}^2(Y) = 0 \end{cases}$$

$\mathcal{V}$ measures the distance $\|\phi_{X,Y}(t,s) - \phi_X(t)\phi_Y(s)\|_w$ between the joint characteristic function and the product of the marginal characteristic functions, and is used to test the hypothesis of independence:

$$H_0: \phi_{X,Y}(t,s) = \phi_X(t)\phi_Y(s) \quad \text{vs.} \quad H_1: \phi_{X,Y}(t,s) \ne \phi_X(t)\phi_Y(s)$$

Empirical Distance Covariance & Distance Correlation:

Let $(x_i, y_i) \sim (X,Y)$, $i = 1, \dots, n$, be a sample. Define

$$a_{ij} = \|x_i - x_j\|, \quad \overline{a}_{\cdot j} = \frac{1}{n}\sum_{i=1}^n a_{ij}, \quad \overline{a}_{i\cdot} = \frac{1}{n}\sum_{j=1}^n a_{ij}, \quad \overline{a}_{\cdot\cdot} = \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n a_{ij}$$

$$A_{ij} = a_{ij} - \overline{a}_{\cdot j} - \overline{a}_{i\cdot} + \overline{a}_{\cdot\cdot}$$

and define $B_{ij}$ in the same way from $b_{ij} = \|y_i - y_j\|$. Then

$$\mathcal{V}^2_n(X,Y) = \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n A_{ij}B_{ij}$$

$$\mathcal{V}^2_n(X) = \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n A_{ij}^2$$

$$\mathcal{R}^2_n(X,Y) = \begin{cases} \dfrac{\mathcal{V}^2_n(X,Y)}{\sqrt{\mathcal{V}^2_n(X)\,\mathcal{V}^2_n(Y)}}, & \mathcal{V}^2_n(X)\,\mathcal{V}^2_n(Y) > 0 \\ 0, & \mathcal{V}^2_n(X)\,\mathcal{V}^2_n(Y) = 0 \end{cases}$$
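The empirical formulas translate almost line by line into code. Below is a minimal pure-Python sketch (quadratic in $n$, so suitable only for small samples). All function names are illustrative. The permutation test at the end is an assumption on my part: it is a common way to calibrate the independence test, but the text above does not specify how the null distribution is obtained.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances a_ij; each point is a tuple of floats."""
    return [[math.dist(p, q) for q in points] for p in points]


def double_center(a):
    """A_ij = a_ij - (column mean j) - (row mean i) + (grand mean)."""
    n = len(a)
    row = [sum(r) / n for r in a]
    col = [sum(a[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[a[i][j] - row[i] - col[j] + grand for j in range(n)] for i in range(n)]


def mean_product(A, B, perm=None):
    """(1/n^2) * sum_ij A_ij * B_{perm(i) perm(j)}; identity permutation by default."""
    n = len(A)
    p = perm if perm is not None else range(n)
    return sum(A[i][j] * B[p[i]][p[j]] for i in range(n) for j in range(n)) / (n * n)


def dcor_sq(x, y):
    """Squared sample distance correlation R_n^2, with the zero-denominator convention."""
    A = double_center(distance_matrix(x))
    B = double_center(distance_matrix(y))
    denom = math.sqrt(mean_product(A, A) * mean_product(B, B))
    return mean_product(A, B) / denom if denom > 0 else 0.0


def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Assumed test procedure: permute the y sample and compare V_n^2 values."""
    rng = random.Random(seed)
    A = double_center(distance_matrix(x))
    B = double_center(distance_matrix(y))
    observed = mean_product(A, B)
    idx = list(range(len(x)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if mean_product(A, B, idx) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


random.seed(2)
x = [(random.uniform(-1, 1),) for _ in range(60)]
y_dep = [(xi[0] ** 2,) for xi in x]                    # nonlinear dependence
y_ind = [(random.uniform(-1, 1),) for _ in range(60)]  # independent of x

print(dcor_sq(x, y_dep), permutation_pvalue(x, y_dep))
print(dcor_sq(x, y_ind), permutation_pvalue(x, y_ind))
```

Permuting the indices of the already centered matrix $B$ avoids recomputing distances for every permutation, since relabeling the $y$ sample only reorders the rows and columns of $B$.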

Some properties of distance covariance and distance correlation:

  • If $E|X|_p < \infty$ and $E|Y|_q < \infty$, then $\lim_{n\rightarrow\infty}\mathcal{V}_n(X,Y) = \mathcal{V}(X,Y)$ almost surely.

  • If $E(|X|_p + |Y|_q) < \infty$, then $\lim_{n\rightarrow\infty}\mathcal{R}^2_n(X,Y) = \mathcal{R}^2(X,Y)$ almost surely.

  • If $E(|X|_p + |Y|_q) < \infty$, then $0 \le \mathcal{R} \le 1$, and $\mathcal{R} = 0$ if and only if $X$ and $Y$ are independent.

  • $0 \le \mathcal{R}_n \le 1$.

  • If $\mathcal{R}_n(X,Y) = 1$, then there exist a vector $a$, a nonzero real number $b$, and an orthogonal matrix $C$ such that $Y = a + bXC$.

  • If $\mathcal{V}(X) = 0$, then $X = E[X]$ almost surely.

  • $\mathcal{V}(a + bCX) = |b|\,\mathcal{V}(X)$ for any constant vector $a$, scalar $b$, and orthonormal matrix $C$.

  • If $X$ and $Y$ are independent, then $\mathcal{V}(X+Y) \le \mathcal{V}(X) + \mathcal{V}(Y)$.

An unbiased estimator of the population $\mathcal{V}^2(X,Y)$:

Let $A = (a_{ij}) \in R^{n\times n}$ be a symmetric real matrix with zero diagonal. The $\mathcal{U}$-centered matrix $\tilde{A}$ of $A$ is defined as

$$\tilde{A}_{ij} = \begin{cases} a_{ij} - \dfrac{1}{n-2}\sum_{l=1}^n a_{il} - \dfrac{1}{n-2}\sum_{k=1}^n a_{kj} + \dfrac{1}{(n-1)(n-2)}\sum_{k=1}^n\sum_{l=1}^n a_{kl}, & i \ne j \\ 0, & i = j \end{cases}$$

Let $A = (a_{ij})$ be the Euclidean distance matrix of a sample $x_1, \dots, x_n$ of $X$, and $B = (b_{ij})$ the Euclidean distance matrix of the sample $y_1, \dots, y_n$ of $Y$. If $E(|X| + |Y|) < \infty$ and $n > 3$, then $(\tilde{A}\cdot\tilde{B}) = \frac{1}{n(n-3)}\sum_{i\ne j}\tilde{A}_{ij}\tilde{B}_{ij}$ is an unbiased estimator of the population $\mathcal{V}^2(X,Y)$. Because the diagonals of $\tilde{A}$ and $\tilde{B}$ are zero, summing over all $i, j$ gives the same value.
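A minimal sketch of the $\mathcal{U}$-centering and the resulting estimator, again in pure Python with hypothetical function names. Unlike $\mathcal{V}^2_n$, this estimator can be negative for a single sample; the loop at the end averages it over independent replications (an illustrative experiment, not from the original text) to show that it is centered near zero when $X$ and $Y$ are independent.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances; each point is a tuple of floats."""
    return [[math.dist(p, q) for q in points] for p in points]


def u_center(a):
    """U-centered version of a symmetric matrix with zero diagonal (requires n > 2)."""
    n = len(a)
    row = [sum(r) for r in a]                                   # sum over l of a_il
    col = [sum(a[k][j] for k in range(n)) for j in range(n)]   # sum over k of a_kj
    total = sum(row)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = (a[i][j] - row[i] / (n - 2) - col[j] / (n - 2)
                             + total / ((n - 1) * (n - 2)))
    return out


def u_inner(A, B):
    """Inner product (A . B) = 1 / (n (n - 3)) * sum over i != j of A_ij * B_ij."""
    n = len(A)
    s = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 3))


def dcov_sq_unbiased(x, y):
    """Unbiased estimator of the squared population distance covariance (n > 3)."""
    if len(x) != len(y) or len(x) <= 3:
        raise ValueError("need paired samples with n > 3")
    return u_inner(u_center(distance_matrix(x)), u_center(distance_matrix(y)))


random.seed(3)
estimates = []
for _ in range(200):
    xs = [(random.uniform(-1, 1),) for _ in range(20)]
    ys = [(random.uniform(-1, 1),) for _ in range(20)]  # independent of xs
    estimates.append(dcov_sq_unbiased(xs, ys))
average = sum(estimates) / len(estimates)
print(average, min(estimates), max(estimates))
```

Each row of a $\mathcal{U}$-centered matrix sums to zero, which is a convenient sanity check for an implementation.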

Hilbert Space

$\mathcal{H}_n = \{\tilde{A} : A \in \mathcal{S}_n\}$ is a Hilbert space, where $\mathcal{S}_n$ denotes the linear span of the $n \times n$ sample distance matrices. For any $A = (A_{ij})$, $B = (B_{ij})$ in $\mathcal{H}_n$, the inner product is defined as

$$(A\cdot B) = \frac{1}{n(n-3)}\sum_{i\ne j} A_{ij}B_{ij}$$

Partial Distance Covariance & Partial Distance Correlation

Let $A, B, C$ denote the sample distance matrices of the random variables $X, Y, Z$, and $\tilde{A}, \tilde{B}, \tilde{C}$ their $\mathcal{U}$-centered versions. The sample partial distance covariance is defined as

$$pdCov(X,Y;Z) \triangleq (\tilde{A}_{\perp\tilde{C}}\cdot\tilde{B}_{\perp\tilde{C}})$$

where $\tilde{A}_{\perp\tilde{C}}$ and $\tilde{B}_{\perp\tilde{C}}$ are the orthogonal projections of $\tilde{A}$ and $\tilde{B}$ onto the orthogonal complement of $\tilde{C}$:

$$\tilde{A}_{\perp\tilde{C}} = \tilde{A} - \frac{(\tilde{A}\cdot\tilde{C})}{(\tilde{C}\cdot\tilde{C})}\tilde{C}, \quad \tilde{B}_{\perp\tilde{C}} = \tilde{B} - \frac{(\tilde{B}\cdot\tilde{C})}{(\tilde{C}\cdot\tilde{C})}\tilde{C}$$

The sample partial distance correlation is defined as

$$pdCor(X,Y;Z) = \frac{(\tilde{A}_{\perp\tilde{C}}\cdot\tilde{B}_{\perp\tilde{C}})}{|\tilde{A}_{\perp\tilde{C}}|\,|\tilde{B}_{\perp\tilde{C}}|}$$

where $|\tilde{A}| = (\tilde{A}\cdot\tilde{A})^{1/2}$ is the norm induced by the inner product.
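Putting the pieces together, the sample partial distance correlation can be sketched as follows. This is an illustrative pure-Python version, not a reference implementation: the function names are hypothetical, and two conventions are assumptions of this sketch, namely leaving a matrix unchanged when $(\tilde{C}\cdot\tilde{C}) = 0$ and returning 0 when the denominator vanishes.

```python
import math
import random


def u_centered_distances(points):
    """U-centered Euclidean distance matrix of a sample (requires n > 3 below)."""
    n = len(points)
    a = [[math.dist(p, q) for q in points] for p in points]
    row = [sum(r) for r in a]  # the matrix is symmetric, so row sums equal column sums
    total = sum(row)
    return [[0.0 if i == j else
             a[i][j] - (row[i] + row[j]) / (n - 2) + total / ((n - 1) * (n - 2))
             for j in range(n)] for i in range(n)]


def inner(A, B):
    """Inner product on H_n: 1 / (n (n - 3)) * sum over i != j of A_ij * B_ij."""
    n = len(A)
    s = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 3))


def project_out(A, C):
    """Component of A orthogonal to C: A - (A.C)/(C.C) * C."""
    cc = inner(C, C)
    if cc == 0:  # assumed convention: nothing to remove
        return A
    coef = inner(A, C) / cc
    return [[aij - coef * cij for aij, cij in zip(ra, rc)] for ra, rc in zip(A, C)]


def pdcor(x, y, z):
    """Sample partial distance correlation of x and y with z removed."""
    A, B, C = (u_centered_distances(v) for v in (x, y, z))
    Ap, Bp = project_out(A, C), project_out(B, C)
    denom = math.sqrt(inner(Ap, Ap) * inner(Bp, Bp))
    return inner(Ap, Bp) / denom if denom > 0 else 0.0


random.seed(4)
z = [(random.gauss(0, 1),) for _ in range(100)]
x = [(zi[0] + 0.3 * random.gauss(0, 1),) for zi in z]  # x depends on z
y = [(zi[0] + 0.3 * random.gauss(0, 1),) for zi in z]  # y depends on z

print(pdcor(x, y, z))
```

Because the statistic is built from inner products in $\mathcal{H}_n$, it can take negative values, unlike the distance correlation $\mathcal{R}_n$.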
