Whitening transformation

(from: Wikipedia)

 

The whitening transformation is a decorrelation method which transforms a set of random variables having the covariance matrix Σ into a set of new random variables whose covariance is aI, where a is a constant and I is the identity matrix. The new random variables are uncorrelated and all have variance 1. The method is called "whitening" because it transforms the input into white noise, which by definition is uncorrelated and has uniform variance. It differs from decorrelation in that the variances are made equal, rather than merely making the covariances zero. That is, where decorrelation results in a diagonal covariance matrix, whitening produces a scalar multiple of the identity matrix.

 

Definition

Define X to be a random vector with covariance matrix Σ and mean 0. The matrix Σ can be written as the expected value of the outer product of X with itself:

\Sigma = \operatorname{E}[XX^T]

 

(Note: at first glance I thought this was written incorrectly — my instinct was that E[XiXj] would make more sense. After thinking it over, I understood: written as E[XiXj], the brackets contain two variables, whereas here there should really be a single variable (the whole random vector), and the two factors are just that one vector appearing twice. So writing E[XX^T] is justified; just be careful not to misread it. One could also write E[XY^T] with X and Y identically distributed, but that is the standard notation for the covariance of two general variables, which clearly does not fit this setting where X is a vector.)
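A quick numerical check of Σ = E[XX^T]: averaging the per-sample outer products recovers the covariance matrix. The matrix, sample size, and seed below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# True covariance of a zero-mean 2-D Gaussian (illustrative values).
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Draw many samples; each row is one observation of the random vector X.
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=200_000)

# E[X X^T] estimated as the average of the per-sample outer products.
Sigma_hat = (X.T @ X) / len(X)

print(np.round(Sigma_hat, 2))  # close to Sigma
```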

 

Define Σ^{1/2} as

\Sigma^{1/2}(\Sigma^{1/2})^T = \Sigma

Define the new random vector Y = Σ^{-1/2}X. The covariance of Y is

\begin{align}
    \operatorname{Cov}(Y)
        &= \operatorname{E}[YY^T] \\
        &= \operatorname{E}[(\Sigma^{-1/2}X)(\Sigma^{-1/2}X)^T] \\
        &= \operatorname{E}[\Sigma^{-1/2}XX^T(\Sigma^{-1/2})^T] \\
        &= \Sigma^{-1/2}\operatorname{E}[XX^T](\Sigma^{-1/2})^T \\
        &= \Sigma^{-1/2}\Sigma(\Sigma^{-1/2})^T \\
        &= \Sigma^{-1/2}\Sigma^{1/2}(\Sigma^{1/2})^T(\Sigma^{-1/2})^T \\
        &= I
\end{align}

Thus, Y is a white random vector.
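The derivation above can be checked numerically. Any Σ^{1/2} satisfying the defining equation works; this sketch uses the Cholesky factor as one convenient choice (the covariance matrix, sample size, and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

Sigma = np.array([[4.0, 1.5],
                  [1.5, 3.0]])

# The lower-triangular Cholesky factor L satisfies L L^T = Sigma,
# so it is one valid choice of Sigma^{1/2}.
L = np.linalg.cholesky(Sigma)

# Zero-mean samples with covariance Sigma; one column per sample.
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=100_000).T

# Y = Sigma^{-1/2} X  (solve a triangular system instead of inverting L)
Y = np.linalg.solve(L, X)

# Cov(Y) should be close to the identity matrix.
Cov_Y = (Y @ Y.T) / Y.shape[1]
print(np.round(Cov_Y, 2))
```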


Based on the fact that the covariance matrix is always positive semi-definite, Σ^{1/2} can be derived using the eigenvalue decomposition:

\Sigma\Phi = \Phi\Lambda
\Sigma = \Phi\Lambda\Phi^T
\Sigma^{1/2} = \Phi\Lambda^{1/2}

where Λ^{1/2} is the diagonal matrix whose entries are the square roots of the corresponding entries of Λ. To show that this choice satisfies the defining equation, multiply by the transpose:

\begin{align}
    \Sigma^{1/2}(\Sigma^{1/2})^T &= \Phi\Lambda^{1/2}(\Phi\Lambda^{1/2})^T \\
    &= \Phi\Lambda^{1/2}(\Lambda^{1/2})^T\Phi^T \\
    &= \Phi\Lambda\Phi^T \\
    &= \Sigma
\end{align}
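The eigendecomposition construction can be verified in a few lines (the example matrix is an illustrative choice):

```python
import numpy as np

Sigma = np.array([[4.0, 1.5],
                  [1.5, 3.0]])

# Sigma = Phi Lambda Phi^T; eigh is the appropriate routine for symmetric matrices.
eigvals, Phi = np.linalg.eigh(Sigma)
Lambda_half = np.diag(np.sqrt(eigvals))

# Sigma^{1/2} = Phi Lambda^{1/2}
Sigma_half = Phi @ Lambda_half

# Verify the defining property Sigma^{1/2} (Sigma^{1/2})^T = Sigma.
print(np.allclose(Sigma_half @ Sigma_half.T, Sigma))  # True
```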

Reposted from: https://www.cnblogs.com/kevinGaoblog/archive/2012/06/20/2556335.html

### Definition of whitening

Whitening is a data-preprocessing technique that makes the input data have unit variance with mutually uncorrelated features. Concretely, given a random vector \( \mathbf{x} \), a transformation \( \mathbf{y} = W\mathbf{x} \) is applied so that the covariance matrix of \( \mathbf{y} \) is the identity matrix \( I \)[^4]. This transformation not only removes the correlations between the variables but also rescales them so that each has standard deviation 1, which simplifies computation in subsequent analysis and can improve the performance of some machine-learning algorithms[^4].

### Application areas

#### 1. Image processing

In image denoising and smoothing, whitening can remove redundant information between pixels and enhance edge detail, improving visual quality[^5].

#### 2. Natural language processing

Text data is often whitened before training word-embedding models, to reduce the impact of scale differences across dimensions of the word distribution and improve the consistency and accuracy of the semantic representation[^6].

#### 3. Neural-network initialization

To speed up the convergence of deep neural networks, whitening can be applied at weight initialization to keep each layer's activations fluctuating within an appropriate range, preventing vanishing or exploding gradients[^7].

```python
import numpy as np

def whiten(X):
    """
    Whiten the input matrix X (ZCA whitening).

    Parameters:
        X (numpy.ndarray): data matrix of shape (n_samples, n_features)

    Returns:
        Y (numpy.ndarray): whitened data matrix of the same shape
    """
    # Center the data and estimate the covariance matrix
    mean = np.mean(X, axis=0)
    cov = np.cov((X - mean).T)

    # Eigendecomposition: cov = V diag(D) V^T
    D, V = np.linalg.eigh(cov)

    # Build Lambda^(-1/2); the small epsilon guards against division
    # by zero when an eigenvalue is (near) zero
    Lambda_inv_half = np.diag(1.0 / np.sqrt(D + 1e-8))

    # ZCA whitening matrix W = V Lambda^(-1/2) V^T
    W = V @ Lambda_inv_half @ V.T

    # Whitened samples Y = (X - mean) W
    Y = (X - mean) @ W
    return Y
```
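As a sanity check, here is a standalone sketch that repeats the same eigendecomposition whitening on synthetic correlated data (the data shapes, offset, and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic correlated data, shape (n_samples, n_features).
A = rng.standard_normal((3, 3))
X = rng.standard_normal((50_000, 3)) @ A.T + 5.0

# Same steps as whiten() above, inlined so the snippet runs standalone.
Xc = X - X.mean(axis=0)
D, V = np.linalg.eigh(np.cov(Xc.T))   # eigenvalues of a full-rank cov are positive
W = V @ np.diag(D ** -0.5) @ V.T      # ZCA whitening matrix
Y = Xc @ W

# The whitened data has (approximately) identity covariance.
print(np.round(np.cov(Y.T), 2))
```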