sinkhorn algorithm

写程序超快乐的

Article link: https://blog.csdn.net/qq_41180336/article/details/112347266

abstract

The Sinkhorn-Knopp algorithm computes $D_1$ and $D_2$ by iteratively normalizing all rows and all columns of $A$.

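Given a nonnegative matrix $A$, the goal is to obtain a doubly stochastic matrix. The rows and columns of $A$ are rescaled alternately, which yields two diagonal matrices $D_1$ and $D_2$ with positive main diagonals such that
$B=D_1AD_2$
where $B$ is doubly stochastic (every row sum and every column sum equals 1).
The resulting sequence of matrices converges to a doubly stochastic matrix if and only if $A$ contains at least one positive diagonal.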
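
The alternating row and column normalization described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it uses plain Python lists to stay dependency-free, the function names and the example matrix are hypothetical, and it assumes a nonnegative square matrix with no zero row or zero column.

```python
def normalize_rows(M):
    """Divide every row by its sum so that each row sums to 1."""
    return [[x / sum(row) for x in row] for row in M]


def normalize_cols(M):
    """Divide every column by its sum so that each column sums to 1."""
    col_sums = [sum(col) for col in zip(*M)]
    return [[x / s for x, s in zip(row, col_sums)] for row in M]


def alternate_scaling(A, iterations=200):
    """Alternately normalize the rows and columns of A (assumed nonnegative)."""
    B = [[float(x) for x in row] for row in A]
    for _ in range(iterations):
        B = normalize_rows(B)
        B = normalize_cols(B)
    return B


# Hypothetical strictly positive example matrix.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
B = alternate_scaling(A)
row_sums = [sum(row) for row in B]
col_sums = [sum(col) for col in zip(*B)]
```

For this strictly positive input, `row_sums` and `col_sums` should both be close to 1 after the loop; the fixed iteration count is an arbitrary choice for the sketch rather than a tuned stopping rule.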

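Diagonal matrices $D_1$ and $D_2$ with positive main diagonals such that $D_1AD_2$ is doubly stochastic and is the limit of the iteration exist $\leftrightarrow$ $A\neq 0$ and every positive entry of $A$ lies on a positive diagonal.
The form $D_1AD_2$ is unique, and $D_1$ and $D_2$ are unique up to a scalar multiple $\leftrightarrow$ $A$ is fully indecomposable.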

definitions

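Let $A$ be a nonnegative square matrix and let $\sigma$ be a permutation of $\{1,\dots,N\}$. The sequence of entries $a_{1,\sigma (1)},\dots,a_{N,\sigma (N)}$ is called the diagonal of $A$ corresponding to $\sigma$. If $\sigma$ is the identity, this diagonal is called the main diagonal.
A nonnegative matrix that contains a positive diagonal is said to have support. It has total support if $A\neq 0$ and every positive entry of $A$ lies on a positive diagonal.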
$A$ is fully indecomposable if it is impossible to find permutation matrices $P$ and $Q$ such that
$PAQ=\begin{pmatrix} A_1 & 0 \\ A_2 & A_3 \end{pmatrix}$
with $A_1$ and $A_3$ square.
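
For small matrices, the notions of positive diagonal, support, and total support can be checked by brute force. The sketch below takes "support" to mean at least one positive diagonal and "total support" to mean that $A\neq 0$ and every positive entry lies on some positive diagonal. The function names are hypothetical, and enumerating all $N!$ permutations is only practical for very small $N$.

```python
from itertools import permutations


def positive_diagonals(A):
    """Yield each permutation sigma whose diagonal A[i][sigma[i]] is entirely positive."""
    n = len(A)
    for sigma in permutations(range(n)):
        if all(A[i][sigma[i]] > 0 for i in range(n)):
            yield sigma


def has_support(A):
    """True if A contains at least one positive diagonal."""
    return next(positive_diagonals(A), None) is not None


def has_total_support(A):
    """True if A != 0 and every positive entry lies on some positive diagonal."""
    n = len(A)
    covered = {(i, sigma[i]) for sigma in positive_diagonals(A) for i in range(n)}
    positive = {(i, j) for i in range(n) for j in range(n) if A[i][j] > 0}
    return bool(positive) and positive <= covered


# Hypothetical examples: a triangular matrix and an all-ones matrix.
triangular = [[1, 1],
              [0, 1]]
all_ones = [[1, 1],
            [1, 1]]
```

Here `triangular` has support (its main diagonal is positive) but not total support, because the entry in position (1,2) only lies on a diagonal that also passes through the zero entry; `all_ones` has total support.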

theorem

Let $A$ be a nonnegative $N\times N$ matrix.

There exists a doubly stochastic matrix $B$ of the form $D_1AD_2$, where $D_1$ and $D_2$ are diagonal matrices with positive main diagonals $\leftrightarrow$ $A$ has total support.
If $B$ exists, it is unique. Moreover, $D_1$ and $D_2$ are unique up to a scalar multiple if and only if $A$ is fully indecomposable.
Alternately normalizing the rows and columns of $A$ converges to a doubly stochastic limit $\leftrightarrow$ $A$ has support. If $A$ has total support, this limit can be written as $D_1AD_2$. If $A$ has support which is not total, the limit cannot be written in the form $D_1AD_2$.
The algorithm converges linearly, with asymptotic rate equal to the square of the subdominant singular value of $B$.
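
As a worked illustration of the case "support which is not total", consider the following small matrix (a hypothetical example chosen for this note, not one from the source):
$A=\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$
Its main diagonal is positive, so $A$ has support, but the entry $a_{1,2}$ lies only on the diagonal $a_{1,2},a_{2,1}$, which contains a zero, so the support is not total. Normalizing the rows and then the columns, and writing $B_k$ for the matrix after the $k$-th column normalization, one finds by induction
$B_k=\begin{pmatrix} 1 & \frac{1}{2k+1} \\ 0 & \frac{2k}{2k+1} \end{pmatrix}$
so $B_k\to I$, which is doubly stochastic. However, $(D_1AD_2)_{1,2}>0$ for every pair of diagonal matrices with positive main diagonals, so the limit $I$ is not of the form $D_1AD_2$. The off-diagonal entry decays only like $1/k$ in this example.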

algorithm

For a vector $r$, let $D (r)$ denote the diagonal matrix whose main diagonal is $r$.
Let $D_1=D(r)$, $D_2=D(c)$, $e=\mathrm{ones}(N,1)$ and $A\in \mathbb{R}^{N\times N}$. Then
$D(r)AD(c)e=D(r)Ac=D(Ac)r=e$
$D(c)A^TD(r)e=D(c)A^Tr=D(A^Tr)c=e$
When $A$ is symmetric, we have $r = c = z$, satisfying $D (A z) z = D (z) A z = e$. The SK algorithm is the iteration
$c_{k+1}=D^{-1}(A^Tr_k)e$
$r_{k+1}=D^{-1}(Ac_{k+1})e$
In the symmetric case this becomes $z_{k+1}=D^{-1}(Az_k)e$.
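
The vector form of the iteration can be sketched directly from the two update formulas. This is an illustrative sketch under stated assumptions: plain Python lists instead of a linear algebra library, the starting vector $r_0=e$, a fixed number of sweeps, and hypothetical function names and example matrix.

```python
def sinkhorn_knopp(A, iterations=200):
    """Run the vector form of the SK iteration and return (r, c)."""
    n = len(A)
    r = [1.0] * n  # assumed starting vector r_0 = e
    c = [1.0] * n
    for _ in range(iterations):
        # c_{k+1} = D^{-1}(A^T r_k) e: c_s = 1 / sum_m a[m][s] * r[m]
        c = [1.0 / sum(A[m][s] * r[m] for m in range(n)) for s in range(n)]
        # r_{k+1} = D^{-1}(A c_{k+1}) e: r_m = 1 / sum_l a[m][l] * c[l]
        r = [1.0 / sum(A[m][l] * c[l] for l in range(n)) for m in range(n)]
    return r, c


def scale(A, r, c):
    """Form B = D(r) A D(c)."""
    n = len(A)
    return [[r[i] * A[i][j] * c[j] for j in range(n)] for i in range(n)]


# Hypothetical symmetric positive example matrix.
A = [[2.0, 1.0],
     [1.0, 3.0]]
r, c = sinkhorn_knopp(A)
B = scale(A, r, c)
```

Only the vectors `r` and `c` are updated inside the loop; the scaled matrix `B` is formed once at the end, which is the main practical difference from normalizing the matrix itself in every sweep.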

drawback

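A drawback shared by all of these iterative algorithms is slow convergence, even in seemingly simple cases.
$c_{k+1}=D^{-1}(A^Tr_k)e=D^{-1}(A^TD^{-1}(Ac_k)e)e,\quad k\ge 0$
Written elementwise,
$(c_{k+1})_s=\left(\sum^N_{m=1}a_{m,s}\left(\sum^N_{l=1}a_{m,l}(c_k)_l\right)^{-1}\right)^{-1}$
This is equivalent to the fixed-point iteration
$c_{k+1}=T(c_k),\quad T(x)_s=\left(\sum^N_{m=1}a_{m,s}\left(\sum^N_{l=1}a_{m,l}x_l\right)^{-1}\right)^{-1}$
That is, the task is to solve the fixed-point problem for the nonlinear operator $T$
$x = T (x),\quad x \ge 0$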
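
The map $T$ and the fixed-point iteration $c_{k+1}=T(c_k)$ can be sketched as follows. The stopping rule (relative change between successive iterates), the starting vector $e$, the function names, and both test matrices are assumptions made for this illustration.

```python
def T(A, x):
    """Apply T(x)_s = (sum_m a[m][s] * (sum_l a[m][l] * x[l])^{-1})^{-1}."""
    n = len(A)
    # Inner sums: the entries of A x, inverted elementwise.
    inv_Ax = [1.0 / sum(A[m][l] * x[l] for l in range(n)) for m in range(n)]
    # Outer sums: the entries of A^T inv_Ax, inverted elementwise.
    return [1.0 / sum(A[m][s] * inv_Ax[m] for m in range(n)) for s in range(n)]


def fixed_point(A, tol=1e-12, max_iter=10_000):
    """Iterate x <- T(x) from x = e; return (x, number of sweeps used)."""
    x = [1.0] * len(A)
    for k in range(1, max_iter + 1):
        x_new = T(A, x)
        change = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if change <= tol * max(x):
            return x, k
    return x, max_iter


# Hypothetical inputs: a well-conditioned matrix and a nearly triangular one.
A_fast = [[2.0, 1.0], [1.0, 3.0]]
A_slow = [[1.0, 1.0], [1e-4, 1.0]]
x_fast, k_fast = fixed_point(A_fast)
x_slow, k_slow = fixed_point(A_slow)
```

Comparing `k_fast` with `k_slow` gives a small-scale view of the drawback: the nearly triangular matrix needs more sweeps than the well-conditioned one, even though both are only $2\times 2$.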