1. Original objective
Given a coefficient matrix $\mathbf C$, we solve:
$$\min_{\mathbf A}\ \frac{1}{2}\||\mathbf C|-\eta \mathbf A\|_F^2\quad \text{s.t.}\ \mathbf A\in\mathbf\Omega_n,\ \mathrm{diag}(\mathbf C)=0 \quad (1)$$
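where $\mathbf\Omega_n$ is the set of $n\times n$ doubly stochastic matrices (nonnegative entries, with every row and every column summing to 1). Expanding the squared Frobenius norm gives: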
$$\frac{1}{2}\||\mathbf C|-\eta \mathbf A\|_F^2= \frac{1}{2}\|\mathbf C\|_F^2+ \frac{\eta^2}{2}\|\mathbf A\|_F^2 +\langle -|\mathbf C|, \eta \mathbf A \rangle \quad (2)$$
Since $\mathbf C$ is fixed in A-DSSC, the term $\frac{1}{2}\|\mathbf C\|_F^2$ is a constant. Dropping it and dividing by $\eta>0$, we are actually optimizing:
$$\min_{\mathbf A}\ \langle -|\mathbf C|, \mathbf A \rangle + \frac{\eta}{2}\|\mathbf A\|_F^2 \quad \text{s.t.}\ \mathbf A\in\mathbf\Omega_n \quad (3)$$
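As a quick sanity check of the expansion in Eq. (2) and the reduction to Eq. (3), the following sketch evaluates both sides numerically with NumPy. The random test matrices, the value of `eta`, and the function names are illustrative assumptions and are not taken from A-DSSC.

```python
import numpy as np

def full_objective(C, A, eta):
    """Left-hand side of Eq. (2): 0.5 * || |C| - eta * A ||_F^2."""
    return 0.5 * np.linalg.norm(np.abs(C) - eta * A, "fro") ** 2

def reduced_objective(C, A, eta):
    """Objective of Eq. (3): <-|C|, A> + (eta / 2) * ||A||_F^2."""
    return -np.sum(np.abs(C) * A) + 0.5 * eta * np.linalg.norm(A, "fro") ** 2

# Hypothetical test data: a random C with zero diagonal and an arbitrary A.
rng = np.random.default_rng(0)
n, eta = 5, 0.7
C = rng.standard_normal((n, n))
np.fill_diagonal(C, 0.0)
A = rng.random((n, n))

# Eq. (2) says: full objective = constant + eta * (objective of Eq. (3)).
const = 0.5 * np.linalg.norm(C, "fro") ** 2
lhs = full_objective(C, A, eta)
rhs = const + eta * reduced_objective(C, A, eta)
print("difference:", abs(lhs - rhs))
```

Because the two objectives differ only by an additive constant and a positive factor, they share the same minimizer over $\mathbf\Omega_n$.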
2. Dual objective
Introduce Lagrange multipliers $\alpha, \beta \in \mathbb R^n$ for the row-sum and column-sum constraints, and keep $\mathbf A\geq 0$ as an explicit constraint. Together these encode the doubly stochastic constraint, and we obtain a min-max problem:
$$\min_{\mathbf A\geq 0}\ \max_{\alpha, \beta}\ \langle -|\mathbf C|, \mathbf A \rangle + \frac{\eta}{2}\|\mathbf A\|_F^2 + \langle \alpha,\mathbf A\mathbf 1-\mathbf 1 \rangle + \langle \beta,\mathbf A^\top \mathbf 1-\mathbf 1 \rangle \quad (4)$$
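Writing each constraint as an inner product with its multiplier is the standard Lagrangian form for matrix constraints: optimizing over $\alpha$ enforces the row-sum-to-1 constraint, and optimizing over $\beta$ enforces the column-sum-to-1 constraint.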
Note that
$$\langle \alpha,\mathbf A\mathbf 1-\mathbf 1 \rangle + \langle \beta,\mathbf A^\top \mathbf 1-\mathbf 1 \rangle=\langle \alpha \mathbf 1^\top + \mathbf 1\beta^\top,\mathbf A \rangle-\mathbf 1^\top(\alpha+\beta)$$
P.S. $\langle \alpha,\mathbf A\mathbf 1\rangle=\mathrm{tr}(\alpha^\top \cdot \mathbf A\mathbf 1)=\mathrm{tr}(\mathbf A\mathbf 1 \cdot \alpha^\top)=\mathrm{tr}(\mathbf A \cdot \mathbf 1\alpha^\top)=\mathrm{tr}(\mathbf 1\alpha^\top \cdot \mathbf A)=\langle \alpha \mathbf 1^\top ,\mathbf A \rangle$
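The identity above can be checked numerically. The sketch below compares the vector form of the two Lagrangian terms with the matrix inner-product form; the random inputs and variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical random inputs, used only to test the identity.
rng = np.random.default_rng(1)
n = 4
A = rng.random((n, n))
alpha = rng.standard_normal(n)
beta = rng.standard_normal(n)
ones = np.ones(n)

# Vector form: <alpha, A1 - 1> + <beta, A^T 1 - 1>
vector_form = alpha @ (A @ ones - ones) + beta @ (A.T @ ones - ones)

# Matrix form: <alpha 1^T + 1 beta^T, A> - 1^T (alpha + beta)
M = np.outer(alpha, ones) + np.outer(ones, beta)
matrix_form = np.sum(M * A) - ones @ (alpha + beta)

print("difference:", abs(vector_form - matrix_form))
```

Here `np.sum(M * A)` is the Frobenius inner product $\langle \mathbf M, \mathbf A\rangle = \mathrm{tr}(\mathbf M^\top \mathbf A)$.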
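Problem (3) is convex with affine equality constraints, and it has a strictly positive feasible point (for example the uniform matrix $\frac{1}{n}\mathbf 1\mathbf 1^\top$), so strong duality holds by Slater's condition and the $\min$ and $\max$ in (4) can be exchanged. Using the identity above, (4) is equivalent to: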
$$\max_{\alpha, \beta}\ -\mathbf 1^\top(\alpha+\beta) + \min_{\mathbf A\geq 0}\ \langle -|\mathbf C|, \mathbf A \rangle + \frac{\eta}{2}\|\mathbf A\|_F^2 + \langle \alpha \mathbf 1^\top + \mathbf 1\beta^\top,\mathbf A \rangle \quad (5)$$
3. Search for the optimal A and the dual solution $\alpha,\beta$
Let $\mathbf K=|\mathbf C|-\alpha \mathbf 1^\top-\mathbf 1\beta^\top$. Then:
$$\langle -|\mathbf C|,\mathbf A\rangle+\langle \alpha \mathbf 1^\top + \mathbf 1\beta^\top,\mathbf A \rangle=\langle -|\mathbf C|+\alpha \mathbf 1^\top + \mathbf 1\beta^\top,\mathbf A \rangle=\langle -\mathbf K,\mathbf A\rangle$$
Therefore, the inner $\min$ term of Eq. (5) becomes:
$$\min_{\mathbf A\geq 0}\ \langle -\mathbf K,\mathbf A\rangle + \frac{\eta}{2}\|\mathbf A\|_F^2 \;=\; \eta\cdot\min_{\mathbf A\geq 0}\left[\left\langle -\frac{\mathbf K}{\eta},\mathbf A\right\rangle + \frac{1}{2}\|\mathbf A\|_F^2\right] \quad (6)$$
Completing the square rewrites Eq. (6) in Frobenius-norm form:
$$(6)=-\frac{1}{2\eta}\|\mathbf K\|_F^2+\eta\min_{\mathbf A\geq 0}\frac{1}{2}\left\|\frac{\mathbf K}{\eta}-\mathbf A\right\|_F^2 \quad (7)$$
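Without the sign constraint, the minimizer would be $\mathbf A=\mathbf K/\eta$. Since $\mathbf A\geq 0$ is required, the minimizer is the projection of $\mathbf K/\eta$ onto the nonnegative orthant, which clips negative entries to zero (written $[\cdot]_+$). Therefore, $\mathbf A$ is given by: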
$$\mathbf A=\frac{1}{\eta}\left[|\mathbf C|-\alpha \mathbf 1^\top-\mathbf 1\beta^\top\right]_+$$
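Write $\mathbf K_+=[\mathbf K]_+$ for the positive part and $\mathbf K_-=\mathbf K-\mathbf K_+$ for the negative part, so that $\|\mathbf K\|_F^2=\|\mathbf K_+\|_F^2+\|\mathbf K_-\|_F^2$. Substituting the optimal $\mathbf A$ into Eq. (7), where $\mathbf K/\eta-\mathbf A=\mathbf K_-/\eta$, we have: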
$$(7)=-\frac{1}{2\eta}\|\mathbf K\|_F^2+\frac{1}{2\eta}\|\mathbf K_-\|_F^2=-\frac{1}{2\eta}\|\mathbf K_+\|_F^2 \quad (8)$$
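To make the closed-form minimizer and the value in Eq. (8) concrete, here is a minimal sketch that computes $\mathbf A=[\mathbf K]_+/\eta$ for arbitrary multipliers and compares it against random nonnegative candidates. The helper names (`primal_from_dual`, `inner_objective`) and the random test data are assumptions made for illustration.

```python
import numpy as np

def primal_from_dual(C, alpha, beta, eta):
    """Return A = [ |C| - alpha 1^T - 1 beta^T ]_+ / eta together with K."""
    K = np.abs(C) - alpha[:, None] - beta[None, :]
    return np.maximum(K, 0.0) / eta, K

def inner_objective(K, A, eta):
    """Inner term of Eq. (5): <-K, A> + (eta / 2) * ||A||_F^2."""
    return -np.sum(K * A) + 0.5 * eta * np.sum(A ** 2)

# Hypothetical inputs: random C with zero diagonal, arbitrary multipliers.
rng = np.random.default_rng(2)
n, eta = 5, 0.5
C = rng.standard_normal((n, n))
np.fill_diagonal(C, 0.0)
alpha = 0.3 * rng.standard_normal(n)
beta = 0.3 * rng.standard_normal(n)

A_star, K = primal_from_dual(C, alpha, beta, eta)
best = inner_objective(K, A_star, eta)

# Value predicted by Eq. (8): -||K_+||_F^2 / (2 * eta).
predicted = -np.sum(np.maximum(K, 0.0) ** 2) / (2.0 * eta)

# No random nonnegative candidate should achieve a lower inner objective.
others = [inner_objective(K, rng.random((n, n)), eta) for _ in range(200)]
print("attained:", best, "predicted:", predicted, "best random:", min(others))
```

The problem separates over entries: each $A_{ij}$ minimizes $-K_{ij}A_{ij}+\frac{\eta}{2}A_{ij}^2$ subject to $A_{ij}\geq 0$, which is why elementwise clipping is enough.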
Finally, combining Eqs. (5)-(8), the dual problem becomes:
$$\max_{\alpha, \beta}\ -\mathbf 1^\top(\alpha+\beta)-\frac{1}{2\eta}\|\mathbf K_+\|_F^2$$
i.e.,
$$\max_{\alpha, \beta}\ -\mathbf 1^\top(\alpha+\beta)-\frac{1}{2\eta}\left\|\left[|\mathbf C|-\alpha \mathbf 1^\top-\mathbf 1\beta^\top\right]_+\right\|_F^2$$
i.e.,
$$\min_{\alpha, \beta}\ \mathbf 1^\top(\alpha+\beta) + \frac{1}{2\eta}\left\|\left[|\mathbf C|-\alpha \mathbf 1^\top-\mathbf 1\beta^\top\right]_+\right\|_F^2$$
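The final unconstrained problem in $\alpha,\beta$ is convex and differentiable, with gradients $\mathbf 1-\mathbf A\mathbf 1$ and $\mathbf 1-\mathbf A^\top\mathbf 1$ where $\mathbf A=[\mathbf K]_+/\eta$. The sketch below minimizes it with plain gradient descent and recovers $\mathbf A$. Gradient descent, the step size $\eta/(2n)$, the iteration count, and the test data are assumptions chosen for illustration; this is not claimed to be the solver used in A-DSSC.

```python
import numpy as np

def dual_objective(C_abs, alpha, beta, eta):
    """1^T(alpha + beta) + ||[ |C| - alpha 1^T - 1 beta^T ]_+||_F^2 / (2 eta)."""
    K = C_abs - alpha[:, None] - beta[None, :]
    return alpha.sum() + beta.sum() + np.sum(np.maximum(K, 0.0) ** 2) / (2.0 * eta)

def solve_dual(C, eta, iters=20000):
    """Gradient descent on the dual; returns the recovered A and the multipliers."""
    C_abs = np.abs(C)
    n = C_abs.shape[0]
    alpha, beta = np.zeros(n), np.zeros(n)
    step = eta / (2.0 * n)  # assumed step size: gradient is (2n / eta)-Lipschitz
    for _ in range(iters):
        A = np.maximum(C_abs - alpha[:, None] - beta[None, :], 0.0) / eta
        # Gradients are the violations of the row-sum and column-sum constraints.
        grad_alpha = 1.0 - A.sum(axis=1)
        grad_beta = 1.0 - A.sum(axis=0)
        alpha = alpha - step * grad_alpha
        beta = beta - step * grad_beta
    A = np.maximum(C_abs - alpha[:, None] - beta[None, :], 0.0) / eta
    return A, alpha, beta

# Hypothetical test problem: random C with zero diagonal.
rng = np.random.default_rng(3)
n, eta = 6, 5.0
C = rng.standard_normal((n, n))
np.fill_diagonal(C, 0.0)

A_opt, alpha_opt, beta_opt = solve_dual(C, eta)
print("row sums:", A_opt.sum(axis=1))
print("col sums:", A_opt.sum(axis=0))
```

At a stationary point both gradients vanish, which is exactly the statement that the recovered $\mathbf A$ has unit row and column sums, so it lies in $\mathbf\Omega_n$.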