Variance: $Var(x) = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$
Let the unit vector along the principal component be $w = (w_1, w_2)$, and subtract the mean $x_{\mu}$ from every $x^{(i)}$. After this step the data has zero mean, so the $\mu$ term in the variance formula vanishes.
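The centering step can be sketched in plain Python. This is a minimal illustration only: the helper names `demean` and `variance` and the three sample points are assumptions made for the example, not part of the original text.

```python
def demean(points):
    """Subtract the per-coordinate mean from every sample (hypothetical helper)."""
    n = len(points)
    dims = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(dims)]
    return [[p[j] - mean[j] for j in range(dims)] for p in points]


def variance(values):
    """Population variance: sum of squared deviations from the mean, divided by n."""
    n = len(values)
    mu = sum(values) / n
    return sum((v - mu) ** 2 for v in values) / n


# Made-up 2-D sample, one point per row.
points = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
centered = demean(points)

# After centering, the variance of a coordinate is just the mean of its squares.
first_coord = [p[0] for p in centered]
mean_of_squares = sum(v ** 2 for v in first_coord) / len(first_coord)
print(centered)
print(variance(first_coord), mean_of_squares)
```

Because the centered data has zero mean, the two printed variance values agree, which is why the later formulas contain no $\mu$ term.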
Since $w$ is a unit vector ($||\vec{w}|| = 1$), the projection of $x^{(i)}$ onto $w$ is $x^{(i)}_{proj} = ||\vec{x^{(i)}}|| \cdot ||\vec{w}|| \cdot \cos\theta = \vec{x^{(i)}} \cdot \vec{w}$, i.e. the dot product of the two vectors.
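As a quick numerical check of this identity, the sketch below computes the projection length in two ways for a 2-D example: once from the angle between the vectors and once as a dot product. The vectors `x` and `w` and the helper names are illustrative assumptions.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))


x = [3.0, 4.0]   # a (centered) sample, made up for illustration
w = [0.8, 0.6]   # unit direction: 0.8^2 + 0.6^2 = 1

# Angle between x and w, taken from their polar angles (2-D only).
theta = math.atan2(x[1], x[0]) - math.atan2(w[1], w[0])

proj_from_angle = norm(x) * norm(w) * math.cos(theta)
proj_from_dot = dot(x, w)
print(proj_from_angle, proj_from_dot)
```

Both values agree up to floating-point rounding, so in practice only the dot product needs to be computed.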
$$Var(x_{proj}) = \frac{\sum_{i=1}^{n} \left(x^{(i)}_{proj}\right)^2}{n} = \frac{\sum_{i=1}^{n} \left(\vec{x^{(i)}} \cdot \vec{w}\right)^2}{n} = \frac{\sum_{i=1}^{n} \left(x^{(i)}_1 \cdot w_1 + x^{(i)}_2 \cdot w_2\right)^2}{n}$$
If the original vector $\vec{x}$ is $m$-dimensional, then

$$Var(x_{proj}) = \frac{\sum_{i=1}^{n} \left(x^{(i)}_1 \cdot w_1 + x^{(i)}_2 \cdot w_2 + \cdots + x^{(i)}_m \cdot w_m\right)^2}{n}$$
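The $m$-dimensional formula translates directly into code. The sketch below assumes the data is already centered and that `w` has unit length; the function name `projection_variance` and the sample data are hypothetical.

```python
def projection_variance(X, w):
    """Mean of squared projections of the rows of X onto w.

    Assumes X is already demeaned and w is a unit vector.
    """
    n = len(X)
    return sum(sum(xj * wj for xj, wj in zip(x, w)) ** 2 for x in X) / n


# Centered 2-D sample (made up for illustration), one point per row.
X = [[-2.0, -2.0], [0.0, 2.0], [2.0, 0.0]]

var_axis = projection_variance(X, [1.0, 0.0])                # along the first axis
var_diag = projection_variance(X, [2 ** -0.5, 2 ** -0.5])    # along the diagonal
print(var_axis, var_diag)
```

For this sample the diagonal direction yields the larger variance, so it is the better candidate of the two for the principal component.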
The problem therefore becomes: given a matrix $X_{n \times m}$ whose $n$ rows are $m$-dimensional data points (row $i$ is the sample $X^{(i)}$), find $W_{m \times 1}$ that maximizes

$$Var(X_{proj}) = \frac{1}{n} \sum_{i=1}^{n} \left(\sum_{j=1}^{m} X_{i,j} \cdot W_j\right)^2$$
That is, maximize

$$Var(X_{proj}) = \frac{1}{n} \cdot \sum_{i=1}^{n} \left(X^{(i)} \cdot W\right)^2$$
Define the objective, viewed as a function of $W$ for fixed data $X$:

$$f(X) = \frac{1}{n} \cdot \sum_{i=1}^{n} \left(X^{(i)} \cdot W\right)^2$$
$$\nabla f = \begin{pmatrix} \frac{\partial f}{\partial W_1} \\ \frac{\partial f}{\partial W_2} \\ \vdots \\ \frac{\partial f}{\partial W_m} \end{pmatrix} = \frac{2}{n} \begin{pmatrix} \sum_{i=1}^{n} \left(X_1^{(i)} W_1 + X_2^{(i)} W_2 + \ldots + X_m^{(i)} W_m\right) X_1^{(i)} \\ \sum_{i=1}^{n} \left(X_1^{(i)} W_1 + X_2^{(i)} W_2 + \ldots + X_m^{(i)} W_m\right) X_2^{(i)} \\ \vdots \\ \sum_{i=1}^{n} \left(X_1^{(i)} W_1 + X_2^{(i)} W_2 + \ldots + X_m^{(i)} W_m\right) X_m^{(i)} \end{pmatrix}$$
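The component-wise gradient can be checked against a finite-difference approximation. The following is a minimal sketch; `f`, `grad_f`, `numeric_grad`, the sample matrix and the test direction are all illustrative names and values, not taken from the original.

```python
def f(X, w):
    """Objective: mean of squared projections of the rows of X onto w."""
    n = len(X)
    return sum(sum(xj * wj for xj, wj in zip(x, w)) ** 2 for x in X) / n


def grad_f(X, w):
    """Analytic gradient: component k is (2/n) * sum_i (X^(i) . w) * X_k^(i)."""
    n = len(X)
    grad = [0.0] * len(w)
    for x in X:
        proj = sum(xj * wj for xj, wj in zip(x, w))
        for k in range(len(w)):
            grad[k] += proj * x[k]
    return [2.0 * g / n for g in grad]


def numeric_grad(X, w, eps=1e-6):
    """Central finite differences, used only to sanity-check grad_f."""
    grad = []
    for k in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[k] += eps
        w_minus[k] -= eps
        grad.append((f(X, w_plus) - f(X, w_minus)) / (2 * eps))
    return grad


X = [[-2.0, -2.0], [0.0, 2.0], [2.0, 0.0]]  # centered sample, made up
w = [0.6, 0.8]                              # unit vector
analytic = grad_f(X, w)
numeric = numeric_grad(X, w)
print(analytic, numeric)
```

The two gradients should agree to several decimal places; a mismatch would usually indicate an indexing error in the analytic formula.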
That is, written as a $1 \times m$ row vector (the transpose of the column form),

$$\nabla f = \frac{2}{n} \left(X^{(1)} W, \ldots, X^{(n)} W\right) \cdot \begin{pmatrix} X_1^{(1)} & X_2^{(1)} & \ldots & X_m^{(1)} \\ X_1^{(2)} & X_2^{(2)} & \ldots & X_m^{(2)} \\ \vdots & \vdots & & \vdots \\ X_1^{(n)} & X_2^{(n)} & \ldots & X_m^{(n)} \end{pmatrix}$$
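The matrix form can be sketched as a row vector of projections multiplied by the data matrix. The helper names `matvec`, `row_times_matrix` and `grad_matrix_form`, and the sample values, are assumptions made for this illustration.

```python
def matvec(X, w):
    """Return the vector (X^(1) . w, ..., X^(n) . w)."""
    return [sum(xj * wj for xj, wj in zip(x, w)) for x in X]


def row_times_matrix(r, X):
    """Multiply a length-n row vector r by the n x m matrix X, giving length m."""
    m = len(X[0])
    return [sum(r[i] * X[i][k] for i in range(len(X))) for k in range(m)]


def grad_matrix_form(X, w):
    """Gradient as (2/n) * (X w)^T X, returned as a flat list of m entries."""
    n = len(X)
    return [2.0 * v / n for v in row_times_matrix(matvec(X, w), X)]


X = [[-2.0, -2.0], [0.0, 2.0], [2.0, 0.0]]  # centered sample, one row per point
w = [0.6, 0.8]                              # unit vector
grad = grad_matrix_form(X, w)
print(grad)
```

Writing the gradient this way replaces the per-component sums with two matrix-vector products, which is the form a vectorized numerical library would evaluate efficiently.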