设对某一事物的研究涉及p个指标,分别用X1,X2,…,Xp表示,这p个指标构成的p维随机向量(或者说矩阵)为X=(X1, X2, …, Xp),一般来说,当数据的指标较多或者维数较多时,会带来分析问题的复杂性。此时能否用某种方法将多个指标转化为几个综合指标,以便达到降维的目的,更简便地分析数据,这种方法就是主成分分析。通常把转化生成的综合指标称为主成分,其中每个主成分都是原始变量的线性组合,且每个成分之间互不相关,使得主成分比原始变量具有某些更优越的性能。
| X11 |X12 | …| X1p
|–|--|-- |–|
|X21 |X22 |…|X2p
| X31 | X32|…|X3p
|…|…|…|…|
| Xm1| Xm2 | …|Xmp
原始p维数据表示为
X
=
[
x
11
x
12
⋯
x
1
p
x
21
x
22
⋯
x
2
p
⋮
⋮
⋱
⋮
x
p
1
x
p
2
⋯
x
p
p
]
X=\begin {bmatrix} x_{11} & x_{12} & \cdots & x_{1p}\\ x_{21} & x_{22} & \cdots & x_{2p}\\ \vdots&\vdots&\ddots & \vdots\\ x_{p1} & x_{p2} & \cdots & x_{pp}\\ \end{bmatrix}
X=⎣⎢⎢⎢⎡x11x21⋮xp1x12x22⋮xp2⋯⋯⋱⋯x1px2p⋮xpp⎦⎥⎥⎥⎤
如上表所示
设随机向量X(矩阵)的均值为
μ
\mu
μ,协方差矩阵为
Σ
\Sigma
Σ,下面给出μ与Σ的求法:
μ
\mu
μ为矩阵各列向量的平均值,设
μ
\mu
μ=[
μ
1
\mu_1
μ1,
μ
2
\mu_2
μ2, …,
μ
p
\mu_p
μp],举例:
μ
1
\mu_1
μ1=
1
m
∑
i
=
0
m
X
1
i
\frac{1}{m}\sum_{i=0}^m X_{1i}
m1∑i=0mX1i,其他分量的平均值与其类似
Σ
\Sigma
Σ的求法:
1.计算协方差矩阵
\\
1.1先将X矩阵标准化,即每列元素都减去所在列向量的平均值
μ
i
\mu_i
μi,得到
(
x
11
−
μ
1
x
12
−
μ
2
⋯
x
1
p
−
μ
p
x
21
−
μ
1
x
22
−
μ
2
⋯
x
2
p
−
μ
p
⋮
⋮
⋱
⋮
x
p
1
−
μ
1
x
p
2
−
μ
2
⋯
x
p
p
−
μ
p
)
=
(
a
1
a
2
⋯
a
p
)
\begin{pmatrix} x_{11}-\mu_1& x_{12}-\mu_2 &\cdots& x_{1p}-\mu_p \\ x_{21}-\mu_1 & x_{22}-\mu_2& \cdots &x_{2p}-\mu_p \\ \vdots&\vdots&\ddots&\vdots\\ x_{p1}-\mu_1 & x_{p2}-\mu_2& \cdots&x_{pp}-\mu_p \\ \end{pmatrix}=\begin{pmatrix} a_1&a_2&\cdots&a_p \end{pmatrix}
⎝⎜⎜⎜⎛x11−μ1x21−μ1⋮xp1−μ1x12−μ2x22−μ2⋮xp2−μ2⋯⋯⋱⋯x1p−μpx2p−μp⋮xpp−μp⎠⎟⎟⎟⎞=(a1a2⋯ap)
1.2设这个矩阵为A,则
Σ
\Sigma
Σ=
1
m
−
1
\frac {1}{m-1}
m−11A’*A,得到
Σ
\Sigma
Σ如下图
Σ
=
[
σ
11
σ
12
⋯
σ
1
p
σ
21
σ
22
⋯
σ
2
p
⋮
⋮
⋱
⋮
σ
p
1
σ
p
2
⋯
σ
p
p
]
\Sigma=\begin{bmatrix} \sigma_{11}& \sigma_{12} &\cdots& \sigma_{1p} \\ \sigma_{21}& \sigma_{22} &\cdots& \sigma_{2p} \\ \vdots&\vdots&\ddots&\vdots\\ \sigma_{p1}& \sigma_{p2} &\cdots& \sigma_{pp} \\ \end{bmatrix}
Σ=⎣⎢⎢⎢⎡σ11σ21⋮σp1σ12σ22⋮σp2⋯⋯⋱⋯σ1pσ2p⋮σpp⎦⎥⎥⎥⎤
1.3在此引出矩阵B,其表达式为
B
=
(
X
11
−
μ
1
D
1
X
12
−
μ
2
D
2
⋯
X
1
p
−
μ
p
D
p
X
21
−
μ
1
D
1
X
22
−
μ
2
D
2
⋯
X
2
p
−
μ
p
D
p
⋮
⋮
⋱
⋮
X
p
1
−
μ
1
D
1
X
p
2
−
μ
2
D
2
⋯
X
p
p
−
μ
p
D
p
)
=
(
b
1
b
2
⋯
b
p
)
B=\begin{pmatrix} \frac{X_{11}-\mu_1}{{\sqrt {D_1}}} & \frac{X_{12}-\mu_2}{{\sqrt {D_2}}} &\cdots& \frac{X_{1p}-\mu_p}{{\sqrt {D_p}}} \\ \frac{X_{21}-\mu_1}{{\sqrt {D_1}}} & \frac{X_{22}-\mu_2}{{\sqrt {D_2}}} &\cdots& \frac{X_{2p}-\mu_p}{{\sqrt {D_p}}} \\ \vdots&\vdots&\ddots&\vdots\\ \frac{X_{p1}-\mu_1}{{\sqrt {D_1}}} & \frac{X_{p2}-\mu_2}{{\sqrt {D_2}}} &\cdots& \frac{X_{pp}-\mu_p}{{\sqrt {D_p}}} \\ \end{pmatrix}=\begin{pmatrix} b_1&b_2&\cdots&b_p \end{pmatrix}
B=⎝⎜⎜⎜⎜⎜⎜⎛D1X11−μ1D1X21−μ1⋮D1Xp1−μ1D2X12−μ2D2X22−μ2⋮D2Xp2−μ2⋯⋯⋱⋯DpX1p−μpDpX2p−μp⋮DpXpp−μp⎠⎟⎟⎟⎟⎟⎟⎞=(b1b2⋯bp)
2.接下来计算相关矩阵R
\\
2.1令
d
=
(
D
1
D
2
)
⋯
D
p
)
d=\begin{pmatrix} \sqrt {D_1}& \sqrt {D_2})&\cdots&\sqrt{D_p}\\ \end{pmatrix}
d=(D1D2)⋯Dp),
D
i
D_i
Di表示原始矩阵每一列向量的方差,
D
=
d
′
∗
d
D=d'*d
D=d′∗d,则D的表达式为
D
=
(
D
1
D
1
D
1
D
2
⋯
D
1
D
p
D
2
D
1
D
2
D
2
⋯
D
1
D
p
⋮
⋮
⋱
⋮
D
p
D
1
D
p
D
2
⋯
D
p
D
p
)
D= \begin{pmatrix} \sqrt {D_{1}D_{1}}& \sqrt {D_{1}D_{2}}&\cdots&\sqrt {D_{1}D_{p}}\\ \sqrt {D_{2}D_{1}}& \sqrt {D_{2}D_{2}}&\cdots&\sqrt {D_{1}D_{p}}\\ \vdots&\vdots&\ddots&\vdots\\ \sqrt {D_{p}D_{1}}& \sqrt {D_{p}D_{2}}&\cdots&\sqrt {D_{p}D_{p}} \\ \end{pmatrix}
D=⎝⎜⎜⎜⎛D1D1D2D1⋮DpD1D1D2D2D2⋮DpD2⋯⋯⋱⋯D1DpD1Dp⋮DpDp⎠⎟⎟⎟⎞
2.2则相关矩阵R=
Σ
\Sigma
Σ./D
R
=
[
σ
11
D
1
D
1
σ
11
D
1
D
1
⋯
σ
11
D
1
D
1
σ
11
D
1
D
1
σ
11
D
1
D
1
⋯
σ
11
D
1
D
1
⋮
⋮
⋱
⋮
σ
11
D
1
D
1
σ
11
D
1
D
1
⋯
σ
11
D
1
D
1
]
R=\begin{bmatrix} \frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}& \frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}&\cdots&\frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}\\ \frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}& \frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}&\cdots&\frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}& \frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}&\cdots&\frac{\sigma_{11}}{\sqrt {D_{1}D_{1}}}\\ \end{bmatrix}
R=⎣⎢⎢⎢⎢⎡D1D1σ11D1D1σ11⋮D1D1σ11D1D1σ11D1D1σ11⋮D1D1σ11⋯⋯⋱⋯D1D1σ11D1D1σ11⋮D1D1σ11⎦⎥⎥⎥⎥⎤
3.关于由协方差矩阵或相关矩阵出发求解主成分
\\
求解主成分的过程实际就是对矩阵结构进行分析的过程,也就是求解特征值的过程。在实际分析过程中,我们既可以从协方差矩阵出发,也可以从相关矩阵出发,其求主成分的过程是一致的。但是从协方差出发和从相关矩阵出发所求得的主成分一般是有差别的。
3.1以从协方差出发求解主成分为例,给出求解主成分的步骤
先求
Σ
\Sigma
Σ的特征向量和特征值,matlab代码为
[u,v]=eig(S) &matlab不支持希腊字母,故用拉丁字母S代表Σ
其中u为由特征向量组成的矩阵,v为特征值组成的对角矩阵
u的表达式为
u
=
[
u
11
u
12
⋯
u
1
p
u
11
u
12
⋯
u
1
p
⋮
⋮
⋱
⋮
u
p
1
u
p
2
⋯
u
p
p
]
u=\begin{bmatrix} u_{11}&u_{12}&\cdots&u_{1p}\\ u_{11}&u_{12}&\cdots&u_{1p}\\ \vdots&\vdots&\ddots&\vdots\\ u_{p1}&u_{p2}&\cdots&u_{pp} \end{bmatrix}
u=⎣⎢⎢⎢⎡u11u11⋮up1u12u12⋮up2⋯⋯⋱⋯u1pu1p⋮upp⎦⎥⎥⎥⎤
v的矩阵为
v
=
[
λ
1
0
⋯
0
0
λ
2
⋯
0
⋮
⋮
⋱
⋮
0
0
⋯
λ
p
]
v=\begin{bmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\lambda_p \end{bmatrix}
v=⎣⎢⎢⎢⎡λ10⋮00λ2⋮0⋯⋯⋱⋯00⋮λp⎦⎥⎥⎥⎤
其中
λ
1
\lambda_1
λ1,
λ
2
\lambda_2
λ2,
⋯
\cdots
⋯,
λ
p
\lambda_p
λp为协方差矩阵
Σ
\Sigma
Σ的特征值,将其依大小顺序排列,不妨设
λ
1
\lambda_1
λ1
⩾
\geqslant
⩾
λ
2
\lambda_2
λ2
⩾
\geqslant
⩾
⋯
\cdots
⋯
⩾
\geqslant
⩾
λ
p
\lambda_p
λp,
γ
1
\gamma_1
γ1,
γ
2
\gamma_2
γ2,
⋯
\cdots
⋯,
γ
p
\gamma_p
γp为矩阵
Σ
\Sigma
Σ各特征值对应的标准正交特征向量,则第
i
i
i个主成分为:
Y
i
Y_i
Yi=
γ
1
i
\gamma_{1i}
γ1i
a
1
a_1
a1+
γ
2
i
\gamma_{2i}
γ2i
a
2
a_2
a2+
⋯
\cdots
⋯+
γ
p
i
\gamma_{pi}
γpi
a
p
a_p
ap,(
i
i
i=1,2,
⋯
\cdots
⋯,
p
p
p),此
γ
\gamma
γ代表的意义与
u
u
u相同
Y
Y
Y的表达式为
Y
Y
Y=
[
Y
1
Y
2
⋯
Y
p
]
\begin{bmatrix} Y_1&Y_2&\cdots&Y_p\\ \end{bmatrix}
[Y1Y2⋯Yp]=A *
u
u
u=
\\
(
x
11
−
μ
1
x
12
−
μ
2
⋯
x
1
p
−
μ
p
x
21
−
μ
1
x
22
−
μ
2
⋯
x
2
p
−
μ
p
⋮
⋮
⋱
⋮
x
p
1
−
μ
1
x
p
2
−
μ
2
⋯
x
p
p
−
μ
p
)
\begin{pmatrix} x_{11}-\mu_1& x_{12}-\mu_2 &\cdots& x_{1p}-\mu_p \\ x_{21}-\mu_1 & x_{22}-\mu_2& \cdots &x_{2p}-\mu_p \\ \vdots&\vdots&\ddots&\vdots\\ x_{p1}-\mu_1 & x_{p2}-\mu_2& \cdots&x_{pp}-\mu_p \\ \end{pmatrix}
⎝⎜⎜⎜⎛x11−μ1x21−μ1⋮xp1−μ1x12−μ2x22−μ2⋮xp2−μ2⋯⋯⋱⋯x1p−μpx2p−μp⋮xpp−μp⎠⎟⎟⎟⎞ *
[
u
11
u
12
⋯
u
1
p
u
21
u
22
⋯
u
1
p
⋮
⋮
⋱
⋮
u
p
1
u
p
2
⋯
u
p
p
]
\begin{bmatrix} u_{11}&u_{12}&\cdots&u_{1p}\\ u_{21}&u_{22}&\cdots&u_{1p}\\ \vdots&\vdots&\ddots&\vdots\\ u_{p1}&u_{p2}&\cdots&u_{pp} \end{bmatrix}
⎣⎢⎢⎢⎡u11u21⋮up1u12u22⋮up2⋯⋯⋱⋯u1pu1p⋮upp⎦⎥⎥⎥⎤
用方程组表示为
{
Y
1
=
a
1
u
11
+
a
2
u
21
+
⋯
+
a
p
u
p
1
Y
2
=
a
1
u
12
+
a
2
u
22
+
⋯
+
a
p
u
p
2
⋮
Y
p
=
a
1
u
1
p
+
a
2
u
2
p
+
⋯
+
a
p
u
p
\left\{ \begin{array}{} Y_1=a_1 u_{11}+a_2u_{21}+\cdots+a_pu_{p1}\\ Y_2=a_1 u_{12}+a_2u_{22}+\cdots+a_pu_{p2}\\ \vdots\\ Y_p=a_1 u_{1p}+a_2u_{2p}+\cdots+a_pu_{p} \end{array} \right.
⎩⎪⎪⎪⎨⎪⎪⎪⎧Y1=a1u11+a2u21+⋯+apup1Y2=a1u12+a2u22+⋯+apup2⋮Yp=a1u1p+a2u2p+⋯+apup
若由相关矩阵求主成分,则
从3.1步开始改变,用相关系数矩阵代表协方差矩阵,求取特征向量和特征值。设得到的特征向量和特征值分别为
u
1
u_1
u1和
v
1
v_1
v1,
Y
Y
Y的表达式为:
Y
∗
=
[
Y
1
∗
Y
2
∗
⋯
Y
p
∗
]
=
B
∗
v
1
=
Y^*=\begin{bmatrix} Y_1^*&Y_2^*&\cdots&Y_p^*\\ \end{bmatrix}=B*v_1=
Y∗=[Y1∗Y2∗⋯Yp∗]=B∗v1=
\\
(
X
11
−
μ
1
D
1
X
12
−
μ
2
D
2
⋯
X
1
p
−
μ
p
D
p
X
21
−
μ
1
D
1
X
22
−
μ
2
D
2
⋯
X
2
p
−
μ
p
D
p
⋮
⋮
⋱
⋮
X
p
1
−
μ
1
D
1
X
p
2
−
μ
2
D
2
⋯
X
p
p
−
μ
p
D
p
)
∗
[
v
11
v
12
⋯
v
1
p
v
21
v
22
⋯
v
1
p
⋮
⋮
⋱
⋮
v
p
1
v
p
2
⋯
v
p
p
]
\begin{pmatrix} \frac{X_{11}-\mu_1}{{\sqrt {D_1}}} & \frac{X_{12}-\mu_2}{{\sqrt {D_2}}} &\cdots& \frac{X_{1p}-\mu_p}{{\sqrt {D_p}}} \\ \frac{X_{21}-\mu_1}{{\sqrt {D_1}}} & \frac{X_{22}-\mu_2}{{\sqrt {D_2}}} &\cdots& \frac{X_{2p}-\mu_p}{{\sqrt {D_p}}} \\ \vdots&\vdots&\ddots&\vdots\\ \frac{X_{p1}-\mu_1}{{\sqrt {D_1}}} & \frac{X_{p2}-\mu_2}{{\sqrt {D_2}}} &\cdots& \frac{X_{pp}-\mu_p}{{\sqrt {D_p}}} \\ \end{pmatrix}* \begin{bmatrix} v_{11}&v_{12}&\cdots&v_{1p}\\ v_{21}&v_{22}&\cdots&v_{1p}\\ \vdots&\vdots&\ddots&\vdots\\ v_{p1}&v_{p2}&\cdots&v_{pp} \end{bmatrix}
⎝⎜⎜⎜⎜⎜⎜⎛D1X11−μ1D1X21−μ1⋮D1Xp1−μ1D2X12−μ2D2X22−μ2⋮D2Xp2−μ2⋯⋯⋱⋯DpX1p−μpDpX2p−μp⋮DpXpp−μp⎠⎟⎟⎟⎟⎟⎟⎞∗⎣⎢⎢⎢⎡v11v21⋮vp1v12v22⋮vp2⋯⋯⋱⋯v1pv1p⋮vpp⎦⎥⎥⎥⎤
用方程组表示为
{
Y
1
∗
=
b
1
v
11
+
b
2
v
21
+
⋯
+
b
p
v
p
1
Y
2
∗
=
b
1
v
12
+
b
2
v
22
+
⋯
+
b
p
v
p
2
⋮
Y
p
∗
=
b
1
v
1
p
+
b
2
v
2
p
+
⋯
+
b
p
v
p
\left\{ \begin{array}{} Y_1^*=b_1 v_{11}+b_2v_{21}+\cdots+b_pv_{p1}\\ Y_2^*=b_1 v_{12}+b_2v_{22}+\cdots+b_pv_{p2}\\ \vdots\\ Y_p^*=b_1 v_{1p}+b_2v_{2p}+\cdots+b_pv_{p} \end{array} \right.
⎩⎪⎪⎪⎨⎪⎪⎪⎧Y1∗=b1v11+b2v21+⋯+bpvp1Y2∗=b1v12+b2v22+⋯+bpvp2⋮Yp∗=b1v1p+b2v2p+⋯+bpvp
\\
4.接下来是实际例子
4.1假定我们研究某一经济问题共涉及两个指标:产值和利税。其中产值以百万计,利税以万元计,得原始资料矩阵如下:
X
=
[
12.5
586
24
754
15.3
850
18
667
31.2
750
]
X=\begin{bmatrix} 12.5&586\\ 24&754\\ 15.3&850\\ 18&667\\ 31.2&750 \end{bmatrix}
X=⎣⎢⎢⎢⎢⎡12.52415.31831.2586754850667750⎦⎥⎥⎥⎥⎤
可以得到原始变量的协方差矩阵与相关阵分别为
Σ
=
[
55.9
242.7
242.7
9927.8
]
,
R
=
[
1
0.3257
0.3257
1
]
\Sigma=\begin{bmatrix} 55.9&242.7\\ 242.7&9927.8 \end{bmatrix},R=\begin{bmatrix} 1&0.3257\\ 0.3257&1 \end{bmatrix}
Σ=[55.9242.7242.79927.8],R=[10.32570.32571]
由协方差阵出发求解主成分,得到结果如下表
[u,v]=eig(S) &matlab不支持希腊字母,故用拉丁字母S代表Σ
u
=
[
−
0.997
0.0246
0.0246
0.997
]
,
v
=
[
49.9
0
0
9933.76
]
u=\begin{bmatrix} -0.997&0.0246\\ 0.0246&0.997\\ \end{bmatrix},v=\begin{bmatrix} 49.9&0\\ 0&9933.76 \end{bmatrix}
u=[−0.9970.02460.02460.997],v=[49.9009933.76]
4.2因此所得的主成分的表达式为
Y
1
∗
=
0.0246
(
x
−
μ
1
)
+
0.997
(
x
−
μ
2
)
Y
2
∗
=
−
0.9997
(
x
−
μ
1
)
+
0.0246
(
x
−
μ
2
)
\begin{array}{} Y_1^*=0.0246( x-\mu_1)+0.997(x-\mu_2)\\ Y_2^*=-0.9997( x-\mu_1)+0.0246(x-\mu_2)\\ \end{array}
Y1∗=0.0246(x−μ1)+0.997(x−μ2)Y2∗=−0.9997(x−μ1)+0.0246(x−μ2)
a
,
b
,
c
,
d
,
e
,
f
,
g
,
h
,
i
,
j
,
k
,
l
,
m
,
n
,
o
,
p
,
q
,
r
,
s
,
t
,
u
,
v
,
w
,
x
,
y
,
z
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z
a
‾
a
‾
\overline{a}\overline{a}
aa=
a
a
‾
\overline{aa}
aa
a
^
\widehat{a}
a
a
^
\widehat{a}
a
a
~
a
~
\widetilde{a}\widetilde{a}
a
a