For the matrix equation:
$$
A x=\left[\begin{array}{c}(\operatorname{row} 1) \\ (\operatorname{row} 2) \\ \vdots \\ (\operatorname{row} m)\end{array}\right]\left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right]=\left[\begin{array}{c}(\operatorname{row} 1) \cdot x \\ (\operatorname{row} 2) \cdot x \\ \vdots \\ (\operatorname{row} m) \cdot x\end{array}\right]=b
$$
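When the number of rows $m$ exceeds the number of columns $n$, the system is overdetermined and may have no exact solution.
The goal of the least-squares method is to find the $\hat{x}$ that minimizes $E=\|A x-b\|^{2}$.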
By the matrix derivative identity $\frac{\partial}{\partial \mathbf{X}}(\mathbf{X b}+\mathbf{c})^{T} \mathbf{D}(\mathbf{X b}+\mathbf{c})=\left(\mathbf{D}+\mathbf{D}^{T}\right)(\mathbf{X b}+\mathbf{c}) \mathbf{b}^{T}$ and its vector counterpart $\frac{\partial}{\partial x}(A x+c)^{T} D(A x+c)=A^{T}\left(D+D^{T}\right)(A x+c)$, taking $D=I$ and $c=-b$ and setting the gradient of $E$ to zero gives:
$$
\frac{\partial E}{\partial x}=\frac{\partial}{\partial x}(A x-b)^{T}(A x-b)=2 A^{T}(A x-b)=0
$$
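As a cross-check that does not rely on the derivative identity, the same gradient can be obtained by expanding the quadratic directly. This is only a sketch of the intermediate steps, using the standard facts $\frac{\partial}{\partial x}\left(x^{T} M x\right)=\left(M+M^{T}\right) x$ and $\frac{\partial}{\partial x}\left(v^{T} x\right)=v$:

$$
\begin{aligned}
E &= (A x-b)^{T}(A x-b) = x^{T} A^{T} A x-2 b^{T} A x+b^{T} b, \\
\frac{\partial E}{\partial x} &= 2 A^{T} A x-2 A^{T} b = 2 A^{T}(A x-b).
\end{aligned}
$$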
Therefore:
$$
A^{T} A \hat{x}=A^{T} b
$$
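where $\hat{x}$ is the least-squares solution. If $A$ has full column rank, $A^{T} A$ is invertible and $\hat{x}=\left(A^{T} A\right)^{-1} A^{T} b$.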
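The normal equations can be tried out numerically. The following is a minimal sketch using NumPy, not a definitive implementation: the function name `least_squares_normal_equations` and the small 3×2 example data are illustrative assumptions, and the code assumes $A$ has full column rank so that $A^{T}A$ is invertible.

```python
import numpy as np


def least_squares_normal_equations(A, b):
    """Solve A^T A x = A^T b for x (assumes A has full column rank)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Solve the n x n linear system instead of forming an explicit inverse.
    return np.linalg.solve(A.T @ A, A.T @ b)


# Hypothetical overdetermined system: m = 3 equations, n = 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

x_hat = least_squares_normal_equations(A, b)

# The gradient 2 A^T (A x - b) should vanish at the least-squares solution.
gradient = 2 * A.T @ (A @ x_hat - b)

# Reference solution from NumPy's built-in least-squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

print("x_hat    =", x_hat)
print("gradient =", gradient)
print("x_ref    =", x_ref)
```

For ill-conditioned $A$, forming $A^{T}A$ squares the condition number, so a QR- or SVD-based solver such as `np.linalg.lstsq` is usually the safer choice in practice; the normal equations are shown here because they mirror the derivation.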