Linear algebra basics:
Suppose the vector $A = \begin{bmatrix} a & b & c & d \end{bmatrix}$; then its transpose is $A^T = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}$.
Here $A^2$ denotes the product of $A$ with its own transpose, which is the sum of the squared entries:
$$
A^2 = A A^T = \begin{bmatrix} a & b & c & d \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = a^2 + b^2 + c^2 + d^2
$$
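A quick numerical check of the row-times-column product above, as a minimal sketch using NumPy. The values 1, 2, 3, 4 are hypothetical stand-ins for $a, b, c, d$.

```python
import numpy as np

# Hypothetical values standing in for a, b, c, d.
A = np.array([[1.0, 2.0, 3.0, 4.0]])    # row vector, shape (1, 4)

A_T = A.T                                # column vector, shape (4, 1)
product = A @ A_T                        # 1x1 matrix holding A * A^T
sum_of_squares = float(np.sum(A ** 2))   # a^2 + b^2 + c^2 + d^2

print(A_T.shape)        # (4, 1)
print(product.item())   # 30.0
print(sum_of_squares)   # 30.0
```

The product of a $1 \times 4$ row and a $4 \times 1$ column is a $1 \times 1$ matrix, which is why `.item()` is used to read it as a scalar.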
Matrix differentiation basics:
$$
\begin{array}{ll}
\dfrac{\partial x^T}{\partial x} = I & \dfrac{\partial x}{\partial x^T} = I \\[2ex]
\dfrac{\partial Ax}{\partial x} = A^T & \dfrac{\partial xA}{\partial x} = A^T \\[2ex]
\dfrac{\partial Ax}{\partial x^T} = A & \dfrac{\partial x^TA}{\partial x} = A
\end{array}
$$
Here $x$ is a column vector and $I$ is the identity matrix (which reduces to 1 when $x$ is a scalar).
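One of these identities, $\frac{\partial Ax}{\partial x^T} = A$, can be checked numerically. The sketch below approximates the Jacobian of $f(x) = Ax$ by central differences. The helper `numerical_jacobian` and the sample values of `A` and `x` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate J[i, j] = d f_i / d x_j by central differences."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x))
    jac = np.zeros((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        # Perturb only the j-th coordinate to fill the j-th column.
        jac[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return jac

# Hypothetical sample matrix and point.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([0.5, -1.0, 2.0])

jac = numerical_jacobian(lambda v: A @ v, x)
matches_A = bool(np.allclose(jac, A, atol=1e-6))
print(matches_A)   # True
```

The entry `jac[i, j]` is the derivative of the $i$-th output with respect to $x_j$, which is the layout of $\frac{\partial Ax}{\partial x^T}$; transposing it gives the $A^T$ that appears in $\frac{\partial Ax}{\partial x}$.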
Loss function:
$$
\begin{aligned}
J(\theta) &= (h(x_1) - y_1)^2 + (h(x_2) - y_2)^2 + \dots + (h(x_m) - y_m)^2 \\
&= \sum_{i=1}^{m} (h(x_i) - y_i)^2 \\
&= (h(x) - y)^T (h(x) - y)
\end{aligned}
$$
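The equality between the summed form and the vector form of the loss can be illustrated with a small sketch. The prediction and target values below are made up for the example and do not come from any dataset.

```python
import numpy as np

# Hypothetical predictions h(x_i) and true values y_i for m = 4 samples.
h = np.array([2.5, 0.0, 2.0, 8.0])
y = np.array([3.0, -0.5, 2.0, 7.0])

residual = h - y

# Element-by-element sum of squared errors.
loss_sum = float(sum((h_i - y_i) ** 2 for h_i, y_i in zip(h, y)))

# Vector form (h - y)^T (h - y).
loss_vec = float(residual @ residual)

print(loss_sum)   # 1.5
print(loss_vec)   # 1.5
```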
Deriving the normal equation
- Rewrite the loss function in matrix form:
$$
J(\theta) = (X\theta - y)^T (X\theta - y)
$$
where $y$ is the vector of true values, $X$ is the feature matrix, and $\theta$ is the weight vector.
- To find the minimum of the cost function, differentiate the loss with respect to $\theta$ and set the derivative to 0. Note that $(X\theta - y)^T = \theta^T X^T - y^T$, and that the two cross terms $\theta^T X^T y$ and $y^T X \theta$ are equal scalars:
$$
\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta} &= \frac{\partial [(\theta^T X^T - y^T)(X\theta - y)]}{\partial \theta} \\
&= \frac{\partial (\theta^T X^T X \theta - \theta^T X^T y - y^T X \theta + y^T y)}{\partial \theta} \\
&= X^T X \theta + X^T X \theta - X^T y - X^T y \\
&= 2X^T X \theta - 2X^T y = 0
\end{aligned}
$$
Therefore:
$$
X^T X \theta = X^T y
$$
Assuming $X^T X$ is invertible, left-multiply both sides by $(X^T X)^{-1}$:
$$
(X^T X)^{-1} (X^T X) \theta = (X^T X)^{-1} X^T y
$$
$$
\theta = (X^T X)^{-1} X^T y
$$
This gives the normal equation.
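The closed-form solution can be sketched in NumPy as follows. This is an illustrative example under stated assumptions, not a reference implementation: the synthetic data, the `true_theta` values, and the helper name `normal_equation` are all hypothetical. It solves $X^T X \theta = X^T y$ with `np.linalg.solve` instead of forming the inverse explicitly, which yields the same $\theta$ when $X^T X$ is invertible and is usually better behaved numerically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): m samples, an intercept column plus 2 features.
m, n = 50, 3
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n - 1))])
true_theta = np.array([1.0, 2.0, -3.0])
y = X @ true_theta + 0.01 * rng.normal(size=m)   # small additive noise

def normal_equation(X, y):
    """Solve X^T X theta = X^T y for theta (assumes X^T X is invertible)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

theta = normal_equation(X, y)

# The gradient 2 X^T (X theta - y) should vanish at the solution.
gradient = 2 * X.T @ (X @ theta - y)

# Cross-check against NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta)
print(np.allclose(theta, theta_lstsq))
```

If $X^T X$ is singular, for example when two feature columns are linearly dependent, `np.linalg.solve` raises `LinAlgError`; in that case a least-squares solver such as `np.linalg.lstsq` is one possible fallback.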