$$
\begin{gathered}
D=\left\{(x_{1},y_{1}),(x_{2},y_{2}),\cdots ,(x_{N},y_{N})\right\}\\
x_{i}\in \mathbb{R}^{p},\quad y_{i}\in \mathbb{R},\quad i=1,2,\cdots ,N\\
X=\begin{pmatrix}
x_{1} & x_{2} & \cdots & x_{N}
\end{pmatrix}^{T}=\begin{pmatrix}
x_{1}^{T} \\ x_{2}^{T} \\ \vdots \\ x_{N}^{T}
\end{pmatrix}=\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & & \vdots \\
x_{N1} & x_{N2} & \cdots & x_{Np}
\end{pmatrix}_{N \times p}\\
Y=\begin{pmatrix}
y_{1} \\ y_{2} \\ \vdots \\ y_{N}
\end{pmatrix}_{N \times 1}
\end{gathered}
$$
Therefore, for the least-squares estimate, we have
$$
\begin{aligned}
L(\omega)&=\sum\limits_{i=1}^{N}||\omega^{T}x_{i}-y_{i}||^{2}\\
&=\sum\limits_{i=1}^{N}(\omega^{T}x_{i}-y_{i})^{2}\\
&=\begin{pmatrix}
\omega^{T}x_{1}-y_{1} & \omega^{T}x_{2}-y_{2} & \cdots & \omega^{T}x_{N}-y_{N}
\end{pmatrix}\begin{pmatrix}
\omega^{T}x_{1}-y_{1} \\ \omega^{T}x_{2}-y_{2} \\ \vdots \\ \omega^{T}x_{N}-y_{N}
\end{pmatrix}\\
&=\left[\begin{pmatrix}
\omega^{T}x_{1} & \omega^{T}x_{2} & \cdots & \omega^{T}x_{N}
\end{pmatrix}-\begin{pmatrix}
y_{1} & y_{2} & \cdots & y_{N}
\end{pmatrix}\right]\begin{pmatrix}
\omega^{T}x_{1}-y_{1} \\ \omega^{T}x_{2}-y_{2} \\ \vdots \\ \omega^{T}x_{N}-y_{N}
\end{pmatrix}\\
&=\left[\omega^{T}\begin{pmatrix}
x_{1} & x_{2} & \cdots & x_{N}
\end{pmatrix}-\begin{pmatrix}
y_{1} & y_{2} & \cdots & y_{N}
\end{pmatrix}\right]\begin{pmatrix}
\omega^{T}x_{1}-y_{1} \\ \omega^{T}x_{2}-y_{2} \\ \vdots \\ \omega^{T}x_{N}-y_{N}
\end{pmatrix}\\
&=(\omega^{T}X^{T}-Y^{T})\begin{pmatrix}
\omega^{T}x_{1}-y_{1} \\ \omega^{T}x_{2}-y_{2} \\ \vdots \\ \omega^{T}x_{N}-y_{N}
\end{pmatrix}\\
&=(\omega^{T}X^{T}-Y^{T})(X \omega-Y)\\
&=\omega^{T}X^{T}X \omega-2 \omega^{T}X^{T}Y+Y^{T}Y
\end{aligned}
$$
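As a quick sanity check on the derivation, the sketch below compares the per-sample sum $\sum_{i}(\omega^{T}x_{i}-y_{i})^{2}$ with the expanded matrix form $\omega^{T}X^{T}X\omega-2\omega^{T}X^{T}Y+Y^{T}Y$ on random data. The function names `loss_sum` and `loss_matrix`, the sizes `N = 5` and `p = 3`, and the random seed are illustrative assumptions, not part of the original notes.

```python
import numpy as np


def loss_sum(w, X, Y):
    """Per-sample form: sum over i of (w^T x_i - y_i)^2."""
    return sum((w @ x_i - y_i) ** 2 for x_i, y_i in zip(X, Y[:, 0]))


def loss_matrix(w, X, Y):
    """Expanded matrix form: w^T X^T X w - 2 w^T X^T Y + Y^T Y."""
    w_col = w.reshape(-1, 1)  # treat w as a p x 1 column vector
    value = w_col.T @ X.T @ X @ w_col - 2 * w_col.T @ X.T @ Y + Y.T @ Y
    return value.item()  # the result is a 1 x 1 matrix; extract the scalar


# Hypothetical toy data: N samples stacked as rows of X, targets in a column Y.
rng = np.random.default_rng(0)
N, p = 5, 3
X = rng.normal(size=(N, p))   # shape N x p, row i is x_i^T
Y = rng.normal(size=(N, 1))   # shape N x 1
w = rng.normal(size=p)        # an arbitrary weight vector

# Both forms should agree up to floating-point error.
assert np.isclose(loss_sum(w, X, Y), loss_matrix(w, X, Y))
```

The two cross terms $\omega^{T}X^{T}Y$ and $Y^{T}X\omega$ are scalars and transposes of each other, which is why they merge into the single term $-2\omega^{T}X^{T}Y$ that the matrix version uses.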