MIT Linear Algebra Notes - Lecture 16 - Projection Matrices and Least Squares

16. Projection Matrices and Least Squares

  1. Proofs of the two extreme cases

    Claim: when $\vec{b}$ lies in the column space of $A$, its projection is $\vec{b}$ itself.

        Let $\vec{b} = A \vec{x}$. Then $P \vec{b} = A (A^T A)^{-1} A^T A \vec{x} = A \vec{x} = \vec{b}$.

    Claim: when $\vec{b}$ is orthogonal to the column space of $A$, its projection is $\vec{0}$.

        Since $\vec{b}$ is orthogonal to every column of $A$, we have $A^T \vec{b} = \vec{0}$, so $P \vec{b} = A (A^T A)^{-1} A^T \vec{b} = \vec{0}$.
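The two extreme cases can be checked numerically. The sketch below is only an illustration: the matrix `A`, the vectors `b_in` and `b_perp`, and the helper functions are assumptions chosen for this example, and exact fractions are used so the comparisons are not affected by rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*X)]

def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projection_matrix(A):
    """P = A (A^T A)^{-1} A^T for a matrix A with two independent columns."""
    At = transpose(A)
    return matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

# Illustrative matrix (an assumption for this sketch).
A = [[Fraction(1), Fraction(1)],
     [Fraction(2), Fraction(1)],
     [Fraction(3), Fraction(1)]]
P = projection_matrix(A)

# b_in = 2 * (column 1) + 1 * (column 2), so it lies in the column space.
b_in = [[Fraction(3)], [Fraction(5)], [Fraction(7)]]
# b_perp has zero dot product with both columns of A.
b_perp = [[Fraction(1)], [Fraction(-2)], [Fraction(1)]]

proj_in = matmul(P, b_in)      # expected to equal b_in
proj_perp = matmul(P, b_perp)  # expected to be the zero vector
```

Here `proj_in` reproduces `b_in` and `proj_perp` is the zero vector, matching the two proofs.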

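  2. When $P$ projects $\vec{b}$ onto a subspace to give $\vec{p}$, the remaining component $\vec{e} = \vec{b} - \vec{p}$ is the projection of $\vec{b}$ onto the orthogonal complement of that subspace. The projection matrix for the orthogonal complement is therefore $I - P$, and the projection matrices of two subspaces that are orthogonal complements of each other sum to $I$.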
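As a short check that $I - P$ really behaves like a projection matrix, the lines below use only the two standard properties $P^2 = P$ and $P^T = P$, which are assumed here as known facts about projection matrices rather than re-derived:

$$
\begin{aligned}
(I - P)^2 &= I - 2P + P^2 = I - 2P + P = I - P \\
(I - P)^T &= I - P^T = I - P \\
P (I - P) &= P - P^2 = 0 \\
P \vec{b} + (I - P) \vec{b} &= \vec{p} + \vec{e} = \vec{b}
\end{aligned}
$$

The third line gives $\vec{p}^T \vec{e} = \vec{b}^T P^T (I - P) \vec{b} = \vec{b}^T P (I - P) \vec{b} = 0$, so the two components are orthogonal.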

  3. Least squares

    Given the points $(x_{1} , y_{1}) , \cdots , (x_{n} , y_{n})$, let the fitted line be $\widehat{y} = \widehat{b} x + \widehat{a}$.

    $$A = \begin{bmatrix} x_{1} & 1 \\ \vdots & \vdots \\ x_{n} & 1 \end{bmatrix} , \quad \widehat{\vec{x}} = \begin{bmatrix} \widehat{b} \\ \widehat{a} \end{bmatrix} , \quad \vec{b} = \begin{bmatrix} y_{1} \\ \vdots \\ y_{n} \end{bmatrix}$$

    To minimize the total error, we need a vector $\vec{b}'$ in the column space of $A$ that makes $|\vec{b} - \vec{b}'|$ as small as possible (equivalently, one that minimizes $(y_1 - \widehat{y_1})^2 + \cdots + (y_n - \widehat{y_n})^2$).

    Proof that the error is smallest when $\vec{b}' = \vec{p}$:

        With this choice the error is $|\vec{e}|$.

        Suppose we pick some other vector in the column space of $A$ and write it as $\vec{b}' = \vec{p} + \vec{a}$ with $\vec{a} \neq \vec{0}$. Then $\vec{b} - \vec{b}' = \vec{e} - \vec{a}$, and $\vec{a} = \vec{b}' - \vec{p}$ also lies in the column space.

        Since $\vec{e}$ is orthogonal to every vector in the column space of $A$, and $\vec{a}$ belongs to that column space, $\vec{e}$ is orthogonal to $\vec{a}$.

        The cross term $-2 \vec{e}^T \vec{a}$ therefore vanishes, so $|\vec{e} - \vec{a}|^2 = |\vec{e}|^2 + |\vec{a}|^2 > |\vec{e}|^2$, that is, $|\vec{e} - \vec{a}| > |\vec{e}|$.

        So choosing any vector in the column space of $A$ other than $\vec{p}$ increases the error.

    Here $\vec{p} = (\widehat{y_1} , \cdots , \widehat{y_n})$ and $\vec{e} = (y_1 - \widehat{y_1} , \cdots , y_n - \widehat{y_n})$.
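The Pythagorean step in the proof can be illustrated with exact arithmetic. In this sketch the columns, the vectors `p` and `e`, and the list of shifts are all assumptions picked for the illustration: `b` is built as `p + e` with `e` orthogonal to the column space, so `p` is the projection of `b` by construction.

```python
from fractions import Fraction as F

# Columns of an illustrative matrix A (an assumption for this sketch).
cols = [[F(1), F(2), F(3)], [F(1), F(1), F(1)]]

def combo(c1, c2):
    """Return c1 * column1 + c2 * column2, a vector in the column space."""
    return [c1 * u + c2 * v for u, v in zip(*cols)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def sq_norm(v):
    return dot(v, v)

p = combo(F(2), F(1))                    # (3, 5, 7), lies in the column space
e = [F(1), F(-2), F(1)]                  # orthogonal to both columns
b = [pi + ei for pi, ei in zip(p, e)]    # so p is the projection of b

orthogonal = [dot(e, col) for col in cols]

# Replace p by p + a for several nonzero column-space vectors a.
shifts = [(F(1), F(0)), (F(0), F(1)), (F(-1, 3), F(2)), (F(1, 100), F(-1, 100))]
checks = []
for c1, c2 in shifts:
    a = combo(c1, c2)
    b_prime = [pi + ai for pi, ai in zip(p, a)]
    err = sq_norm([bi - bpi for bi, bpi in zip(b, b_prime)])
    # |b - b'|^2 should equal |e|^2 + |a|^2 and exceed |e|^2.
    checks.append(err == sq_norm(e) + sq_norm(a) and err > sq_norm(e))
```

Every entry of `checks` is `True`: moving away from `p` inside the column space adds exactly `|a|^2` to the squared error.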

    Deriving the least-squares formulas:

        Method 1: $\because P = A (A^T A)^{-1} A^T$

        $$
        \begin{aligned}
        \therefore \vec{p} & = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \left( \begin{bmatrix} x_1 & \cdots & x_n \\ 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} x_1 & \cdots & x_n \\ 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \\
        & = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} \sum{x_i^2} & n \overline{x} \\ n \overline{x} & n \end{bmatrix}^{-1} \begin{bmatrix} \sum{x_i y_i} \\ n \overline{y} \end{bmatrix} \\
        & = \begin{bmatrix} x_1 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix} \begin{bmatrix} \dfrac{1}{t} & \dfrac{-n \overline{x}}{nt} \\ \dfrac{-\overline{x}}{t} & \dfrac{\sum{x_i^2}}{nt} \end{bmatrix} \begin{bmatrix} \sum{x_i y_i} \\ n \overline{y} \end{bmatrix} \\
        & = \begin{bmatrix} \dfrac{n (x_1 - \overline{x})}{nt} & \dfrac{\sum{x_i^2} - n \overline{x} x_1}{nt} \\ \vdots & \vdots \\ \dfrac{n (x_n - \overline{x})}{nt} & \dfrac{\sum{x_i^2} - n \overline{x} x_n}{nt} \end{bmatrix} \begin{bmatrix} \sum{x_i y_i} \\ n \overline{y} \end{bmatrix} \\
        & = \begin{bmatrix} \dfrac{1}{t} \left( \sum{x_i y_i} (x_1 - \overline{x}) + \overline{y} (\sum{x_i^2} - n \overline{x} x_1) \right) \\ \vdots \\ \dfrac{1}{t} \left( \sum{x_i y_i} (x_n - \overline{x}) + \overline{y} (\sum{x_i^2} - n \overline{x} x_n) \right) \end{bmatrix}
        \end{aligned}
        $$

        where $t = \sum{x_i^2} - n \overline{x}^2$ (so that $\det(A^T A) = nt$).

        The first two components of $\vec{p}$ are the fitted values $\widehat{b} x_1 + \widehat{a}$ and $\widehat{b} x_2 + \widehat{a}$. Taking $x_1 \neq x_2$, this gives:

        $$\left\{\begin{matrix} \widehat{b} x_1 + \widehat{a} = \dfrac{1}{t} \left( \sum{x_i y_i} (x_1 - \overline{x}) + \overline{y} (\sum{x_i^2} - n \overline{x} x_1) \right) \\ \widehat{b} x_2 + \widehat{a} = \dfrac{1}{t} \left( \sum{x_i y_i} (x_2 - \overline{x}) + \overline{y} (\sum{x_i^2} - n \overline{x} x_2) \right) \end{matrix}\right.$$

        Subtracting the two equations to eliminate $\widehat{a}$ and then substituting back gives: $\left\{\begin{matrix} \widehat{b} = \dfrac{1}{t} (\sum{x_i y_i} - n \overline{x} \overline{y}) = \dfrac{\sum{x_i y_i} - n \overline{x} \overline{y}}{\sum x_i^2 - n \overline{x}^2} \\ \widehat{a} = \overline{y} - \widehat{b} \overline{x} \end{matrix}\right.$
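The closed-form result of Method 1 translates directly into a few lines of code. This is a minimal sketch, not a library routine: the function name `fit_line` and the input format (a list of `(x, y)` pairs) are assumptions, and exact fractions are used so the output can be compared with hand calculations.

```python
from fractions import Fraction

def fit_line(points):
    """Return (slope, intercept) of the least-squares line.

    Uses slope = (sum(x*y) - n*x_bar*y_bar) / (sum(x^2) - n*x_bar^2)
    and intercept = y_bar - slope * x_bar.
    """
    n = len(points)
    x_bar = sum(Fraction(x) for x, _ in points) / n
    y_bar = sum(Fraction(y) for _, y in points) / n
    sum_xy = sum(Fraction(x) * y for x, y in points)
    sum_xx = sum(Fraction(x) * x for x, _ in points)
    t = sum_xx - n * x_bar ** 2
    if t == 0:
        # All x values coincide, so A^T A is singular and no unique line exists.
        raise ValueError("all x values are equal; the slope is undefined")
    slope = (sum_xy - n * x_bar * y_bar) / t
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Points that lie exactly on y = 2x + 1 are recovered exactly.
slope, intercept = fit_line([(0, 1), (1, 3), (2, 5)])
```

The guard on `t == 0` corresponds to the case where the columns of $A$ are dependent, in which $A^T A$ has no inverse.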

        Method 2: for the fitted line, the previous lecture gives $A^T A \widehat{\vec{x}} = A^T \vec{b}$.

        A convenient way to organize the computation: form $A^T \begin{bmatrix} A \mid \vec{b} \end{bmatrix} = \begin{bmatrix} A^T A \mid A^T \vec{b} \end{bmatrix}$, then read off the system $A^T A \widehat{\vec{x}} = A^T \vec{b}$ and solve it for $\widehat{\vec{x}}$.

        We have $A^T \vec{b} = \begin{bmatrix} \sum{x_{i} y_{i}} \\ n \overline{y} \end{bmatrix} , \quad A^T A = \begin{bmatrix} \sum x_i^2 & n \overline{x} \\ n \overline{x} & n \end{bmatrix}$

        So the system is: $\left\{\begin{matrix} \sum x_i^2 \, \widehat{b} + n \overline{x} \, \widehat{a} = \sum{x_i y_i} \\ n \overline{x} \, \widehat{b} + n \widehat{a} = n \overline{y} \end{matrix}\right.$, whose solution is: $\left\{\begin{matrix} \widehat{b} = \dfrac{\sum{x_i y_i} - n \overline{x} \overline{y}}{\sum x_i^2 - n \overline{x}^2} \\ \widehat{a} = \overline{y} - \widehat{b} \overline{x} \end{matrix}\right.$
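Method 2 can also be sketched in code by building $A^T A$ and $A^T \vec{b}$ and solving the resulting 2x2 system by elimination. The function names and the sample points below are assumptions made for the illustration, and the elimination step assumes the top-left entry $\sum x_i^2$ is nonzero.

```python
from fractions import Fraction

def normal_equations(points):
    """Return (A^T A, A^T b) for A with rows [x_i, 1] and b = (y_1, ..., y_n)."""
    xs = [Fraction(x) for x, _ in points]
    ys = [Fraction(y) for _, y in points]
    n = len(points)
    AtA = [[sum(x * x for x in xs), sum(xs)],
           [sum(xs), Fraction(n)]]
    Atb = [sum(x * y for x, y in zip(xs, ys)), sum(ys)]
    return AtA, Atb

def solve_2x2(M, rhs):
    """Solve M [u, v]^T = rhs by eliminating u from the second equation.

    Assumes M[0][0] != 0 and that M is invertible.
    """
    (m11, m12), (m21, m22) = M
    factor = m21 / m11
    v = (rhs[1] - factor * rhs[0]) / (m22 - factor * m12)
    u = (rhs[0] - m12 * v) / m11
    return u, v

# Illustrative points (an assumption): (0, 6), (1, 0), (2, 0).
AtA, Atb = normal_equations([(0, 6), (1, 0), (2, 0)])
slope, intercept = solve_2x2(AtA, Atb)
```

For these points the system is $5\widehat{b} + 3\widehat{a} = 0$, $3\widehat{b} + 3\widehat{a} = 6$, giving slope $-3$ and intercept $5$.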

    Example: find a fitted line for the three points $(1,1) , (2,2) , (3,2)$.

        Let the line be $\widehat{y} = \widehat{b} x + \widehat{a}$.

        Requiring the line to pass through all three points gives $\left\{\begin{matrix} \widehat{b} + \widehat{a} = 1 \\ 2\widehat{b} + \widehat{a} = 2 \\ 3\widehat{b} + \widehat{a} = 2 \end{matrix}\right.$, that is, $\begin{matrix} \begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{bmatrix} & \begin{bmatrix} \widehat{b} \\ \widehat{a} \end{bmatrix} & = & \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \\ A & \widehat{\vec{x}} & & \vec{b} \end{matrix}$

        It is easy to check that $\vec{b}$ is not in the column space of $A$ (the three points are not collinear), so this system has no exact solution.

        From $\begin{matrix} \begin{bmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \end{bmatrix} & \begin{bmatrix} 1 & 1 & | & 1 \\ 2 & 1 & | & 2 \\ 3 & 1 & | & 2 \end{bmatrix} & = & \begin{bmatrix} 14 & 6 & | & 11 \\ 6 & 3 & | & 5 \end{bmatrix} \\ A^T & \begin{bmatrix} A \mid \vec{b} \end{bmatrix} & & \begin{bmatrix} A^T A \mid A^T \vec{b} \end{bmatrix} \end{matrix}$ we get: $\left\{\begin{matrix} 14 \widehat{b} + 6 \widehat{a} = 11 \\ 6 \widehat{b} + 3 \widehat{a} = 5 \end{matrix}\right.$, whose solution is: $\left\{\begin{matrix} \widehat{b} = \dfrac{1}{2} \\ \widehat{a} = \dfrac{2}{3} \end{matrix}\right.$

        So $\widehat{y} = \dfrac{1}{2} x + \dfrac{2}{3}$.
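The result of the example can be double-checked by computing the fitted values and residuals and confirming that the residual vector is orthogonal to both columns of $A$. The variable names in this sketch are assumptions. The numbers come from the example itself.

```python
from fractions import Fraction as F

xs, ys = [1, 2, 3], [1, 2, 2]
slope, intercept = F(1, 2), F(2, 3)

# Fitted values p_i = slope * x_i + intercept and residuals e_i = y_i - p_i.
p = [slope * x + intercept for x in xs]
e = [y - p_i for y, p_i in zip(ys, p)]

# Left-hand sides of the two normal equations (should be 11 and 5).
lhs1 = 14 * slope + 6 * intercept
lhs2 = 6 * slope + 3 * intercept

# Dot products of e with the columns (1, 2, 3) and (1, 1, 1) of A.
dot_x = sum(x * e_i for x, e_i in zip(xs, e))
dot_ones = sum(e)
```

This gives $\vec{p} = (7/6, 5/3, 13/6)$ and $\vec{e} = (-1/6, 1/3, -1/6)$, with both dot products equal to zero, as the projection picture requires.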

