Chapter 6 (Orthogonality and Least Squares): Least-Squares Problems

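These notes cover the least-squares problem from Linear Algebra and Its Applications: when a linear system has no solution, find the $\boldsymbol x$ that minimizes the sum of squared errors. They work through the solution of the general least-squares problem, expressed through the equation $A^TA\boldsymbol x=A^T\boldsymbol b$, and show that a least-squares solution can be found more quickly when the columns of $A$ are orthogonal. They also note that the QR factorization is the more reliable route, especially for ill-conditioned systems, where computational errors must be kept from dominating the answer.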

This post is a set of reading notes on "Linear Algebra and Its Applications".

Least-Squares problems

  • Inconsistent systems arise often in applications. When a solution is demanded and none exists, the best one can do is to find an $\boldsymbol x$ that makes $A\boldsymbol x$ as close as possible to $\boldsymbol b$. Think of $A\boldsymbol x$ as an approximation to $\boldsymbol b$. The smaller the distance between $\boldsymbol b$ and $A\boldsymbol x$, given by $\left\|\boldsymbol b - A\boldsymbol x\right\|$, the better the approximation.

  • The general least-squares problem is to find an $\boldsymbol x$ that makes $\left\|\boldsymbol b - A\boldsymbol x\right\|$ as small as possible. The quantity $\left\|\boldsymbol b - A\boldsymbol x\right\|$ is called the least-squares error of this approximation.

The adjective "least-squares" arises from the fact that $\left\|\boldsymbol b - A\boldsymbol x\right\|$ is the square root of a sum of squares.


  • Notice that $A\boldsymbol x$ will necessarily be in $\operatorname{Col} A$. So we seek an $\boldsymbol x$ that makes $A\boldsymbol x$ the closest point in $\operatorname{Col} A$ to $\boldsymbol b$. See Figure 1.
    [Figure 1]
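
The distance criterion above can be made concrete with a few lines of code. The sketch below is only an illustration: the helper name `residual_norm` and the small inconsistent system are assumptions chosen for this example, not data from these notes. It evaluates $\left\|\boldsymbol b - A\boldsymbol x\right\|$ for two candidate vectors so that they can be compared.

```python
import math

def residual_norm(A, b, x):
    """Return ||b - A x||, with A given as a list of rows."""
    Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    return math.sqrt(sum((b_i - ax_i) ** 2 for b_i, ax_i in zip(b, Ax)))

# Illustrative inconsistent system (assumed data): 3 equations, 2 unknowns.
A = [[4, 0], [0, 2], [1, 1]]
b = [2, 0, 11]

err_1 = residual_norm(A, b, [1, 2])  # candidate x = (1, 2)
err_2 = residual_norm(A, b, [0, 0])  # candidate x = (0, 0)

# The candidate with the smaller residual norm is the better approximation.
better = [1, 2] if err_1 < err_2 else [0, 0]
```

Here the residual for $(1, 2)$ is $(-2, -4, 8)$, with norm $\sqrt{84}$, while the residual for $(0, 0)$ is $\boldsymbol b$ itself, with norm $\sqrt{125}$, so the first candidate is the closer one.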

Solution of the General Least-Squares Problem

$A^TA\boldsymbol x=A^T\boldsymbol b$

  • Apply the Best Approximation Theorem in Section 6.3 to the subspace $\operatorname{Col} A$. Let
    $$\hat{\boldsymbol b}=\operatorname{proj}_{\operatorname{Col} A}\boldsymbol b$$
    Because $\hat{\boldsymbol b}$ is in $\operatorname{Col} A$, the equation $A\boldsymbol x=\hat{\boldsymbol b}$ is consistent, and there is an $\hat{\boldsymbol x}$ in $\mathbb R^n$ such that
    $$A\hat{\boldsymbol x}=\hat{\boldsymbol b}\qquad (1)$$
    Since $\hat{\boldsymbol b}$ is the closest point in $\operatorname{Col} A$ to $\boldsymbol b$, a vector $\hat{\boldsymbol x}$ is a least-squares solution of $A\boldsymbol x=\boldsymbol b$ if and only if $\hat{\boldsymbol x}$ satisfies (1). See Figure 2. [There are many solutions of (1) if the equation has free variables.]
    [Figure 2]
  • Suppose $\hat{\boldsymbol x}$ satisfies $A\hat{\boldsymbol x}=\hat{\boldsymbol b}$. By the Orthogonal Decomposition Theorem in Section 6.3, $\boldsymbol b-\hat{\boldsymbol b}$ is orthogonal to $\operatorname{Col} A$, so $\boldsymbol b-A\hat{\boldsymbol x}$ is orthogonal to each column of $A$. If $\boldsymbol a_j$ is any column of $A$, then $\boldsymbol a_j\cdot(\boldsymbol b-A\hat{\boldsymbol x})=0$, and $\boldsymbol a_j^T(\boldsymbol b-A\hat{\boldsymbol x})=0$. Since each $\boldsymbol a_j^T$ is a row of $A^T$,
    $$A^T(\boldsymbol b-A\hat{\boldsymbol x})=\boldsymbol 0\qquad (2)$$
    (This equation also follows from $(\operatorname{Col} A)^\perp=\operatorname{Nul} A^T$.) Thus
    $$\begin{aligned}A^T\boldsymbol b-A^TA\hat{\boldsymbol x}&=\boldsymbol 0\\ A^TA\hat{\boldsymbol x}&=A^T\boldsymbol b\end{aligned}$$
    These calculations show that each least-squares solution of $A\boldsymbol x=\boldsymbol b$ satisfies the equation
    $$A^TA\boldsymbol x=A^T\boldsymbol b\qquad (3)$$
    The matrix equation (3) represents a system of equations called the normal equations for $A\boldsymbol x=\boldsymbol b$. A solution of (3) is often denoted by $\hat{\boldsymbol x}$.
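
The derivation above suggests a direct recipe: form $A^TA$ and $A^T\boldsymbol b$, then solve the resulting square system. The following is a minimal sketch of that recipe in exact rational arithmetic. The helper names (`transpose`, `matmul`, `solve`) and the example data are assumptions made for illustration, and `solve` assumes $A^TA$ is invertible.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination (M square and invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(n):
        # Find a row with a nonzero entry in column c and move it into place.
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

# Illustrative inconsistent system (assumed data).
A = [[4, 0], [0, 2], [1, 1]]
b = [2, 0, 11]

At = transpose(A)
AtA = matmul(At, A)                                        # left side of the normal equations
Atb = [sum(a * v for a, v in zip(row, b)) for row in At]   # right side
x_hat = solve(AtA, Atb)

# Residual b - A x_hat, and its dot product with each column of A.
residual = [bi - sum(a * x for a, x in zip(row, x_hat)) for row, bi in zip(A, b)]
orthogonality_check = [sum(a * r for a, r in zip(col, residual)) for col in At]
```

For this data the normal equations are $17x_1+x_2=19$ and $x_1+5x_2=11$, with solution $(1, 2)$. The final check confirms that the residual is orthogonal to every column of $A$, which is exactly what equation (2) states.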



  • The next theorem gives useful criteria for determining when there is only one least-squares solution of $A\boldsymbol x=\boldsymbol b$. (Of course, the orthogonal projection $\hat{\boldsymbol b}$ is always unique.)

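THEOREM

  • Let $A$ be an $m\times n$ matrix. The following statements are logically equivalent:
    (a) The equation $A\boldsymbol x=\boldsymbol b$ has a unique least-squares solution for each $\boldsymbol b$ in $\mathbb R^m$.
    (b) The columns of $A$ are linearly independent.
    (c) The matrix $A^TA$ is invertible.
    When these statements are true, the least-squares solution $\hat{\boldsymbol x}$ is given by
    $$\hat{\boldsymbol x}=(A^TA)^{-1}A^T\boldsymbol b\qquad (4)$$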

Formula (4) for $\hat{\boldsymbol x}$ is useful mainly for theoretical purposes and for hand calculations when $A^TA$ is a $2\times 2$ invertible matrix.
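
As a sketch of such a hand calculation, take formula (4) to be the closed form $\hat{\boldsymbol x}=(A^TA)^{-1}A^T\boldsymbol b$, and assume the illustrative data below (chosen for this example, not taken from these notes):

$$A=\begin{bmatrix}4&0\\0&2\\1&1\end{bmatrix},\qquad \boldsymbol b=\begin{bmatrix}2\\0\\11\end{bmatrix}$$

$$A^TA=\begin{bmatrix}17&1\\1&5\end{bmatrix},\qquad A^T\boldsymbol b=\begin{bmatrix}19\\11\end{bmatrix},\qquad (A^TA)^{-1}=\frac{1}{84}\begin{bmatrix}5&-1\\-1&17\end{bmatrix}$$

$$\hat{\boldsymbol x}=\frac{1}{84}\begin{bmatrix}5&-1\\-1&17\end{bmatrix}\begin{bmatrix}19\\11\end{bmatrix}=\frac{1}{84}\begin{bmatrix}84\\168\end{bmatrix}=\begin{bmatrix}1\\2\end{bmatrix}$$

The $2\times 2$ inverse comes from the usual formula: swap the diagonal entries, negate the off-diagonal entries, and divide by the determinant $17\cdot 5-1\cdot 1=84$.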

PROOF

  • If $A\boldsymbol x=\boldsymbol 0$, then $A^TA\boldsymbol x=\boldsymbol 0$, so $\operatorname{Nul} A\subseteq\operatorname{Nul} A^TA$.
  • If $A^TA\boldsymbol x=\boldsymbol 0$, then $\boldsymbol x^TA^TA\boldsymbol x=(A\boldsymbol x)^T(A\boldsymbol x)=\left\|A\boldsymbol x\right\|^2=0$, so $A\boldsymbol x=\boldsymbol 0$ and $\operatorname{Nul} A^TA\subseteq\operatorname{Nul} A$. Hence $\operatorname{Nul} A=\operatorname{Nul} A^TA$. Both $A$ and $A^TA$ have $n$ columns, so
    $$\operatorname{rank} A^TA=n-\dim\operatorname{Nul} A^TA=n-\dim\operatorname{Nul} A=\operatorname{rank} A$$
    So when $A$ has $n$ linearly independent columns, $\operatorname{rank} A^TA=\operatorname{rank} A=n$. Because $A^TA$ is an $n\times n$ matrix of rank $n$, it is invertible.
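
The conclusion of the proof can be checked on small cases. The sketch below is restricted to two-column matrices, where invertibility of $A^TA$ is simply a nonzero $2\times 2$ determinant. The helper names `gram` and `det2` and both matrices are assumptions made for illustration.

```python
def gram(A):
    """Return A^T A for a matrix A given as a list of rows."""
    cols = list(zip(*A))
    return [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Columns (4, 0, 1) and (0, 2, 1) are linearly independent.
A_independent = [[4, 0], [0, 2], [1, 1]]
# The second column is twice the first, so the columns are dependent.
A_dependent = [[1, 2], [2, 4], [3, 6]]

det_independent = det2(gram(A_independent))  # nonzero: A^T A is invertible
det_dependent = det2(gram(A_dependent))      # zero: A^T A is singular
```

With independent columns the determinant is $17\cdot 5-1=84$. With dependent columns it is $14\cdot 56-28^2=0$, so the normal equations no longer have a unique solution.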

Finding a least-squares solution quickly when the columns of $A$ are orthogonal

  • The next example shows how to find a least-squares solution of $A\boldsymbol x=\boldsymbol b$ when the columns of $A$ are orthogonal. Such matrices often appear in linear regression problems.

EXAMPLE 4

  • Find a least-squares solution of $A\boldsymbol x=\boldsymbol b$ for
    [image: $A$ and $\boldsymbol b$]

SOLUTION

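  • Because the columns $\boldsymbol a_1$ and $\boldsymbol a_2$ of $A$ are orthogonal, the orthogonal projection of $\boldsymbol b$ onto $\operatorname{Col} A$ is given by
    $$\hat{\boldsymbol b}=\frac{\boldsymbol b\cdot\boldsymbol a_1}{\boldsymbol a_1\cdot\boldsymbol a_1}\boldsymbol a_1+\frac{\boldsymbol b\cdot\boldsymbol a_2}{\boldsymbol a_2\cdot\boldsymbol a_2}\boldsymbol a_2\qquad (5)$$
    Now that $\hat{\boldsymbol b}$ is known, we can solve $A\hat{\boldsymbol x}=\hat{\boldsymbol b}$. But this is trivial, since we already know what weights to place on the columns of $A$ to produce $\hat{\boldsymbol b}$. It is clear from (5) that
    $$\hat{\boldsymbol x}=\begin{bmatrix}\dfrac{\boldsymbol b\cdot\boldsymbol a_1}{\boldsymbol a_1\cdot\boldsymbol a_1}\\[2mm] \dfrac{\boldsymbol b\cdot\boldsymbol a_2}{\boldsymbol a_2\cdot\boldsymbol a_2}\end{bmatrix}$$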
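
The shortcut for orthogonal columns can be sketched as follows. The example's own matrix survives only as an image, so the columns `a1`, `a2` and the vector `b` below are assumed data, chosen so that the two columns are orthogonal. Each weight is the projection coefficient of $\boldsymbol b$ on one column, and no linear system has to be solved.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Assumed data: two orthogonal columns of A and a right-hand side b.
a1 = [1, 1, 1, 1]
a2 = [-6, -2, 1, 7]
b = [-1, 2, 1, 6]

assert dot(a1, a2) == 0  # the shortcut is valid only for orthogonal columns

# Weights (b . a_j) / (a_j . a_j); these are the entries of x_hat.
x_hat = [Fraction(dot(b, a), dot(a, a)) for a in (a1, a2)]

# Projection of b onto Col A, built from the same weights.
b_hat = [x_hat[0] * u + x_hat[1] * v for u, v in zip(a1, a2)]
```

For this data the weights are $8/4=2$ and $45/90=1/2$, so $\hat{\boldsymbol x}=(2,\,1/2)$ and $\hat{\boldsymbol b}=(-1,\,1,\,5/2,\,11/2)$.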

  • In some cases, the normal equations for a least-squares problem can be ill-conditioned; that is, small errors in the calculations of the entries of $A^TA$ can sometimes cause relatively large errors in the solution $\hat{\boldsymbol x}$.
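
One way to see the danger is to form $A^TA$ in floating point for a matrix whose columns are nearly parallel. The sketch below uses an assumed test matrix with a small parameter `eps`, and the helper `gram_det` is hypothetical. In exact arithmetic the columns are independent and $A^TA$ is invertible, but the rounding that occurs while computing the entries of $A^TA$ makes the computed matrix singular.

```python
from fractions import Fraction

def gram_det(A):
    """Determinant of the 2x2 matrix A^T A for a two-column A (list of rows)."""
    c1, c2 = zip(*A)
    g11 = sum(u * u for u in c1)
    g12 = sum(u * v for u, v in zip(c1, c2))
    g22 = sum(v * v for v in c2)
    return g11 * g22 - g12 * g12

eps = 1e-8
A_float = [[1.0, 1.0], [eps, 0.0], [0.0, eps]]
# The same entries, converted to exact rationals.
A_exact = [[Fraction(v) for v in row] for row in A_float]

det_float = gram_det(A_float)  # 1 + eps**2 rounds to 1.0 in double precision
det_exact = gram_det(A_exact)  # exactly (1 + eps**2)**2 - 1, a tiny positive number
```

The floating-point determinant is exactly zero because $1+\varepsilon^2$ is not representable near $1$ when $\varepsilon^2$ is about $10^{-16}$, while the exact determinant is positive. A solver working from the computed $A^TA$ has therefore already lost the information that distinguishes the two columns.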

Using the QR factorization

  • If the columns of $A$ are linearly independent, the least-squares solution can often be computed more reliably through a $QR$ factorization of $A$ (described in Section 6.4).
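
To illustrate the idea: if $A=QR$, where $Q$ has orthonormal columns and $R$ is an invertible upper triangular matrix, then $A^TA=R^TQ^TQR=R^TR$, so the normal equations reduce to $R\hat{\boldsymbol x}=Q^T\boldsymbol b$, which can be solved by back-substitution without forming $A^TA$. The sketch below builds $Q$ and $R$ with classical Gram-Schmidt purely for brevity. That construction, the helper names, and the example data are assumptions of this illustration, and numerical libraries typically use Householder reflections instead.

```python
import math

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR of A (list of rows, independent columns).
    Returns Q as a list of orthonormal columns and R as a list of rows."""
    cols = [list(c) for c in zip(*A)]
    n = len(cols)
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        v = a[:]
        for i, q in enumerate(Q):
            R[i][j] = sum(qi * ai for qi, ai in zip(q, a))
            v = [vi - R[i][j] * qi for vi, qi in zip(v, q)]
        R[j][j] = math.sqrt(sum(vi * vi for vi in v))
        Q.append([vi / R[j][j] for vi in v])
    return Q, R

def back_substitute(R, y):
    """Solve R x = y for upper triangular R with a nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / R[i][i]
    return x

# Illustrative inconsistent system (assumed data).
A = [[4, 0], [0, 2], [1, 1]]
b = [2, 0, 11]

Q, R = gram_schmidt_qr(A)
Qtb = [sum(qi * bi for qi, bi in zip(q, b)) for q in Q]  # Q^T b
x_hat = back_substitute(R, Qtb)                          # solves R x = Q^T b
```

For this data the result agrees, up to rounding, with the solution $(1, 2)$ of the normal equations $17x_1+x_2=19$, $x_1+5x_2=11$.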

