Chapter 3: The Least Squares Problem

Abstract: These notes summarize the key concepts about least squares from Chapter 3 of the textbook.

3.1 The Discrete Least Squares Problem

Problem statement: A task that occurs frequently in scientific investigations is that of finding a straight line that "fits" some set of data points.

$Ax = b$, where $A \in \mathbb{R}^{n \times m}$, $x \in \mathbb{R}^m$, $b \in \mathbb{R}^n$, $n > m$ (overdetermined system).

We want to minimize the residual:
$\|r\|_2 = \|b - Ax\|_2$
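
As a quick illustration (my own minimal example, not from the textbook), the following NumPy sketch fits a straight line to made-up data points by minimizing the 2-norm of the residual:

```python
# Minimal sketch: fit y ~ c0 + c1*t by minimizing ||b - Ax||_2.
# The data below are made up for illustration.
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])      # measurement points
b = np.array([1.1, 1.9, 3.2, 3.8, 5.1])      # noisy observations y_i

A = np.column_stack([np.ones_like(t), t])    # n x m design matrix, n > m
x, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)

print("intercept, slope:", x)
print("residual norm   :", np.linalg.norm(b - A @ x))
```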

3.1.1 The Statistical Meaning of Least Squares with the 2-Norm

The choice of the 2-norm can be justified on statistical grounds. Suppose the data fail to lie on a straight line because of errors in the measured $y_i$. If the errors are independent and normally distributed with mean zero and variance $\sigma^2$, then the solution of the least squares problem is the maximum likelihood estimator of the true solution. That is, the least squares solution is the maximum likelihood estimate.
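
In more detail: with independent errors $\epsilon_i \sim N(0, \sigma^2)$, the likelihood of the data factorizes, and maximizing its logarithm is equivalent to minimizing the sum of squared residuals,

$$L(x) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(b_i - (Ax)_i)^2}{2\sigma^2}\right) \quad\Longrightarrow\quad \arg\max_x \log L(x) = \arg\min_x \sum_{i=1}^{n}\big(b_i - (Ax)_i\big)^2 = \arg\min_x \|b - Ax\|_2^2.$$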

3.2 ORTHOGONAL MATRICES, ROTATORS, AND REFLECTORS

3.2.1 Orthogonal Matrices

Def: A matrix $Q \in \mathbb{R}^{n \times n}$ is said to be orthogonal if $QQ^T = I$. This equation says that $Q$ has an inverse, and $Q^{-1} = Q^T$.

  1. Orthogonal transformations preserve lengths and angles (a small numerical check follows this list).

    • (a) $\langle Qx, Qy \rangle = \langle x, y \rangle$
    • (b) $\|Qx\|_2 = \|x\|_2$
  2. There are two kinds of orthogonal transformations: rotators and reflectors.

    • A rotator performs a rotation.
    • A reflector performs a mirror reflection about a hyperplane.
    • Both are used to align a given column vector with a coordinate axis.
    • All matrix computations built upon rotators and reflectors are normwise backward stable.
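
A small numerical check of properties (a) and (b) (my own illustration, not from the textbook), using a 2×2 rotator:

```python
# Verify that an orthogonal (rotation) matrix preserves inner products and norms.
import numpy as np

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])                # a rotator

x = np.array([3.0, -1.0])
y = np.array([0.5,  2.0])

print(np.allclose(Q @ Q.T, np.eye(2)))                          # Q Q^T = I
print(np.isclose((Q @ x) @ (Q @ y), x @ y))                     # <Qx, Qy> = <x, y>
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))     # ||Qx||_2 = ||x||_2
```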

3.2.2 Rotators

Theorem 3.2.20
Let $A \in \mathbb{R}^{n \times n}$. Then there exist an orthogonal matrix $Q$ and an upper triangular matrix $R$ such that $A = QR$. That is, every square matrix has a QR decomposition.
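
The basic building block behind this theorem is a rotator that aligns a 2-vector with the first coordinate axis. A minimal sketch (the helper name `rotator` is mine, not the textbook's):

```python
# Construct a 2x2 rotator G such that G @ [a, b] = [r, 0].
import numpy as np

def rotator(a, b):
    r = np.hypot(a, b)                  # sqrt(a^2 + b^2), computed without overflow
    c, s = (1.0, 0.0) if r == 0 else (a / r, b / r)
    return np.array([[ c, s],
                     [-s, c]])

v = np.array([3.0, 4.0])
print(rotator(*v) @ v)                  # -> [5., 0.]
```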

3.2.3 Reflectors

  1. Theorem 3.2.23

    Let $u \in \mathbb{R}^n$ with $\|u\|_2 = 1$, and define $P \in \mathbb{R}^{n \times n}$ by $P = uu^T$. Then

    • (a) $Pu = u$
    • (b) $Pv = 0$ if $\langle u, v \rangle = 0$
    • (c) $P^2 = P$
    • (d) $P^T = P$
    • (e) $P = uu^T$ has rank 1, since its range consists of multiples of $u$
  2. Theorem 3.2.26

    Let $u \in \mathbb{R}^n$ with $\|u\|_2 = 1$, and define $Q \in \mathbb{R}^{n \times n}$ by $Q = I - 2uu^T$. Then

    • (a) $Qu = -u$
    • (b) $Qv = v$ if $\langle u, v \rangle = 0$
    • (c) $Q = Q^T$ ($Q$ is symmetric)
    • (d) $Q^T = Q^{-1}$ ($Q$ is orthogonal)
    • (e) $Q^{-1} = Q$ ($Q$ is an involution)

      Matrices $Q = I - 2uu^T$ ($\|u\|_2 = 1$) are called reflectors or Householder transformations.

  3. How to avoid overflow and underflow

    Since squaring doubles the exponents, an overflow can occur if some of the entries are very large. Likewise, underflow can occur if some of the entries are very small. Obviously we must avoid overflows; underflows can also occasionally be dangerous.

    Therefore the data need to be scaled; a common approach is to divide all entries by the largest absolute value. (Page 199)
    
  4. The cost of a matrix product depends on the order of operations

    The amount of work required to compute $uv^TB$ depends dramatically upon the order in which the operations are performed. Suppose that $u \in \mathbb{R}^n$, $v \in \mathbb{R}^n$, and $B \in \mathbb{R}^{n \times m}$.

    • (a) Computing $(uv^T)B$ costs about $2n^2m$ flops.
    • (b) Computing $u(v^TB)$ costs about $3nm$ flops.
    • Therefore $Q$ should be stored in the form $Q = I - \gamma uu^T$ and applied to $B$ as in (b); a short sketch combining items 1-4 is given after this list.
  5. Uniqueness of the QR Decomposition

    Theorem 3.2.46 Let $A \in \mathbb{R}^{n \times n}$ be nonsingular. There exist unique $Q, R \in \mathbb{R}^{n \times n}$ such that $Q$ is orthogonal, $R$ is upper triangular with positive main-diagonal entries, and $A = QR$. That is, when $A$ is nonsingular, the QR decomposition is unique (provided the main-diagonal entries of $R$ are required to be positive).
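
A sketch combining items 1-4 above (the helper names and exact formulas are standard choices of my own, not copied from the textbook): build a reflector $Q = I - \gamma uu^T$ that maps $x$ to a multiple of $e_1$, scaling $x$ first to avoid overflow/underflow, and apply it to a matrix $B$ in the cheap order $u(u^TB)$:

```python
# Householder reflector Q = I - gamma*u*u^T with Q @ x = -sign(x_1)*||x||_2 * e_1.
import numpy as np

def householder(x):
    beta = np.max(np.abs(x))
    if beta == 0.0:                      # x is already zero; take Q = I
        return np.zeros_like(x), 0.0
    y = x / beta                         # scale to avoid overflow/underflow
    tau = np.copysign(np.linalg.norm(y), y[0])
    u = y.copy()
    u[0] += tau                          # u = y + sign(y_1)*||y||_2 * e_1
    gamma = 1.0 / (tau * u[0])           # equals 2 / (u^T u)
    return u, gamma

def apply_reflector(u, gamma, B):
    """Compute Q @ B as B - u*(gamma*(u^T B)): O(nm) flops instead of O(n^2 m)."""
    return B - np.outer(u, gamma * (u.T @ B))

x = np.array([3.0, 1.0, 2.0])
u, gamma = householder(x)
print(apply_reflector(u, gamma, x.reshape(-1, 1)).ravel())   # ~[-3.742, 0, 0]
```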

3.3 Solution of the Least Squares Problem

Theorem 3.3.12
Let $A \in \mathbb{R}^{n \times m}$ and $b \in \mathbb{R}^n$, $n > m$. Then the least squares problem for the overdetermined system $Ax = b$ always has a solution. If $\operatorname{rank}(A) < m$, there are infinitely many solutions.
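
When $\operatorname{rank}(A) = m$, the solution can be computed from the QR decomposition by solving $Rx = Q^Tb$. A minimal NumPy sketch (my own example data):

```python
# Solve an overdetermined system in the least squares sense via A = QR.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))            # n > m; full column rank with probability 1
b = rng.standard_normal(6)

Q, R = np.linalg.qr(A, mode='reduced')     # Q: 6x3 orthonormal columns, R: 3x3 upper triangular
x = np.linalg.solve(R, Q.T @ b)            # solve the triangular system R x = Q^T b

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # True
```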

3.4 THE GRAM-SCHMIDT PROCESS

3.4.1 Theorem 3.4.2

Let $Q \in \mathbb{R}^{n \times n}$. Then $Q$ is an orthogonal matrix if and only if its columns (rows) form an orthonormal set.

3.4.2 The Gram-Schmidt orthogonalization is the same as the QR decomposition.
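
A short sketch of the (modified) Gram-Schmidt process producing $Q$ and $R$ with $A = QR$; the function name and structure are my own, not the textbook's:

```python
# Modified Gram-Schmidt: A (n x m, full column rank) -> Q (n x m), R (m x m).
import numpy as np

def mgs(A):
    A = A.astype(float).copy()
    n, m = A.shape
    Q = np.zeros((n, m))
    R = np.zeros((m, m))
    for k in range(m):
        R[k, k] = np.linalg.norm(A[:, k])
        Q[:, k] = A[:, k] / R[k, k]            # normalize the k-th column
        for j in range(k + 1, m):
            R[k, j] = Q[:, k] @ A[:, j]        # project out the q_k component
            A[:, j] -= R[k, j] * Q[:, k]
    return Q, R

A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Q, R = mgs(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(2)))  # True True
```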

3.5 GEOMETRIC APPROACH TO THE LEAST SQUARES PROBLEM

3.5.1 Definitions

  1. orthogonal complement

    The orthogonal complement of $S$, denoted $S^\perp$, is defined to be the set of vectors in $\mathbb{R}^n$ that are orthogonal to $S$. That is,

    $S^\perp = \{x \in \mathbb{R}^n \mid \langle x, y \rangle = 0 \text{ for all } y \in S\}$

  2. Null space (kernel)
    $\mathcal{N}(A) = \{x \in \mathbb{R}^m \mid Ax = 0\}$.

    That is, the set of vectors that $A$ maps to zero.
  3. Range
    $\mathcal{R}(A) = \{Ax \mid x \in \mathbb{R}^m\}$.

3.5.2 Theorem 3.5.3

Let $S$ be any subspace of $\mathbb{R}^n$. Then for every $x \in \mathbb{R}^n$, there exist unique elements $s \in S$ and $s^\perp \in S^\perp$ for which $x = s + s^\perp$.

3.5.3 Normal Equation

Corollary 3.5.20:
Let $x \in \mathbb{R}^m$. Then

$\|b - Ax\|_2 = \min_{w \in \mathbb{R}^m} \|b - Aw\|_2$
if and only if
$b - Ax \in \mathcal{R}(A)^\perp$

That is, $Ax$ should be the orthogonal projection of $b$ onto $\mathcal{R}(A)$, so that $b - Ax$ is perpendicular to $\mathcal{R}(A)$.

Let $x \in \mathbb{R}^m$. Then $x$ solves the least squares problem for the system $Ax = b$ if and only if

$A^TAx = A^Tb$

The normal equations follow easily from Corollary 3.5.20 together with $\mathcal{R}(A)^\perp = \mathcal{N}(A^T)$.
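
For reference, a minimal sketch of solving the least squares problem via the normal equations (illustrative only; see the conditioning caveat below):

```python
# Solve A^T A x = A^T b; adequate when A is well conditioned.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

x = np.linalg.solve(A.T @ A, A.T @ b)      # in practice, use a Cholesky factorization of A^T A
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # True
```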

3.5.4 The coefficient matrix of the normal equations is positive semidefinite. If $\operatorname{rank}(A) = m$, then $A^TA$ is positive definite.

PROOF: This is straightforward:

$x^T(A^TA)x = x^TA^TAx = (Ax)^T(Ax)$

Write $u = Ax$; then the expression above becomes $u^Tu$. Clearly $u^Tu \geq 0$, with equality only when $u = 0$. Moreover, if $A$ has full rank, then $Ax = 0$ only when $x = 0$, so equality holds only at $x = 0$.

The reflectors and rotators used in the QR decomposition are backward stable, whereas the floating-point errors introduced when forming $A^TA$ for the normal equations can cause the computed $A^TA$ to fail to be positive definite, unless the condition number of $A$ is small.
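
A small experiment illustrating this caveat (my own example, a Läuchli-type matrix): since $\kappa(A^TA) = \kappa(A)^2$, forming $A^TA$ squares the condition number, and in floating point the computed $A^TA$ may not even be positive definite:

```python
# cond(A^T A) = cond(A)^2; here A^T A even becomes exactly singular in double precision.
import numpy as np

eps = 1e-9
A = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])

print(np.linalg.cond(A))          # ~1.4e9
print(np.linalg.cond(A.T @ A))    # effectively infinite: 1 + eps^2 rounds to 1, so A^T A is singular

try:
    np.linalg.cholesky(A.T @ A)   # may fail: the computed A^T A is not positive definite
except np.linalg.LinAlgError as err:
    print("Cholesky failed:", err)
```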

3.6 Other Topics

  1. The Continuous Least Squares Problem
  2. Updating the QR Decomposition
    That is, when a row and a column are appended to $A$, how to compute the new QR decomposition from the one already available.
  3. Methods for solving the least squares problem:
    • QR decomposition
    • Normal equations
    • LM
    • Gradient descent, as used in machine learning