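Solving Ax=0: Pivot Variables, Special Solutions - Linear Algebra Lecture 7 (MIT Linear Algebra, Gilbert Strang)

        This is Professor Strang's seventh lecture. It marks a turning point, moving from definitions to algorithms. The main topic is computing the nullspace of a matrix: one example is used to show how elimination solves Ax=0, and along the way several new concepts are introduced: special solutions, pivot variables, free variables, pivot columns, free columns, the echelon matrix U, and the reduced row echelon form. The lecture also covers the rank of a matrix.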

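Special Solutions

        Lecture 6 (Column Space and Nullspace - Linear Algebra Lecture 6 (MIT Linear Algebra, Gilbert Strang)) introduced the nullspace. It did not show how to compute the nullspace of A, but it did state the special solution for the matrix A in its example: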

        A=\begin{bmatrix} 1 & 1 & 2\\ 2& 1 & 3\\ 3 &1 & 4\\ 4 & 1 &5 \end{bmatrix},\quad N(A):c\begin{bmatrix} 1\\ 1\\ -1 \end{bmatrix}

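        and noted that N(A) is a line through the origin. The vector (1,1,-1) in the expression above is the special solution: it is the vector from the origin to a point on the solution line. Any point on the line other than the origin gives a vector that spans the same line, so it could serve in this role equally well.

        With the notion of a special solution, the nullspace of A can be described as follows (quoting the book):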

        The nullspace consists of all combinations of the special solutions.
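        The claim that every multiple of (1,1,-1) lies in N(A) can be checked numerically. The following is a minimal sketch in plain Python; the helper name `matvec` is an illustrative choice, not something from the lecture.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Matrix A from the Lecture 6 example
A = [[1, 1, 2],
     [2, 1, 3],
     [3, 1, 4],
     [4, 1, 5]]

s = [1, 1, -1]  # special solution quoted in the text

# Every multiple c*s should be sent to the zero vector by A
for c in (1, -3, 0.5):
    x = [c * si for si in s]
    print(c, matvec(A, x))
```

        Each printed product is the zero vector, which is what membership in the nullspace means.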

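Solving Ax=0 by Elimination

        The elimination algorithm for solving Ax=0 is illustrated with an example. First, note that elimination does not change the solutions of the system, so N(A) is unchanged, and the N(A) obtained by solving Ax=0 through elimination is correct. For example: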

        A=\begin{bmatrix} 1 & 2& 2 &2 \\ 2& 4 & 6 &8 \\ 3& 6 & 8 & 10 \end{bmatrix}\xrightarrow[row2-2row1]{row3-3row1}\begin{bmatrix} 1 &2 &2 &2 \\ 0& 0 &2 &4 \\ 0& 0 &2 & 4 \end{bmatrix}\overset{row3-row2}{\rightarrow}\begin{bmatrix} (1) &2 & 2 & 2\\ 0& 0 & (2) &4 \\ 0& 0 &0 &0 \end{bmatrix}

            =U

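        The result of this elimination differs a little from the square-matrix elimination seen earlier: it has a staircase shape and is called a row echelon matrix (Echelon Matrices). The pivots 1 and 2 end up in columns 1 and 3. Columns that contain a pivot are called pivot columns; columns without a pivot after elimination are called free columns, here columns 2 and 4. Of the four components of a solution x=(x_1,x_2,x_3,x_4) of Ax=0, those matching the pivot columns, x_1 and x_3, are called pivot variables, and those matching the free columns, x_2 and x_4, are called free variables.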
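        The elimination steps and the split into pivot and free columns can be sketched in code. This is a minimal illustration, assuming exact arithmetic with `fractions.Fraction`; the function name `echelon` is hypothetical, and column indices are 0-based, so columns 1 and 3 of the text appear as 0 and 2.

```python
from fractions import Fraction

def echelon(A):
    """Forward elimination (with row exchanges if needed).

    Returns the echelon matrix U and the list of pivot column indices.
    """
    U = [[Fraction(v) for v in row] for row in A]
    m, n = len(U), len(U[0])
    pivot_cols = []
    row = 0
    for col in range(n):
        if row == m:
            break
        # Look for a nonzero entry in this column, at or below the current row
        p = next((i for i in range(row, m) if U[i][col] != 0), None)
        if p is None:
            continue  # no pivot here: this is a free column
        U[row], U[p] = U[p], U[row]
        # Clear the entries below the pivot
        for i in range(row + 1, m):
            factor = U[i][col] / U[row][col]
            U[i] = [a - factor * b for a, b in zip(U[i], U[row])]
        pivot_cols.append(col)
        row += 1
    return U, pivot_cols

A = [[1, 2, 2, 2],
     [2, 4, 6, 8],
     [3, 6, 8, 10]]

U, pivot_cols = echelon(A)
free_cols = [j for j in range(len(A[0])) if j not in pivot_cols]

print([[int(v) for v in row] for row in U])
print("pivot columns:", pivot_cols, "free columns:", free_cols)
```

        For this A the sketch reproduces the matrix U from the text and reports pivot columns 0 and 2 and free columns 1 and 3.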

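        After elimination we solve Ux=0 by back substitution. There are only 2 nonzero equations in 4 unknowns, so the counts do not match. How does back substitution work here? The key is that the free variables may take any values; substituting chosen values into the equations and back-substituting gives the pivot variables, and hence a special solution. How many special solutions are there? As many as there are free variables: each one sets a single free variable to 1 and the others to 0. Back substitution:

        1. Set the free variables to x_2=1, x_4=0; back substitution gives one special solution of Ux=0: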

                                                  \begin{bmatrix} -2\\ 1\\ 0\\ 0 \end{bmatrix}

        2. Set the free variables to x_2=0, x_4=1; back substitution gives the other special solution of Ux=0:

                                                  \begin{bmatrix} 2\\ 0\\ -2\\ 1 \end{bmatrix}

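        With the special solutions in hand, the full solution space of x (that is, N(A)) is the set of all linear combinations of the special solutions: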

                                   c_1\begin{bmatrix} -2\\ 1\\ 0\\ 0 \end{bmatrix}+c_2\begin{bmatrix} 2\\ 0\\ -2\\ 1 \end{bmatrix}
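        The back-substitution procedure above can be sketched as follows. This is an illustrative sketch only: the function name `special_solutions` is hypothetical, U and its pivot columns are entered by hand from the example, and indices are 0-based.

```python
from fractions import Fraction

def special_solutions(U, pivot_cols):
    """One special solution per free column of the echelon matrix U."""
    n = len(U[0])
    free_cols = [j for j in range(n) if j not in pivot_cols]
    solutions = []
    for f in free_cols:
        # Set this free variable to 1 and all other free variables to 0
        x = [Fraction(0)] * n
        x[f] = Fraction(1)
        # Back substitution: solve the pivot rows from the bottom up
        for r in reversed(range(len(pivot_cols))):
            p = pivot_cols[r]
            s = sum((U[r][j] * x[j] for j in range(p + 1, n)), Fraction(0))
            x[p] = -s / U[r][p]
        solutions.append(x)
    return solutions

U = [[1, 2, 2, 2],
     [0, 0, 2, 4],
     [0, 0, 0, 0]]
pivot_cols = [0, 2]  # columns 1 and 3 of the text, 0-based

sols = special_solutions(U, pivot_cols)
for x in sols:
    print([int(v) for v in x])

# Any combination c1*s1 + c2*s2 should also solve the original system Ax=0
A = [[1, 2, 2, 2], [2, 4, 6, 8], [3, 6, 8, 10]]
c1, c2 = 3, -5
x = [c1 * a + c2 * b for a, b in zip(sols[0], sols[1])]
residual = [sum(a * b for a, b in zip(row, x)) for row in A]
print(residual)
```

        The two vectors printed match the special solutions (-2,1,0,0) and (2,0,-2,1) from the text, and the residual of the sample combination is zero.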

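The Reduced Row Echelon Matrix R

        The reduced row echelon matrix R is very useful: all special solutions of Ax=0 can be read off from it directly. R is obtained from U by eliminating upward to clear the entries above the pivots as well, so that every pivot has zeros above and below it, and then dividing each pivot row by its pivot so that every pivot equals 1. Continuing the example: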

        U=\begin{bmatrix} (1)& 2 & 2 & 2\\ 0& 0 &(2) & 4\\ 0 & 0 &0 &0 \end{bmatrix}\overset{row1-row2}{\rightarrow}\begin{bmatrix} (1)& 2 & 0 & -2\\ 0& 0 &(2) & 4\\ 0 & 0 &0 &0 \end{bmatrix}\overset{row2/2}{\rightarrow}\begin{bmatrix} (1)& 2 & 0 & -2\\ 0& 0 &(1) & 2\\ 0 & 0 &0 &0 \end{bmatrix}

            =R

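        This is R. Note that Ax=0, Ux=0 and Rx=0 all have the same solutions. If we assume that the pivot columns of A all come before the free columns, R has the following general form:

        R=\begin{bmatrix} I &F \\ 0 &0 \end{bmatrix}. Suppose R has r pivots. Then R has r pivot columns, r pivot rows, n-r free columns and m-r zero rows. From R we can write down the matrix N whose columns are the special solutions of Ax=0; N is called the nullspace matrix (Nullspace matrix):

        N=\begin{bmatrix} -F\\ I \end{bmatrix}, where F is r\times (n-r) and I is (n-r)\times(n-r), so N is an n\times(n-r) matrix. The column space of N is the nullspace N(A) of A.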

        To verify this expression, it is enough to check that RN=0:

        RN=\begin{bmatrix} I &F \\ 0&0 \end{bmatrix}\begin{bmatrix} -F\\ I \end{bmatrix}=\begin{bmatrix} -IF+FI\\ 0 \end{bmatrix}=0

        We can also derive it directly to see where the expression comes from. Suppose:

        N=\begin{bmatrix} X_{pivots}\\ X_{frees} \end{bmatrix}, then

        RN=\begin{bmatrix} I &F \\ 0& 0 \end{bmatrix}\begin{bmatrix} X_{pivots}\\ X_{frees} \end{bmatrix}=\begin{bmatrix} X_{pivots}+FX_{frees}\\ 0 \end{bmatrix}=0

        This holds exactly when X_{pivots}+FX_{frees}=0, so X_{pivots}=-FX_{frees}. Since the free variables can take any values, choose X_{frees}=I; then X_{pivots}=-F, and:

        N=\begin{bmatrix} -F\\ I \end{bmatrix}, which recovers the expression for the nullspace matrix.
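        In the running example the pivot columns (1 and 3) do not all come before the free columns (2 and 4), so the blocks -F and I are interleaved rather than stacked. The sketch below shows one way to handle that: the rows of N that belong to pivot variables receive the entries of -F, and the rows that belong to free variables receive the entries of I. The function names `rref` and `nullspace_matrix` are hypothetical, indices are 0-based, and exact arithmetic via `fractions.Fraction` is an assumption of this sketch.

```python
from fractions import Fraction

def rref(A):
    """Reduced row echelon form R of A, plus the pivot column indices."""
    R = [[Fraction(v) for v in row] for row in A]
    m, n = len(R), len(R[0])
    pivot_cols = []
    row = 0
    for col in range(n):
        if row == m:
            break
        p = next((i for i in range(row, m) if R[i][col] != 0), None)
        if p is None:
            continue  # free column
        R[row], R[p] = R[p], R[row]
        pivot = R[row][col]
        R[row] = [v / pivot for v in R[row]]  # scale the pivot to 1
        # Clear the column above and below the pivot
        for i in range(m):
            if i != row and R[i][col] != 0:
                factor = R[i][col]
                R[i] = [a - factor * b for a, b in zip(R[i], R[row])]
        pivot_cols.append(col)
        row += 1
    return R, pivot_cols

def nullspace_matrix(R, pivot_cols):
    """Matrix N whose columns are the special solutions of Rx=0."""
    n = len(R[0])
    free_cols = [j for j in range(n) if j not in pivot_cols]
    N = [[Fraction(0)] * len(free_cols) for _ in range(n)]
    for k, f in enumerate(free_cols):
        N[f][k] = Fraction(1)        # identity block, in the free rows
        for r, p in enumerate(pivot_cols):
            N[p][k] = -R[r][f]       # -F block, in the pivot rows
    return N

A = [[1, 2, 2, 2],
     [2, 4, 6, 8],
     [3, 6, 8, 10]]

R, pivot_cols = rref(A)
N = nullspace_matrix(R, pivot_cols)

# Check A N = 0 column by column
AN = [[sum(A[i][k] * N[k][j] for k in range(len(N)))
       for j in range(len(N[0]))] for i in range(len(A))]

print([[int(v) for v in row] for row in R])
print([[int(v) for v in row] for row in N])
print(AN)
```

        The columns of N printed here are the same two special solutions found by back substitution, and AN is the zero matrix.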

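The Rank of a Matrix

        The rank of a matrix is a very important concept; it gives the true size of the matrix A. Definition:

        The rank of A is the number of pivots. This number is r.

        The rank is defined as the number of pivots of A. For example, rank 1 means the matrix has only one pivot.

        A rank of r means that the m×n matrix A has exactly r linearly independent column vectors and r linearly independent row vectors. By the definition of rank, if A has rank r, then solving Ax=0 leaves n-r free variables to choose, and the number of special solutions is n-r.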
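        These counts can be checked on the two matrices used in this post. The sketch below counts pivots by forward elimination; the function name `rank` is an illustrative choice, and comparing A with its transpose is just one simple way to see that the number of independent rows matches the number of independent columns.

```python
from fractions import Fraction

def rank(A):
    """Count the pivots found by forward elimination."""
    M = [[Fraction(v) for v in row] for row in A]
    m, n = len(M), len(M[0])
    r = 0
    for col in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if M[i][col] != 0), None)
        if p is None:
            continue  # free column, no pivot
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, m):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# The 3x4 example from this lecture
A = [[1, 2, 2, 2],
     [2, 4, 6, 8],
     [3, 6, 8, 10]]
At = [list(col) for col in zip(*A)]  # transpose of A

# The 4x3 example from Lecture 6
B = [[1, 1, 2],
     [2, 1, 3],
     [3, 1, 4],
     [4, 1, 5]]

r_A, r_At, r_B = rank(A), rank(At), rank(B)
print("rank(A) =", r_A, " rank(A^T) =", r_At)
print("special solutions for A:", len(A[0]) - r_A)
print("special solutions for B:", len(B[0]) - r_B)
```

        For A this gives r=2 and n-r=2 special solutions; for the Lecture 6 matrix it gives r=2 and n-r=1, consistent with its nullspace being a single line.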

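        This lecture corresponds to the second half of Section 3.2 and to Section 3.3 of INTRODUCTION TO LINEAR ALGEBRA.

Next lecture: Solving Ax=b: Solvability and the Structure of Solutions - Linear Algebra Lecture 8 (MIT Linear Algebra, Gilbert Strang)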
