Chapter 1 (Linear Equations in Linear Algebra): Introduction to Linear Transformations

These are reading notes for *Linear Algebra and Its Applications*.

Transformations

  • The difference between a matrix equation $A\boldsymbol x=\boldsymbol b$ and the associated vector equation $x_1\boldsymbol a_1+\dots+x_n\boldsymbol a_n=\boldsymbol b$ is merely a matter of notation.
  • However, a matrix equation $A\boldsymbol x=\boldsymbol b$ can arise in a way that is not directly connected with linear combinations of vectors. This happens when we think of the matrix $A$ as an object that "acts" on a vector $\boldsymbol x$ by multiplication to produce a new vector called $A\boldsymbol x$.
    • For instance, the equations
      $$\begin{bmatrix}4&-3&1&3\\2&0&5&1\end{bmatrix}\begin{bmatrix}1\\1\\1\\1\end{bmatrix}=\begin{bmatrix}5\\8\end{bmatrix}\qquad\text{and}\qquad\begin{bmatrix}4&-3&1&3\\2&0&5&1\end{bmatrix}\begin{bmatrix}1\\4\\-1\\3\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}$$
      say that multiplication by $A$ transforms $\boldsymbol x$ into $\boldsymbol b$ and transforms $\boldsymbol u$ into the zero vector. See Figure 1.
      [Figure 1: Multiplication by $A$ transforms vectors in $\mathbb R^4$ into vectors in $\mathbb R^2$.]
  • From this new point of view, solving the equation $A\boldsymbol x=\boldsymbol b$ amounts to finding all vectors $\boldsymbol x$ in $\mathbb R^4$ that are transformed into the vector $\boldsymbol b$ in $\mathbb R^2$ under the "action" of multiplication by $A$.
  • The correspondence from $\boldsymbol x$ to $A\boldsymbol x$ is a *function* from one set of vectors to another.
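A minimal numpy sketch of this "action" view, using the numbers from the example above. Multiplying by $A$ sends $\boldsymbol x$ to $\boldsymbol b$ and $\boldsymbol u$ to $\boldsymbol 0$; the last check also illustrates the note below: $A\boldsymbol x$ is exactly a linear combination of the columns of $A$, so the range of the transformation is the set of all such combinations.

```python
import numpy as np

# A "acts" on vectors in R^4 and produces vectors in R^2.
A = np.array([[4, -3, 1, 3],
              [2,  0, 5, 1]])

x = np.array([1, 1, 1, 1])
u = np.array([1, 4, -1, 3])

print(A @ x)  # [5 8]  -> multiplication by A transforms x into b
print(A @ u)  # [0 0]  -> multiplication by A transforms u into 0

# A @ x is the linear combination x1*a1 + ... + x4*a4 of the
# columns of A, so the range of x |-> A x is the set of all
# such combinations.
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))
print(np.array_equal(A @ x, combo))  # True
```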

  • A **transformation** (or **function**, or **mapping**) $T$ from $\mathbb R^n$ to $\mathbb R^m$ is a rule that assigns to each vector $\boldsymbol x$ in $\mathbb R^n$ a vector $T(\boldsymbol x)$ in $\mathbb R^m$. The set $\mathbb R^n$ is called the **domain** of $T$, and $\mathbb R^m$ is called the **codomain** of $T$. For $\boldsymbol x$ in $\mathbb R^n$, the vector $T(\boldsymbol x)$ in $\mathbb R^m$ is called the **image** of $\boldsymbol x$ (under the action of $T$). The set of all images $T(\boldsymbol x)$ is called the **range** of $T$. See Figure 2.
    [Figure 2: Domain, codomain, and range of a transformation $T:\mathbb R^n\to\mathbb R^m$.]

Note that when $T$ is the matrix transformation $\boldsymbol x\mapsto A\boldsymbol x$, the range of $T$ is the set of all linear combinations of the columns of $A$, since each image $T(\boldsymbol x)$ is of the form $x_1\boldsymbol a_1+\dots+x_n\boldsymbol a_n$.

Matrix Transformations


EXAMPLE 2 (projection transformation)

  • If $A=\begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}$, then the transformation $\boldsymbol x\mapsto A\boldsymbol x$ projects points in $\mathbb R^3$ onto the $x_1x_2$-plane, because
    $$\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\mapsto\begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}x_1\\x_2\\0\end{bmatrix}$$
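A quick numpy sketch of this projection on a sample point (the point is my own choice):

```python
import numpy as np

# Projection onto the x1x2-plane: the third coordinate is zeroed out.
A = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 0]])

p = np.array([3.0, -2.0, 7.0])
print(A @ p)  # [ 3. -2.  0.]
```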

EXAMPLE 3

  • Let $A=\begin{bmatrix}1&3\\0&1\end{bmatrix}$. The transformation $T:\mathbb R^2\to\mathbb R^2$ defined by $T(\boldsymbol x)=A\boldsymbol x$ is called a **shear transformation**. $T$ deforms a square as if the top of the square were pushed to the right while the base is held fixed.

[Figure: the square with vertices $(0,0),(2,0),(2,2),(0,2)$ and its image under the shear transformation $T$.]
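A short numpy check (my own illustration) of how $T$ moves the corners of the square $[0,2]\times[0,2]$: points on the base ($x_2=0$) are fixed, while the top edge is pushed $3x_2=6$ units to the right.

```python
import numpy as np

A = np.array([[1, 3],
              [0, 1]])

# Corners of the square, one per column.
square = np.array([[0, 2, 2, 0],
                   [0, 0, 2, 2]])

print(A @ square)
# [[0 2 8 6]
#  [0 0 2 2]]
# The base corners (0,0),(2,0) are unchanged; the top corners
# (2,2),(0,2) move to (8,2),(6,2): the top slides 6 units right.
```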

Linear Transformations

  • Theorem 5 in Section 1.4 shows that if $A$ is $m\times n$, then the transformation $\boldsymbol x\mapsto A\boldsymbol x$ has the properties
    $$A(\boldsymbol u+\boldsymbol v)=A\boldsymbol u+A\boldsymbol v\qquad\text{and}\qquad A(c\boldsymbol u)=cA\boldsymbol u$$
    for all $\boldsymbol u,\boldsymbol v$ in $\mathbb R^n$ and all scalars $c$. These properties identify the most important class of transformations in linear algebra.

DEFINITION

A transformation (or mapping) $T$ is **linear** if:

  (i) $T(\boldsymbol u+\boldsymbol v)=T(\boldsymbol u)+T(\boldsymbol v)$ for all $\boldsymbol u,\boldsymbol v$ in the domain of $T$;
  (ii) $T(c\boldsymbol u)=cT(\boldsymbol u)$ for all scalars $c$ and all $\boldsymbol u$ in the domain of $T$.

  • Every matrix transformation is a linear transformation. Important examples of linear transformations that are not matrix transformations will be discussed in Chapters 4 and 5.
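To make properties (i) and (ii) concrete, here is a minimal numpy sketch (the helper `looks_linear` is my own name, not from the book) that spot-checks them on random vectors; a matrix transformation passes, while a shifted map does not:

```python
import numpy as np

def looks_linear(T, n, trials=100, seed=0):
    """Spot-check T(u + v) == T(u) + T(v) and T(c u) == c T(u)
    on random vectors. Random testing can only refute linearity,
    never prove it."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        u, v = rng.normal(size=n), rng.normal(size=n)
        c = rng.normal()
        if not np.allclose(T(u + v), T(u) + T(v)):
            return False
        if not np.allclose(T(c * u), c * T(u)):
            return False
    return True

A = np.array([[1, 3],
              [0, 1]])
print(looks_linear(lambda x: A @ x, 2))        # True:  a matrix transformation
print(looks_linear(lambda x: A @ x + 1.0, 2))  # False: shifting breaks (i) and (ii)
```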

Connection between linear dependence and linear transformations: if $\{\boldsymbol v_1,\boldsymbol v_2,\boldsymbol v_3\}$ is linearly dependent and $T$ is a linear transformation, then $\{T(\boldsymbol v_1),T(\boldsymbol v_2),T(\boldsymbol v_3)\}$ is linearly dependent. Indeed, if $c_1\boldsymbol v_1+c_2\boldsymbol v_2+c_3\boldsymbol v_3=\boldsymbol 0$ with the $c_i$ not all zero, then applying $T$ and using linearity gives $c_1T(\boldsymbol v_1)+c_2T(\boldsymbol v_2)+c_3T(\boldsymbol v_3)=T(\boldsymbol 0)=\boldsymbol 0$.


  • Property (i) says that the result $T(\boldsymbol u+\boldsymbol v)$ of first adding $\boldsymbol u$ and $\boldsymbol v$ in $\mathbb R^n$ and then applying $T$ is the same as first applying $T$ to $\boldsymbol u$ and to $\boldsymbol v$ and then adding $T(\boldsymbol u)$ and $T(\boldsymbol v)$ in $\mathbb R^m$. These two properties lead easily to the following useful facts: if $T$ is a linear transformation, then
    $$T(\boldsymbol 0)=\boldsymbol 0\qquad(3)$$
    and
    $$T(c\boldsymbol u+d\boldsymbol v)=cT(\boldsymbol u)+dT(\boldsymbol v)\qquad(4)$$
    for all vectors $\boldsymbol u,\boldsymbol v$ in the domain of $T$ and all scalars $c,d$.

From the viewpoint of group theory, property (i) says that $T$ is a group homomorphism from the group $\langle\mathbb R^n,+\rangle$ to the group $\langle\mathbb R^m,+\rangle$; since a group homomorphism maps the identity element to the identity element, property (3) above follows.

  • Observe that if a transformation satisfies (4) for all $\boldsymbol u,\boldsymbol v$ and $c,d$, it must be linear. (Set $c=d=1$ for preservation of addition, and set $d=0$ for preservation of scalar multiplication.)
  • Repeated application of (4) produces a useful generalization:
    $$T(c_1\boldsymbol v_1+\dots+c_p\boldsymbol v_p)=c_1T(\boldsymbol v_1)+\dots+c_pT(\boldsymbol v_p)\qquad(5)$$
    • In engineering and physics, (5) is referred to as a *superposition principle*. Think of $\boldsymbol v_1,\dots,\boldsymbol v_p$ as signals that go into a system and $T(\boldsymbol v_1),\dots,T(\boldsymbol v_p)$ as the responses of that system to the signals. The system satisfies the superposition principle if, whenever an input is expressed as a linear combination of such signals, the system's response is the same linear combination of the responses to the individual signals.
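A small numpy illustration of (5) in this signals-and-responses reading (the "system" matrix and the weights are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))   # the "system": a linear map R^4 -> R^3
T = lambda x: A @ x

v1, v2, v3 = rng.normal(size=(3, 4))  # three input "signals" in R^4
c1, c2, c3 = 2.0, -1.0, 0.5           # weights of the combined input

# Response to the combined signal == same combination of the responses.
lhs = T(c1 * v1 + c2 * v2 + c3 * v3)
rhs = c1 * T(v1) + c2 * T(v2) + c3 * T(v3)
print(np.allclose(lhs, rhs))  # True
```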

Affine Transformations

  • An affine transformation $T:\mathbb R^n\to\mathbb R^m$ has the form $T(\boldsymbol x)=A\boldsymbol x+\boldsymbol b$, with $A$ an $m\times n$ matrix and $\boldsymbol b$ in $\mathbb R^m$. When $\boldsymbol b\neq\boldsymbol 0$, $T$ is not a linear transformation, because $T(\boldsymbol 0)=\boldsymbol b\neq\boldsymbol 0$ violates property (3).
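A two-line numpy confirmation (the particular $A$ and $\boldsymbol b$ are arbitrary):

```python
import numpy as np

A = np.array([[1, 3],
              [0, 1]])
b = np.array([5.0, -2.0])
T = lambda x: A @ x + b

# A linear transformation must send 0 to 0; this affine map does not.
print(T(np.zeros(2)))  # [ 5. -2.] == b, not the zero vector
```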

EXAMPLE 4

  • A company manufactures two products, B and C. We construct a "unit cost" matrix $U$, whose columns describe the "costs per dollar of output" for the products:
    $$U=\begin{bmatrix}.45&.40\\.25&.30\\.15&.15\end{bmatrix}\begin{matrix}\text{materials}\\\text{labor}\\\text{overhead}\end{matrix}$$
  • Let $\boldsymbol x=(x_1,x_2)$ be a "production" vector, corresponding to $x_1$ dollars of product B and $x_2$ dollars of product C, and define $T:\mathbb R^2\to\mathbb R^3$ by
    $$T(\boldsymbol x)=U\boldsymbol x=x_1\begin{bmatrix}.45\\.25\\.15\end{bmatrix}+x_2\begin{bmatrix}.40\\.30\\.15\end{bmatrix}=\begin{bmatrix}\text{total cost of materials}\\\text{total cost of labor}\\\text{total cost of overhead}\end{bmatrix}$$
  • The mapping $T$ transforms a list of production quantities (measured in dollars) into a list of total costs. The linearity of this mapping is reflected in two ways:
    • If production is increased by a factor of, say, 4, from $\boldsymbol x$ to $4\boldsymbol x$, then the costs increase by the same factor, from $T(\boldsymbol x)$ to $4T(\boldsymbol x)$.
    • If $\boldsymbol x$ and $\boldsymbol y$ are production vectors, then the total cost vector associated with the combined production $\boldsymbol x+\boldsymbol y$ is precisely the sum of the cost vectors $T(\boldsymbol x)$ and $T(\boldsymbol y)$.
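A short numpy run-through of both points, using the unit-cost matrix above and production vectors of my own choosing:

```python
import numpy as np

U = np.array([[0.45, 0.40],   # materials cost per dollar of B, C
              [0.25, 0.30],   # labor
              [0.15, 0.15]])  # overhead
T = lambda x: U @ x

x = np.array([100.0, 200.0])  # $100 of product B, $200 of product C
y = np.array([50.0, 30.0])

print(T(x))                                 # [125.  85.  45.] cost breakdown
print(np.allclose(T(4 * x), 4 * T(x)))      # True: scaling production scales costs
print(np.allclose(T(x + y), T(x) + T(y)))   # True: combined production adds costs
```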