Today I am starting to relearn linear algebra from the perspective of data analysis and machine learning, aiming to eventually be able to read professional papers. I did study it in college, but between forgetting and having originally learned it from a municipal-engineering angle, I now remember only some basic operations; the deeper material is gone.
Introduction to Linear Algebra (Fifth Edition), by Dr. Gilbert Strang.
The really impressive feature of linear algebra is how smoothly it takes that step into n-dimensional space. Your mental picture stays completely correct, even if drawing a ten-dimensional vector is impossible.
The second sentence offers a way to imagine operations in high-dimensional space: use the spatial intuition built from two- and three-dimensional vector operations to glimpse what happens in higher dimensions. As a simple device, take the first 2 or 3 components of several high-dimensional vectors, imagine the addition of those shorter vectors, and from there work out the corresponding high-dimensional operation. That way you are not blocked by the impossibility of visualizing high dimensions directly; this is an important way of thinking in linear algebra.
1.1 Vectors and Linear Combinations
- The combinations cu + dv + ew fill three-dimensional space. But if w happens to be cu + dv, that third vector is in the plane of the first two. The combinations of u, v, w will not go outside that uv plane. We do not get the full three-dimensional space.
- $\mathbb{R}^n$ is the set of all n-tuples of real numbers. (This explains that $\mathbb{R}$ stands for the real numbers.)
- In applying mathematics, many problems have two parts:
(1) Modelling part: express the problem by a set of equations.
(2) Computational part: solve those equations by a fast and accurate algorithm.
Find $n$ numbers $c_1, \dots, c_n$ so that $c_1\bm{v}_1 + \dots + c_n\bm{v}_n = \bm{b}$.
The elimination method in Chapter 2 succeeds far beyond n = 1000. For n greater than 1 billion, see the Gaussian elimination method in Chapter 11.
- When you reach Chapter 6, remember the new method of computing eigenvectors, which came from three physicists. The method can be found in "Neutrinos Lead to Unexpected Discovery in Basic Math".
Study reflection: go slowly, figure things out, read it like a novel, treat it as deep practice.
1.2 Lengths and Dot Products
The dot product (or inner product) can be used to describe a weight balance or the total income from trading. In Microsoft Excel, the function SUMPRODUCT can be used to calculate a dot product.
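As a small illustration of the dot product as a "sumproduct" (the quantities and prices below are made-up numbers):

```python
import numpy as np

# Hypothetical trade: units sold and price per unit (made-up numbers)
quantities = np.array([10, 4, 7])
prices = np.array([2.5, 4.0, 1.5])

# Dot product = SUMPRODUCT: 10*2.5 + 4*4.0 + 7*1.5
income = quantities.dot(prices)
print(income)  # 51.5
```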
Definition: The length $\|\bm{v}\|$ of a vector is the square root of $\bm{v}\cdot\bm{v}$:
$$\text{length} = \|\bm{v}\| = \sqrt{\bm{v}\cdot\bm{v}} = (v_1^2 + v_2^2 + \dots + v_n^2)^{1/2}$$
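A quick numpy check of this definition (the sample vector is arbitrary); `np.linalg.norm` computes the same value directly:

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])
length = np.sqrt(v.dot(v))   # sqrt(1 + 4 + 4) = 3
print(length)                # 3.0
print(np.linalg.norm(v))     # same value, computed by numpy
```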
Definition: A unit vector $\bm{u}$ is a vector whose length equals one. Then $\bm{u}\cdot\bm{u} = 1$.
According to a Q&A on Quora, unit vectors enable fast calculations among vectors: express each vector as a linear combination of perpendicular unit vectors, which separates the scalar arithmetic from the directions.
Unit vector: $\bm{u} = \dfrac{\bm{v}}{\|\bm{v}\|}$ is a unit vector in the same direction as $\bm{v}$.
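Normalizing a vector in numpy (the 3-4-5 vector is just a convenient example):

```python
import numpy as np

v = np.array([3.0, 4.0])
u = v / np.linalg.norm(v)    # (0.6, 0.8), same direction as v
print(u)
print(np.linalg.norm(u))     # length 1 (up to rounding)
```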
Right angles: The dot product is $\bm{v}\cdot\bm{w} = 0$ when $\bm{v}$ is perpendicular to $\bm{w}$.
Unit vectors $\bm{u}$ and $\bm{U}$ at angle $\theta$ have $\bm{u}\cdot\bm{U} = \cos\theta$. Certainly $|\bm{u}\cdot\bm{U}| \leq 1$.
The dot product reveals the exact angle $\theta$:
$$\cos\theta = \frac{\bm{v}\cdot\bm{w}}{\|\bm{v}\|\,\|\bm{w}\|}$$
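The angle formula in numpy, on two arbitrary sample vectors where the answer is known to be 45 degrees:

```python
import numpy as np

v = np.array([1.0, 0.0])
w = np.array([1.0, 1.0])
cos_theta = v.dot(w) / (np.linalg.norm(v) * np.linalg.norm(w))
theta = np.degrees(np.arccos(cos_theta))
print(theta)  # 45 degrees (up to rounding)
```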
The Schwarz inequality $|\bm{v}\cdot\bm{w}| \leq \|\bm{v}\|\,\|\bm{w}\|$, more properly named the Cauchy-Schwarz-Buniakowsky inequality because it was discovered in France, Germany, and Russia, is the most important inequality in mathematics.
Triangle inequality: $\|\bm{v}+\bm{w}\| \leq \|\bm{v}\| + \|\bm{w}\|$
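A numerical spot-check of the Schwarz and triangle inequalities on random vectors (the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(5)
w = rng.standard_normal(5)

# Schwarz: |v.w| <= ||v|| ||w||
schwarz = abs(v.dot(w)) <= np.linalg.norm(v) * np.linalg.norm(w)
# Triangle: ||v + w|| <= ||v|| + ||w||
triangle = np.linalg.norm(v + w) <= np.linalg.norm(v) + np.linalg.norm(w)
print(schwarz, triangle)  # True True
```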
Geometric mean $\leq$ arithmetic mean:
$$\sqrt{xy} \leq \frac{x+y}{2} \qquad \sqrt[3]{xyz} \leq \frac{x+y+z}{3}$$
(How to prove it? The two-number case follows from $(\sqrt{x}-\sqrt{y})^2 \geq 0$; the three-number case takes more work.)
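A numerical spot-check of both mean inequalities on random positive numbers (seed and range chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
x, y, z = rng.uniform(0.1, 10.0, size=3)

# Geometric mean <= arithmetic mean, for two and for three numbers
ok2 = np.sqrt(x * y) <= (x + y) / 2
ok3 = np.cbrt(x * y * z) <= (x + y + z) / 3
print(ok2, ok3)  # True True
```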
Problem 1:
Pick any numbers that add to $x + y + z = 0$. Find the angle between your vector $\bm{v} = (x, y, z)$ and the vector $\bm{w} = (z, x, y)$. Challenge question: explain why $\frac{\bm{v}\cdot\bm{w}}{\|\bm{v}\|\,\|\bm{w}\|}$ is always $-\frac{1}{2}$.
Step 1:
$$(x + y + z)^2 = x^2+y^2+z^2+2xy+2xz+2yz = 0 \;\Rightarrow\; x^2+y^2+z^2 = -2(xy+xz+yz)$$
Step 2:
$$\frac{\bm{v}\cdot\bm{w}}{\|\bm{v}\|\,\|\bm{w}\|} = \frac{xz+xy+yz}{\sqrt{x^2+y^2+z^2}\,\sqrt{x^2+y^2+z^2}} = \frac{xz+xy+yz}{x^2+y^2+z^2} = \frac{xz+xy+yz}{-2(xy+xz+yz)} = -\frac{1}{2}$$
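The algebra can be spot-checked numerically: pick random x and y, force z = -x - y so the sum is zero, and compute the cosine directly (the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
x, y = rng.standard_normal(2)
z = -x - y                     # enforce x + y + z = 0

v = np.array([x, y, z])
w = np.array([z, x, y])
cos_theta = v.dot(w) / (np.linalg.norm(v) * np.linalg.norm(w))
print(cos_theta)  # -0.5 (up to rounding)
```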
Problem 2:
Using `v = randn(3,1)` in MATLAB, create a random unit vector $\bm{u} = \frac{\bm{v}}{\|\bm{v}\|}$. Using `V = randn(3,30)`, create 30 more random unit vectors $\bm{U}_j$. What is the average size of the dot products $|\bm{u}\cdot\bm{U}_j|$? In calculus, the average is $\int_0^\pi |\cos\theta|\,d\theta/\pi = 2/\pi$.
I did not answer this question correctly in Octave. Because I have not used MATLAB or Octave for a long time, I tried to answer it with numpy instead. My first attempt used np.random.random, which draws every component from [0, 1), so all the vectors point into the positive octant and the average dot product comes out too large (about 0.87 in my run). Drawing components with np.random.randn spreads the directions over the whole sphere:
>>> import numpy as np
>>> v = np.random.randn(3)
>>> u = v / np.linalg.norm(v)
>>> V = np.random.randn(30, 3)
>>> U = V / np.linalg.norm(V, axis=1, keepdims=True)
>>> np.average(np.abs(U @ u))   # varies run to run
>>> 2 / np.pi
0.6366197723675814
Note that $2/\pi \approx 0.64$ is the average of $|\cos\theta|$ over a uniformly distributed angle in the plane; for uniformly random directions in three dimensions, the dot product with a fixed unit vector is uniform on $[-1, 1]$, so the 30-sample average should land near $0.5$.
1.3 Matrices
A crucial change in viewpoint: a matrix times a vector is the same as a linear combination of the matrix's columns.
$$x_1\left[\begin{matrix}1\\-1\\0\end{matrix}\right] + x_2\left[\begin{matrix}0\\1\\-1\end{matrix}\right] + x_3\left[\begin{matrix}0\\0\\1\end{matrix}\right]=\left[\begin{matrix}x_1\\x_2-x_1\\x_3-x_2\end{matrix}\right] = \left[\begin{matrix}1 & 0 & 0\\-1 & 1 & 0\\0 & -1 & 1\end{matrix}\right]\left[\begin{matrix}x_1\\x_2\\x_3\end{matrix}\right]$$
$$x_1\bm{u}+x_2\bm{v}+x_3\bm{w} = A\bm{x}$$
$A\bm{x}$ is also dot products with rows:
$$A\bm{x} = \left[\begin{matrix}1 & 0 & 0\\-1 & 1 & 0\\0 & -1 & 1\end{matrix}\right]\left[\begin{matrix}x_1\\x_2\\x_3\end{matrix}\right]=\left[\begin{matrix}(1,0,0)\cdot(x_1,x_2,x_3)\\(-1,1,0)\cdot(x_1,x_2,x_3)\\(0,-1,1)\cdot(x_1,x_2,x_3)\end{matrix}\right]$$
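Both pictures of $A\bm{x}$ — combination of columns and dot products with rows — can be checked in numpy (the vector $(3, 5, 9)$ is just a sample):

```python
import numpy as np

A = np.array([[1, 0, 0],
              [-1, 1, 0],
              [0, -1, 1]])
x = np.array([3, 5, 9])

# Column picture: Ax is a combination of the columns of A
by_columns = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]
# Row picture: each entry of Ax is a row of A dotted with x
by_rows = np.array([A[i].dot(x) for i in range(3)])

print(by_columns)  # [3 2 4]
print(by_rows)     # [3 2 4]
print(A @ x)       # [3 2 4]
```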
The difference matrix produces the differences between consecutive entries of the vector $\bm{x}$.
$$\left[\begin{matrix}1 & 0 & 0\\-1 & 1 & 0\\0 & -1 & 1\end{matrix}\right]\left[\begin{matrix}x_1\\x_2\\x_3\end{matrix}\right]=\left[\begin{matrix}x_1\\x_2-x_1\\x_3-x_2\end{matrix}\right]$$
$$\left[\begin{matrix}1 & 0 & 0 & 0\\-1 & 1 & 0 & 0\\0 & -1 & 1 & 0\\0 & 0 & -1 & 1\end{matrix}\right]\left[\begin{matrix}x_1\\x_2\\x_3\\x_4\end{matrix}\right]=\left[\begin{matrix}x_1\\x_2-x_1\\x_3-x_2\\x_4-x_3\end{matrix}\right]$$
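The difference matrix has a direct numpy counterpart: `np.diff` produces the same successive differences (the sample vector is arbitrary):

```python
import numpy as np

x = np.array([3, 5, 9, 14])
D = np.array([[1, 0, 0, 0],
              [-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1]])

print(D @ x)                                 # [3 2 4 5]
print(np.concatenate(([x[0]], np.diff(x))))  # same: x1 followed by the differences
```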
$C\bm{x}=\bm{b}$:
$$\left[\begin{matrix}x_1 - x_3\\x_2 - x_1\\x_3 - x_2\end{matrix}\right] = \left[\begin{matrix}1\\3\\5\end{matrix}\right]$$
The left sides add to 0 while the right sides add to 9, so there is no solution $x_1, x_2, x_3$.
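The no-solution argument can be confirmed numerically: for the cyclic difference matrix $C$, the entries of $C\bm{x}$ sum to zero for every $\bm{x}$, while the entries of $\bm{b} = (1, 3, 5)$ sum to 9 (the random $\bm{x}$ below is just a sample):

```python
import numpy as np

C = np.array([[1, 0, -1],
              [-1, 1, 0],
              [0, -1, 1]])

rng = np.random.default_rng(3)
x = rng.standard_normal(3)
lhs_sum = (C @ x).sum()

print(lhs_sum)                    # ~0 for any x
print(np.array([1, 3, 5]).sum())  # 9, so Cx = b has no solution
```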
Independence and Dependence
Independent columns $\Rightarrow$ invertible matrix
Dependent columns $\Rightarrow$ singular matrix
$$\bm{u} = \left[\begin{matrix}1\\-1\\0\end{matrix}\right] \quad \bm{v}=\left[\begin{matrix}0\\1\\-1\end{matrix}\right] \quad \bm{w} = \left[\begin{matrix}0\\0\\1\end{matrix}\right] \quad \bm{w}^* = \left[\begin{matrix}-1\\0\\1\end{matrix}\right]$$
$$A=[\bm{u}, \bm{v}, \bm{w}] \qquad C=[\bm{u},\bm{v},\bm{w}^*]$$
For matrix $A$, $\bm{w}$ is not in the plane of $\bm{u}$ and $\bm{v}$: the elements of $\bm{u}$ and of $\bm{v}$ each sum to 0, but the elements of $\bm{w}$ sum to 1.
For matrix $C$, $\bm{w}^*$ is in the plane of $\bm{u}$ and $\bm{v}$: the elements of $\bm{u}$, $\bm{v}$, and $\bm{w}^*$ all sum to 0.
$\bm{u}, \bm{v}, \bm{w}$ are independent. No combination except $0\bm{u} + 0\bm{v} + 0\bm{w} = \bm{0}$ gives $\bm{b} = \bm{0}$.
$\bm{u}, \bm{v}, \bm{w}^*$ are dependent. Other combinations, like $\bm{u} + \bm{v} + \bm{w}^*$, give $\bm{b} = \bm{0}$.
Independent columns: $A\bm{x}=\bm{0}$ has one solution. $A$ is an invertible matrix.
Dependent columns: $C\bm{x}=\bm{0}$ has many solutions. $C$ is a singular matrix.
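A quick numpy check of both conclusions: the determinant of $A$ is nonzero (invertible), while the determinant of $C$, whose columns are dependent, is zero and its rank is only 2:

```python
import numpy as np

A = np.array([[1, 0, 0],
              [-1, 1, 0],
              [0, -1, 1]])
C = np.array([[1, 0, -1],
              [-1, 1, 0],
              [0, -1, 1]])

print(np.linalg.det(A))          # 1.0 -> invertible
print(np.linalg.det(C))          # ~0  -> singular
print(np.linalg.matrix_rank(C))  # 2: only two independent columns
```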