Notes and summaries compiled while studying the Mathematics for Machine Learning Specialization on Coursera.
Table of Contents
- Part 1 Linear Algebra
- Part 2 Multivariate Calculus
- Part 3 PCA (Principal Component Analysis)
  - 1. 1-D datasets
  - 2. Definite symmetric matrix
  - 3. Higher-dimensional datasets
  - 4. Effect of linear transformations on mean and variance
  - 5. Dot product
  - 6. Inner product
  - 7. Projection
  - 8. PCA derivation
    - 8.1 Setting up ($x_n=\sum_{i=1}^D\beta_{in}b_i$, $\tilde{x}_n=\sum_{i=1}^M\beta_{in}b_i$, $J=\frac{1}{N}\sum_{n=1}^N\|x_n-\tilde{x}_n\|^2$, $S=\frac{1}{N}\sum_{n=1}^N x_nx_n^T$)
    - 8.2 Getting the coordinate/code $\beta_{in}$ ($\beta_{in}=x_n^Tb_i$)
    - 8.3 Rewriting the formula ($x_n-\tilde{x}_n=\sum_{i=M+1}^D (b_i^T x_n) b_i$)
    - 8.4 Redefining $J$ ($J = B'B'^TS$, $B' = (b'_{M+1},\cdots,b'_D)$, $b'_i \in \mathbb{R}^{D \times 1}$)
    - 8.5 Solving for $b_i$
  - 9. Key steps of the PCA algorithm
  - 10. PCA in high dimensions
Part 1 Linear Algebra
1. Vector operations
- Commutative: $r + s = s + r$
- $2r = r + r$
- $\|r\|^2 = \sum_{i} r_i^2$
1.1 Dot or inner product
The dot product is a particular case of an inner product.
$$r \cdot s = \sum_{i} r_i s_i$$
- Commutative: $r \cdot s = s \cdot r$
- Distributive: $r \cdot (s + t) = r \cdot s + r \cdot t$
- Associative over scalar multiplication: $r \cdot (a s) = a(r \cdot s)$
- $r \cdot r = \|r\|^2$
- $r \cdot s = \|r\| \|s\| \cos \theta$
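These identities can be checked numerically; a minimal NumPy sketch, with made-up example vectors:

```python
import numpy as np

# Hypothetical example vectors, chosen only for illustration.
r = np.array([3.0, 4.0])
s = np.array([1.0, 2.0])

dot = np.dot(r, s)                 # r . s = sum_i r_i s_i
norm_r = np.linalg.norm(r)         # ||r||

# r . r = ||r||^2
assert np.isclose(np.dot(r, r), norm_r ** 2)

# r . s = ||r|| ||s|| cos(theta)
cos_theta = dot / (norm_r * np.linalg.norm(s))
print(dot)  # 3*1 + 4*2 = 11.0
```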
1.2 Scalar and vector projection
- Scalar projection: e.g. the projection of vector $s$ onto vector $r$ is $\frac{r \cdot s}{\|r\|}$
- Vector projection: e.g. the projection of vector $s$ onto vector $r$ is $\frac{r \cdot s}{r \cdot r} r$
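Both projections translate directly into NumPy; a sketch with made-up vectors:

```python
import numpy as np

# Hypothetical vectors: project s onto r.
r = np.array([2.0, 0.0])
s = np.array([3.0, 4.0])

scalar_proj = np.dot(r, s) / np.linalg.norm(r)   # (r . s) / ||r||
vector_proj = np.dot(r, s) / np.dot(r, r) * r    # ((r . s) / (r . r)) r

print(scalar_proj)   # 3.0
print(vector_proj)   # [3. 0.]
```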
2. Basis
A basis is a set of $n$ vectors that:
- are not linear combinations of each other
- span the space
The space is then $n$-dimensional.
In a linear space $V$, if there exist $n$ elements $a_1,a_2,\dots,a_n$ such that:
- $a_1,a_2,\dots,a_n$ are linearly independent;
- every element $a$ of $V$ can be expressed as a linear combination of $a_1,a_2,\dots,a_n$,
then $a_1,a_2,\dots,a_n$ is called a basis of the linear space $V$, and $n$ is called its dimension. A linear space containing only the zero element has no basis; its dimension is defined to be 0.
A linear space of dimension $n$ is called an $n$-dimensional linear space, denoted $V_n$.
(Tongji University, Linear Algebra, 5th ed., Chapter 6, Section 2)
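Whether a set of vectors forms a basis can be checked via the rank of the matrix whose columns are those vectors; a small sketch with made-up vectors:

```python
import numpy as np

# Hypothetical candidate basis vectors for R^3, stacked as columns.
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([1.0, 1.0, 0.0])
v3 = np.array([1.0, 1.0, 1.0])
V = np.column_stack([v1, v2, v3])

# n vectors form a basis of R^n iff the matrix has full rank n:
# they are then linearly independent and span the space.
rank = np.linalg.matrix_rank(V)
print(rank)  # 3, so {v1, v2, v3} is a basis of R^3
```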
3. Matrices
An array of $m \times n$ numbers $a_{ij}$ ($i=1,2,\cdots,m$; $j=1,2,\dots,n$) arranged in $m$ rows and $n$ columns is called an $m$-by-$n$ matrix, or $m \times n$ matrix for short, written
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
(Tongji University, Linear Algebra, 5th ed., Chapter 2, Section 1)
Multiplying a matrix by a scalar
Together with matrix addition, this is known as the linear operations on matrices.
- Commutative: $\lambda A = A \lambda$
- Distributive: $(\lambda + \mu)A = \lambda A + \mu A$ and $\lambda(A + B) = \lambda A + \lambda B$
- Associative: $(\lambda \mu) A = \lambda (\mu A)$
Matrix–vector multiplication
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e \\ f \end{bmatrix} = \begin{bmatrix} ae + bf \\ ce + df \end{bmatrix}$$
- Multiplying a vector by a matrix can be understood as the matrix $A$ transforming the vector $r$ into $r'$: $Ar = r'$
- $A(nr) = n(Ar) = nr'$ (where $n$ is a scalar)
- Distributive: $A(r + s) = Ar + As$
- Identity matrix: $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
- Clockwise rotation by $\theta$: $\begin{bmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{bmatrix}$
- Determinant of a 2×2 matrix: $|A| = \det A = \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$
- Inverse of a 2×2 matrix: $\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$
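The 2×2 formulas above (rotation, determinant, inverse) can be verified numerically; a sketch with made-up values:

```python
import numpy as np

theta = np.pi / 2  # hypothetical angle: clockwise rotation by 90 degrees

# Clockwise rotation matrix from the text.
R = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# The x-axis unit vector rotates clockwise onto (0, -1).
assert np.allclose(R @ np.array([1.0, 0.0]), [0.0, -1.0])

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # ad - bc
A_inv = np.array([[ A[1, 1], -A[0, 1]],
                  [-A[1, 0],  A[0, 0]]]) / det

# The closed-form 2x2 inverse matches NumPy's.
assert np.allclose(A_inv, np.linalg.inv(A))
```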
Matrix–matrix multiplication
- Multiplying matrices $A$ and $B$, with $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times l}$:
$$AB = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1l} \\ b_{21} & b_{22} & \cdots & b_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ b_{n1} & b_{n2} & \dots & b_{nl} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1l} \\ c_{21} & c_{22} & \cdots & c_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ c_{m1} & c_{m2} & \dots & c_{ml} \end{bmatrix} = C$$
$$c_{ik} = (AB)_{ik} = \sum_{j = 1}^n a_{ij}b_{jk}$$
- Einstein summation convention for multiplying matrices $A$ and $B$: writing $c_{ik} = a_{ij}b_{jk}$, summation over the repeated index $j$ is implied, giving $AB = C$.
- $AA^{-1} = A^{-1}A = I$ (where $A$ is an invertible square matrix)
- Matrix multiplication is not commutative
- Distributive: $A(B+C) = AB + AC$ and $(B+C)A = BA + CA$
- Associative: $(AB)C = A(BC)$ and $\lambda(AB) = (\lambda A)B = A(\lambda B)$ (where $\lambda$ is a scalar)
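The entrywise formula $c_{ik} = \sum_j a_{ij}b_{jk}$ and the Einstein-convention form map directly onto `np.einsum`; a sketch with made-up matrices:

```python
import numpy as np

# Hypothetical A in R^{2x3} and B in R^{3x2}.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

C = A @ B                                 # c_ik = sum_j a_ij b_jk
C_einsum = np.einsum('ij,jk->ik', A, B)   # Einstein convention: repeated j is summed

assert np.allclose(C, C_einsum)
# Non-commutativity shows up even in the shapes: B @ A is 3x3, A @ B is 2x2.
print(C)  # [[ 4.  5.]
          #  [10. 11.]]
```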
4. Change of basis
Change from an original basis to a new, primed basis. The columns of the transformation matrix $P$ are the new basis vectors expressed in the original coordinate system.
Let $\alpha = (a_1,a_2,\dots,a_n)$ denote the old basis and $\beta = (b_1,b_2,\dots,b_n)$ the new basis. Then
$$\beta = \alpha P$$
or equivalently,
$$\beta^T = P^T \alpha^T$$
Let $r'$ denote the coordinates of a vector in the new basis and $r$ its coordinates in the original basis. Since
$$\alpha r = \beta r' = \alpha P r'$$
it follows that
$$r' = P^{-1}r$$
(cf. Tongji University, Linear Algebra, 5th ed., Chapter 6, Section 3)
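In practice $r' = P^{-1}r$ is computed by solving $Pr' = r$ rather than forming $P^{-1}$ explicitly; a sketch with a made-up basis:

```python
import numpy as np

# Hypothetical new basis vectors, written as the columns of P
# in the original coordinate system.
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

r = np.array([3.0, 2.0])        # coordinates in the original basis
r_new = np.linalg.solve(P, r)   # r' = P^{-1} r

# Reconstructing in the original basis recovers r: P r' = r.
assert np.allclose(P @ r_new, r)
print(r_new)  # [1. 2.]
```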
Orthonormal basis
If a matrix $A$ is orthonormal (all of its columns are of unit size and orthogonal to each other), then
$$A^T = A^{-1}$$
A matrix $A$ is orthogonal if and only if its column vectors are all unit vectors and pairwise orthogonal. Equivalently,
$$A^TA = E$$
that is,
$$\begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_n^T \end{bmatrix} (a_1,a_2,\dots,a_n) = E$$
i.e.
$$(a_i^Ta_j) = (\delta_{ij})$$
which amounts to $n^2$ relations:
$$a_i^Ta_j = \delta_{ij} = \begin{cases} 1 & \quad \text{when } i = j \\ 0 & \quad \text{when } i \neq j \end{cases}$$
Because $A^T = A^{-1}$, the same conclusions also hold for the row vectors of $A$.
(Further reading: Tongji University, Linear Algebra, 5th ed., Chapter 6, Section 3)
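The property $A^TA = E$ is easy to confirm numerically for a rotation matrix, which is orthogonal; a sketch with a made-up angle:

```python
import numpy as np

theta = 0.3  # hypothetical rotation angle
A = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# Columns are unit vectors and pairwise orthogonal, so A^T A = I ...
assert np.allclose(A.T @ A, np.eye(2))
# ... and therefore A^T = A^{-1}.
assert np.allclose(A.T, np.linalg.inv(A))
```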
5. Gram-Schmidt process for constructing an orthonormal basis
Start with $n$ linearly independent basis vectors $v = \{ v_1,v_2,\dots,v_n \}$. Then
$$e_1 = \frac{v_1}{\|v_1\|}$$
$$u_2 = v_2 - \frac{v_2 \cdot e_1}{e_1 \cdot e_1}e_1 = v_2 - (v_2 \cdot e_1)e_1, \quad \text{so} \quad e_2 = \frac{u_2}{\|u_2\|}$$
… and so on, with $u_3$ being the remnant part of $v_3$ not composed of the preceding unit vectors $e_1, e_2$, normalized to give $e_3$, and so forth.
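The full procedure generalizes the steps above: subtract from each $v_i$ its components along the already-built $e_1,\dots,e_{i-1}$, then normalize. A minimal sketch (input vectors are made up):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        u = v.astype(float)
        for e in basis:
            u = u - np.dot(v, e) * e   # remove the component of v along e
        basis.append(u / np.linalg.norm(u))
    return np.array(basis)

# Hypothetical linearly independent input vectors.
vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
E = gram_schmidt(vs)

# The rows of E form an orthonormal basis: E E^T = I.
assert np.allclose(E @ E.T, np.eye(3))
```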