Machine Learning By Andrew Ng (5)

我是全宇宙ENERGE的总量

Category: Machine Learning Notes

Article link: https://blog.csdn.net/weixin_43038346/article/details/98885954

Notes on Machine Learning By Andrew Ng (5)

Click here to see previous note.

Neural Networks: Representation

Non-linear hypotheses

Non-linear classification

You can add polynomial features to fit a non-linear classifier, but when the data has many features, the number of polynomial terms grows rapidly, so the model becomes expensive to fit and tends to overfit.
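To make the growth concrete, here is a minimal sketch that counts how many polynomial terms of a given degree exist for a given number of features. The function name `num_monomials` is hypothetical (it is not from the original notes), and the count uses the standard "combinations with repetition" formula.

```python
from math import comb


def num_monomials(n_features: int, degree: int) -> int:
    """Count distinct monomials of exactly `degree` in `n_features` variables.

    For degree 2 this counts every product x_i * x_j with i <= j
    (including squares), i.e. combinations with repetition.
    """
    return comb(n_features + degree - 1, degree)


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(n, num_monomials(n, 2), num_monomials(n, 3))
```

With 100 features there are already thousands of quadratic terms and over a hundred thousand cubic terms, which is one reason a different kind of hypothesis is attractive here.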

Neurons and the brain

Model representation I

Neuron model: Logistic unit

Notation

$a_i^{(j)} =$ “activation” of unit $i$ in layer $j$ .

$\Theta^{(j)} =$ matrix of weights controlling the function mapping from layer $j$ to layer $j + 1$.

$$
\begin{aligned}
a_1^{(2)} &= g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3)\\
a_2^{(2)} &= g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3)\\
a_3^{(2)} &= g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3)\\
h_\Theta(x) = a_1^{(3)} &= g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})
\end{aligned}
$$
If the network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j + 1$, then $\Theta^{(j)}$ has dimension $s_{j+1} \times (s_j+1)$. The extra column accounts for the bias unit.
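The dimension rule can be checked with a small sketch. The helper `theta_shapes` is a hypothetical name introduced here for illustration; it takes the number of units per layer (bias units not counted) and returns the shape of each weight matrix.

```python
def theta_shapes(layer_sizes):
    """Return the (rows, cols) shape of each Theta^(j).

    layer_sizes[j] is the number of units in layer j+1 of the notes'
    1-based numbering, excluding the bias unit. Each matrix maps a layer
    with s_j units (plus one bias unit) to a layer with s_{j+1} units.
    """
    return [
        (layer_sizes[j + 1], layer_sizes[j] + 1)
        for j in range(len(layer_sizes) - 1)
    ]


if __name__ == "__main__":
    # 3 inputs, 3 hidden units, 1 output: the network written out above.
    print(theta_shapes([3, 3, 1]))
```

For the three-layer network in the equations above, this gives a $3 \times 4$ matrix for $\Theta^{(1)}$ and a $1 \times 4$ matrix for $\Theta^{(2)}$.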

Model representation II

Forward propagation: Vectorized implementation

Let $\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3 = z^{(2)}_1$ and $a_1^{(2)} = g(z^{(2)}_1)$ .

Write this in vector form:

$$
\mathbf{x} = \left[ \begin{matrix} x_0\\ x_1\\ x_2\\ x_3 \end{matrix} \right],\quad \mathbf{z}^{(2)} = \left[ \begin{matrix} z_1^{(2)}\\ z_2^{(2)}\\ z_3^{(2)} \end{matrix} \right],
$$

$$
\begin{aligned}
\mathbf{z}^{(2)} &= \Theta^{(1)}\mathbf{x} \quad (\mathbf{x} = a^{(1)})\\
a^{(2)} &= g(\mathbf{z}^{(2)})
\end{aligned}
$$

Then add the bias unit $a_0^{(2)} = 1$ and compute $z^{(3)} = \Theta^{(2)}a^{(2)}$, $h_\Theta(x) = a^{(3)} = g(z^{(3)})$.
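The forward propagation steps can be sketched in plain Python. This is a minimal illustration, not the course's implementation: the names `sigmoid` and `forward` are introduced here, $g$ is assumed to be the logistic (sigmoid) function, and each weight matrix is represented as a list of rows.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Compute h_Theta(x) by forward propagation.

    thetas: list of weight matrices; thetas[j] has one row per unit in the
            next layer and one column per unit in the current layer plus bias.
    x:      input features WITHOUT the bias unit x_0.
    """
    a = list(x)
    for theta in thetas:
        a = [1.0] + a  # prepend the bias unit a_0 = 1
        z = [sum(w * v for w, v in zip(row, a)) for row in theta]  # z = Theta a
        a = [sigmoid(z_i) for z_i in z]  # a = g(z)
    return a


if __name__ == "__main__":
    # Arbitrary example weights for a 3-3-1 network (illustrative values only).
    theta1 = [[0.1, 0.2, -0.3, 0.4],
              [-0.5, 0.1, 0.2, 0.0],
              [0.3, -0.2, 0.1, 0.1]]
    theta2 = [[0.2, -0.4, 0.6, 0.1]]
    print(forward([theta1, theta2], [1.0, 2.0, 3.0]))
```

The loop body is the same for every layer, which is the point of the vectorized form: only the weight matrix changes from one layer to the next.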

Examples and intuitions I

Examples and intuitions II

Multi-class classification

Multiple output units: One-vs-all

Training set: $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})$,

where each $y^{(i)}$ is one of $\left[\begin{matrix}1\\0\\0\\0\end{matrix}\right], \left[\begin{matrix}0\\1\\0\\0\end{matrix}\right], \left[\begin{matrix}0\\0\\1\\0\end{matrix}\right],\left[\begin{matrix}0\\0\\0\\1\end{matrix}\right]$.
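As a rough sketch of how these label vectors are used, the code below builds the one-hot vector for a class index and picks the predicted class from a network's output vector. The names `one_hot` and `predict_class` are hypothetical, and classes are numbered from 0 here for convenience.

```python
def one_hot(label, num_classes):
    """Return a list with 1 at position `label` and 0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in range [0, num_classes)")
    return [1 if k == label else 0 for k in range(num_classes)]


def predict_class(h):
    """Pick the class whose output unit has the largest activation."""
    return max(range(len(h)), key=lambda k: h[k])


if __name__ == "__main__":
    print(one_hot(2, 4))
    # Illustrative output activations from a 4-unit output layer.
    print(predict_class([0.1, 0.7, 0.15, 0.05]))
```

Each output unit acts as its own one-vs-all classifier, so taking the largest activation is one common way to turn the output vector into a single class.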