Hung-yi Lee 2020 Machine Learning [Study Notes] P12: Brief Intro of Deep Learning

_bh

Last modified 2023-09-04 23:39:43

First published 2023-09-01 15:20:27

Article link: https://blog.csdn.net/weixin_51330846/article/details/132609657

Ups and downs of Deep Learning: a turbulent history

Three Steps of Deep Learning

Step1： Function Set, Neural Network!

Fully Connected Feedforward Network

Matrix Operation

So the meaning of the NN：

Example ：

FAQ

Step2： Goodness ( Loss )

Step3： GD

Backpropagation

A Universality Theorem

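Thanks to the Bilibili uploader who reposted the course:

[Hung-yi Lee 2020 Machine Learning and Deep Learning (complete edition), Mandarin] https://www.bilibili.com/video/BV1JE411g7XF/?share_source=copy_web&vd_source=262e561fe1b31fc2fea4d09d310b466d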

Ups and downs of Deep Learning: a turbulent history

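1958: Perceptron (a linear model)
1969: Perceptron has limitations (the limits of a linear model)
1980s: Multi-layer perceptron, not significantly different from today's DNN
1986: Backpropagation, but networks with more than 3 hidden layers still could not be trained to give good results
1989: 1 hidden layer is "good enough", why deep?
2006: RBM (restricted Boltzmann machine) initialization (breakthrough? useless!)
2009: GPUs accelerate matrix operations
2011: Deep learning is applied to speech recognition
2012: Deep learning wins the ILSVRC image recognition competition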

Three Steps of Deep Learning

Step1： Function Set, Neural Network!

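We can connect neurons in different ways, and each way gives a different network structure.

Each neuron is a logistic regression unit with its own weights $w$ and bias $b$. The parameters of all the neurons taken together are the network's parameters $\theta$.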
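As a minimal sketch of a single neuron viewed as a logistic regression unit: it computes the weighted sum $z = w \cdot x + b$ and passes it through the sigmoid. The function names and the numeric values below are illustrative assumptions, not taken from the lecture notes.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    """One neuron: weighted sum of the inputs plus the bias, then sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Illustrative values (assumptions)
x = [1.0, -1.0]   # input vector
w = [1.0, -2.0]   # this neuron's weights
b = 1.0           # this neuron's bias

a = neuron(x, w, b)   # z = 1*1 + (-2)*(-1) + 1 = 4
print(round(a, 2))    # 0.98
```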

The structure is something you design yourself. One common way to connect neurons is the fully connected feedforward network.

Fully Connected Feedforward Network

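Consider an example (we assume every neuron uses the sigmoid function).

A network like the one in the figure above, where every parameter has already been fixed, is a single function.

A structure like the one in the figure above whose parameters are not yet fixed actually defines a function set!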
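The distinction between a function and a function set can be sketched in code. Below, `make_network` is a hypothetical helper that takes one concrete parameter setting and returns one function; the 2-2-2 structure and both parameter settings (`theta_a`, `theta_b`) are illustrative assumptions rather than values read from the figure.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(x, W, b):
    """Fully connected layer: each row of W holds the weights of one neuron."""
    return [sigmoid(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def make_network(params):
    """Fix the parameters [(W1, b1), (W2, b2), ...] and get back one function."""
    def f(x):
        for W, b in params:
            x = layer(x, W, b)
        return x
    return f

# Two hypothetical parameter settings for the same 2-2-2 structure
theta_a = [([[1.0, -2.0], [-1.0, 1.0]], [1.0, 0.0]),
           ([[2.0, -1.0], [-2.0, -1.0]], [0.0, 0.0])]
theta_b = [([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),
           ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0])]

f_a = make_network(theta_a)   # one member of the function set
f_b = make_network(theta_b)   # another member of the same function set

x = [1.0, -1.0]
print(f_a(x))   # roughly [0.86, 0.11]
print(f_b(x))   # roughly [0.73, 0.50]
```

The structure (layer sizes and connections) is shared, so the set of all functions reachable by varying the parameters is the function set; picking one parameter setting picks one function out of it.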

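Fully connected: every neuron in one layer is connected to every neuron in the next layer.

Feedforward: values propagate forward, from the input toward the output.

The layers are the Input Layer, the Hidden Layers, and the Output Layer.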

Matrix Operation

The computation in a neural network is usually expressed as matrix operations.

The same view applies to the whole NN.
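Written out, and assuming the usual notation (not defined in these notes) where $W^l$ and $b^l$ are the weight matrix and bias vector of layer $l$, $a^l$ is that layer's output, $L$ is the number of layers, and $\sigma$ is the sigmoid applied element-wise, the layer-by-layer computation can be sketched as:

$$
a^1 = \sigma(W^1 x + b^1), \quad a^2 = \sigma(W^2 a^1 + b^2), \quad \dots, \quad y = \sigma(W^L a^{L-1} + b^L)
$$

Nesting these gives the whole network as one expression:

$$
y = f(x) = \sigma\big(W^L \cdots \sigma\big(W^2\, \sigma(W^1 x + b^1) + b^2\big) \cdots + b^L\big)
$$

For one layer with two neurons and illustrative numbers (assumed values, chosen only to show the arithmetic):

$$
\sigma\left(\begin{bmatrix} 1 & -2 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \sigma\left(\begin{bmatrix} 4 \\ -2 \end{bmatrix}\right) \approx \begin{bmatrix} 0.98 \\ 0.12 \end{bmatrix}
$$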

So the meaning of the NN：

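The hidden layers in the middle act as a feature extractor, replacing the manual feature engineering used before.

The outputs of the hidden layers can be thought of as a new set of features, and the output layer can then be viewed as a multi-class classifier working on those features. This is why the neurons of the output layer usually use the softmax function.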
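A minimal sketch of softmax, assuming the output layer has produced one real-valued score per class. The scores below are illustrative, and subtracting the maximum is a common numerical-stability step rather than something stated in the notes.

```python
import math

def softmax(z):
    """Turn a list of real scores into probabilities that sum to 1."""
    m = max(z)                              # subtract the max for numerical stability
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores from the last layer of a 3-class problem
z = [3.0, 1.0, -3.0]
y = softmax(z)
print([round(p, 2) for p in y])   # [0.88, 0.12, 0.0]
```

Softmax keeps the ordering of the scores while making every output positive and the outputs sum to 1, so each output can be read as a confidence for its class.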

Example ：

Input: a 256-dimensional feature vector

Output: a 10-dimensional vector (each dimension represents the confidence of one digit)

You need to decide the network structure so that a good function is contained in your function set.
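To make the input and output shapes concrete, here is a hedged sketch of one possible structure for this example. The hidden layer sizes (32 and 32), the random initialization range, and the random input standing in for real data are all assumptions made for illustration; the network is untrained, so the predicted digit is meaningless and only the shapes matter.

```python
import math
import random

def sigmoid_layer(x, W, b):
    """Fully connected layer with sigmoid activation."""
    return [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + bi)))
            for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary choice for this sketch)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return W, b

def classify(x, layers):
    """Sigmoid hidden layers, then a softmax output layer."""
    a = x
    for W, b in layers[:-1]:
        a = sigmoid_layer(a, W, b)
    W, b = layers[-1]
    scores = [sum(w * ai for w, ai in zip(row, a)) + bi for row, bi in zip(W, b)]
    return softmax(scores)

rng = random.Random(0)
sizes = [256, 32, 32, 10]   # 256 inputs, two hypothetical hidden layers, 10 outputs
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

x = [rng.random() for _ in range(256)]   # stand-in for a real 256-dimensional input
y = classify(x, layers)                  # 10 confidences, one per digit
digit = max(range(10), key=lambda i: y[i])
print(len(y), digit)
```

Changing `sizes` changes the structure and therefore the function set; training (not shown here) is what would pick a good function from that set.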

FAQ

Q: How many layers, and how many neurons in each layer?
A: Trial and error, plus intuition.
Deep learning performs well on image and speech recognition (where designing the NN structure is comparatively easy), but less well on NLP.
Q: Can the structure be determined automatically?
A: There is no universal method yet.
Q: Can we design the structure ourselves (not fully connected)?
A: Of course, for example CNN.