Machine Learning (Hung-yi Lee, 李宏毅)
Machine Learning and Deep Learning
1. Functions
- Regression: e.g., predicting PM2.5 values
- Classification: e.g., choosing a move in chess
- Others: structured learning (the output is a structured object)
2. The procedure for finding the function
- Functions with unknown parameters
- Define loss from training data
- Optimization
- gradient descent
A) randomly set an initial value w
A) randomly set an initial value $w$ -- wait, step A is on the previous line; B) compute the gradient $\partial L/\partial w$
C) update w iteratively
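The three steps above can be sketched on a toy 1-D loss; the loss function $L(w) = (w-3)^2$, learning rate, and iteration count here are illustrative assumptions, not from the lecture.

```python
# Toy loss L(w) = (w - 3)^2 (assumed for illustration); its minimum is at w = 3.
def dL_dw(w):
    # Analytic derivative of L(w) = (w - 3)^2.
    return 2 * (w - 3)

w = 0.0          # A) set an initial value w
eta = 0.1        # learning rate (hyperparameter)
for _ in range(100):
    g = dL_dw(w)       # B) compute dL/dw
    w = w - eta * g    # C) update w iteratively

print(round(w, 4))  # converges toward the minimizer w = 3
```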
3. Models
- linear model
- sophisticated model
- linear curves:
any piecewise linear curve = constant + a sum of hard-sigmoid functions
activation function:
1. hard sigmoid: can be represented by a sum of two ReLUs
2. rectified linear unit (ReLU): $\max(0, wx+b)$
3. soft sigmoid: $\cfrac{c}{1+e^{-(wx+b)}} = c \cdot \mathrm{sigmoid}(wx+b)$
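A quick numeric check that a hard sigmoid can be built from two ReLUs (the second entering with a negative coefficient); the ramp placement between 0 and 1 is an assumed choice for illustration.

```python
def relu(x):
    return max(0.0, x)

def hard_sigmoid(x):
    # 0 for x <= 0, linear ramp x on (0, 1), saturates at 1 for x >= 1;
    # built as relu(x) - relu(x - 1), i.e., a weighted sum of two ReLUs.
    return relu(x) - relu(x - 1)

print([hard_sigmoid(x) for x in (-1.0, 0.5, 2.0)])  # [0.0, 0.5, 1.0]
```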
- Beyond piecewise curves
approximate a continuous curve by a piecewise linear curve;
to get a good approximation, we need sufficiently many pieces
- New model: More Features
$y = b + \sum_{i}{c_i \cdot \mathrm{sigmoid}\left(\sum_{j}w_{ij}x_j+b_i\right)}$
$r_i = W_i x + b_i,\quad a_i = \mathrm{sigmoid}(r_i)$
$y = b + CA$
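The forward pass of this new model can be sketched in a few lines of NumPy; the shapes (3 input features, 4 sigmoid units) and random weights are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # w_ij: weights into each sigmoid unit
b_vec = rng.normal(size=4)    # b_i: per-unit biases
C = rng.normal(size=4)        # c_i: output weights
b = 0.5                       # scalar output bias

def forward(x):
    r = W @ x + b_vec              # r_i = W_i x + b_i
    a = 1.0 / (1.0 + np.exp(-r))   # a_i = sigmoid(r_i)
    return b + C @ a               # y = b + CA

x = np.ones(3)
print(float(forward(x)))
```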
optimization of new model:
$\varTheta = [W\ B\ C]$ (all unknown parameters collected into one vector)
$gradient = \begin{bmatrix} \cfrac{\partial L}{\partial\varTheta_1} \\ \cfrac{\partial L}{\partial\varTheta_2} \\ \vdots \\ \cfrac{\partial L}{\partial\varTheta_n} \end{bmatrix}$
$g = \nabla{L(\varTheta^0)}$
$\begin{bmatrix}\varTheta_1^1 \\ \varTheta_2^1 \\ \vdots \\ \varTheta_n^1 \end{bmatrix}=\begin{bmatrix} \varTheta_1^0 \\ \varTheta_2^0 \\ \vdots \\ \varTheta_n^0 \end{bmatrix} - \eta \cdot g$
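The vector update rule can be sketched with a finite-difference estimate of the gradient over the whole parameter vector $\varTheta$; the toy quadratic loss and its minimizer are assumptions for illustration.

```python
import numpy as np

def loss(theta):
    # Toy loss with assumed minimum at theta = [1, 2, 3].
    return float(np.sum((theta - np.array([1.0, 2.0, 3.0])) ** 2))

def numerical_gradient(theta, eps=1e-6):
    # Central finite differences: one partial derivative per component.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (loss(theta + d) - loss(theta - d)) / (2 * eps)
    return g

theta = np.zeros(3)   # theta^0
eta = 0.1
for _ in range(200):
    theta = theta - eta * numerical_gradient(theta)  # theta^{t+1} = theta^t - eta * g

print(np.round(theta, 3))  # approaches [1, 2, 3]
```

In practice the gradient is computed with backpropagation rather than finite differences, but the update rule is the same.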
- epoch | batch | update | iteration
number of samples: 1000
batch size: 10
updates (iterations) per epoch: 1000 / 10 = 100
epochs: 1
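The batching arithmetic above is a one-liner: 1000 samples split into batches of 10 give 100 parameter updates in one epoch.

```python
num_samples = 1000
batch_size = 10
# One update (iteration) per batch, so one epoch = num_samples / batch_size updates.
updates_per_epoch = num_samples // batch_size
print(updates_per_epoch)  # 100
```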
video link: https://speech.ee.ntu.edu.tw/~hylee/ml/2021-spring.html