Neural Network
1. Model Summary
At a very simple level, neurons are basically computational units that take inputs (dendrites) as electrical inputs (called “spikes”) that are channeled to outputs (axons). In our model, our dendrites are like the input features $x_1 \cdots x_n$, and the output is the result of our hypothesis function. In this model our $x_0 = 1$ input node is sometimes called the “bias unit”; it is always equal to 1. In neural networks, we use the same logistic function as in classification, $\frac{1}{1 + e^{-\theta^T x}}$, yet we sometimes call it a sigmoid (logistic) activation function. In this situation, our “theta” parameters are sometimes called “weights”.
Visually, a simplistic representation looks like:
$$\begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} \rightarrow [\qquad] \rightarrow h_\theta(x)$$
three layers: input layer / hidden layer / output layer
$a_i^{(j)}$ : activation of unit $i$ in layer $j$
$\Theta^{(j)}$ : matrix of weights controlling the function mapping from layer $j$ to layer $j+1$
If layer $j$ has $s_j$ units and layer $j+1$ has $s_{j+1}$ units, then the size of $\Theta^{(j)}$ is $s_{j+1} \times (s_j + 1)$
$L$ : number of layers
$s_l$ : number of units in layer $l$
Number of inputs: the dimension of the features $x^{(i)}$
Binary Classification: 1 output unit
K-class Classification: K output units
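For example, applying the dimension rule above: if layer $j$ has $s_j = 2$ units and layer $j+1$ has $s_{j+1} = 4$ units, then $\Theta^{(j)}$ is a $4 \times 3$ matrix (the $+1$ accounts for the bias unit).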
2. Forward Propagation
- Add the bias unit $a_0^{(l)} = 1$ first
- $z^{(l+1)} = \Theta^{(l)} a^{(l)}$
- $a^{(l+1)} = g(z^{(l+1)})$, where $g$ is the sigmoid function
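A minimal numpy sketch of these three steps for a single example (the function and variable names are illustrative, not from any particular library):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid (logistic) activation function g(z)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(x, Thetas):
    """Forward propagation for one example.

    x      : feature vector of shape (n,)
    Thetas : list of the weight matrices Theta^{(1)}, ..., Theta^{(L-1)}

    Returns the activations a^{(1)}, ..., a^{(L)}.
    """
    a = x
    activations = [a]
    for Theta in Thetas:
        a = np.concatenate(([1.0], a))  # add the bias unit a_0 = 1 first
        z = Theta @ a                   # z^{(l+1)} = Theta^{(l)} a^{(l)}
        a = sigmoid(z)                  # a^{(l+1)} = g(z^{(l+1)})
        activations.append(a)
    return activations
```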
3. Cost Function
The regularization term excludes the bias weights (the first column of each $\Theta^{(l)}$).
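For reference, the standard regularized cost function for a network with $K$ output units; note that the regularization sum starts at $i = 1$, skipping the bias column:

$$J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\big(h_\Theta(x^{(i)})\big)_k + \big(1 - y_k^{(i)}\big) \log\big(1 - (h_\Theta(x^{(i)}))_k\big) \right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \big(\Theta_{j,i}^{(l)}\big)^2$$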
4. Backpropagation Algorithm
Let $\delta_j^{(l)}$ denote the error of node $j$ in layer $l$. Then

$$\delta^{(L)} = a^{(L)} - y$$
$$\delta^{(l)} = (\Theta^{(l)})^T \delta^{(l+1)} .* g'(z^{(l)}) \qquad (l \neq L,\ l \neq 1)$$

where

$$g'(z^{(l)}) = a^{(l)} .* (1 - a^{(l)})$$
One thing to note: the basic algorithm processes one training example at a time, accumulating the gradients over all examples.
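A sketch of these equations in numpy for a single training example, reusing `sigmoid` and the `Thetas` layout from the forward-propagation sketch above (names are illustrative):

```python
def backpropagate(x, y, Thetas):
    """Gradients of the (unregularized) cost for one example (x, y)."""
    # Forward pass, keeping the bias-augmented activation of each layer.
    a = np.concatenate(([1.0], x))
    activations = [a]
    for Theta in Thetas[:-1]:
        a = np.concatenate(([1.0], sigmoid(Theta @ a)))
        activations.append(a)
    a_L = sigmoid(Thetas[-1] @ activations[-1])  # output layer, no bias unit

    delta = a_L - y                              # delta^{(L)} = a^{(L)} - y
    grads = [None] * len(Thetas)
    for l in range(len(Thetas) - 1, -1, -1):
        # Gradient of Theta^{(l)} is the outer product of delta^{(l+1)}
        # with the bias-augmented a^{(l)}.
        grads[l] = np.outer(delta, activations[l])
        if l > 0:
            a = activations[l]
            # delta^{(l)} = (Theta^{(l)})^T delta^{(l+1)} .* a^{(l)} .* (1 - a^{(l)}),
            # then drop the bias component before moving down a layer.
            delta = ((Thetas[l].T @ delta) * a * (1.0 - a))[1:]
    return grads
```

In a full implementation these per-example gradients are summed over all $m$ examples, divided by $m$, and the regularization term is added for the non-bias weights.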
5. Unrolling Parameters
Unroll the matrices into a single long vector, then reshape to get them back:
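A numpy sketch (the helper name and `layer_sizes` argument are illustrative):

```python
# Unroll all weight matrices into one vector, e.g. for an optimizer
# that expects a 1-D parameter vector.
theta_vec = np.concatenate([Theta.ravel() for Theta in Thetas])

def reshape_params(theta_vec, layer_sizes):
    """Recover the matrices from the vector, given [s_1, ..., s_L]."""
    Thetas, start = [], 0
    for s_in, s_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        end = start + s_out * (s_in + 1)  # Theta^{(l)} is s_{l+1} x (s_l + 1)
        Thetas.append(theta_vec[start:end].reshape(s_out, s_in + 1))
        start = end
    return Thetas
```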
6. Gradient Checking
Gradient checking is only for verifying that the backpropagation gradients are correct; it is very slow, so turn it off before actually training!
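The standard check is a two-sided finite-difference approximation, compared elementwise against the backpropagation gradient. A sketch, assuming `cost` maps the unrolled parameter vector to $J(\Theta)$:

```python
def numerical_gradient(cost, theta_vec, eps=1e-4):
    """Two-sided finite-difference approximation of the gradient."""
    grad = np.zeros_like(theta_vec)
    for i in range(theta_vec.size):
        step = np.zeros_like(theta_vec)
        step[i] = eps
        # dJ/dtheta_i ~ (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)
        grad[i] = (cost(theta_vec + step) - cost(theta_vec - step)) / (2 * eps)
    return grad
```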
7. Random Initialization
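Initializing all weights to zero makes every unit in a layer compute the same function, so symmetry is never broken. Instead, initialize each entry of $\Theta^{(l)}$ to a random value in $[-\epsilon, \epsilon]$; a numpy sketch (the value of $\epsilon$ is illustrative):

```python
def random_init(s_in, s_out, eps=0.12):
    # Each weight drawn uniformly from [-eps, eps] to break symmetry.
    return np.random.rand(s_out, s_in + 1) * 2 * eps - eps
```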
8. Network Architecture
Reasonable defaults: a single hidden layer, or more than one hidden layer with the same number of units in each.