Neural Network Brief Intro+Softmax+2-layer nn Implementation

Sycret

Category: Study Notes. Tags: neural networks, machine learning

Article link: https://blog.csdn.net/qq_34131692/article/details/109777471

Neural Network

1. Why Do We Need NNs?

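- Mimic the Brain
Earlier approaches built on hand-crafted expert knowledge (symbolism) were no longer working well, so a bio-inspired (connectionist) approach was adopted instead.

- Nonlinearity

Nonlinear decision boundaries are not possible by just combining features.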

2. Logistic Unit

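Activation Function (nonlinear)
Definition:
$x_0=1$, with $\theta_0$ as the bias weight; $g(\cdot)$ is a nonlinear activation function (it must be nonlinear; otherwise, after all the multiplications and additions, the result is still just linear regression and nothing has been gained):
$h_\theta(x)=g(\theta_0x_0+\theta_1x_1+\theta_2x_2+\theta_3x_3)=g(\boldsymbol\theta^T\boldsymbol{x})$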
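
To make the definition concrete, here is a minimal sketch of a single logistic unit in plain Python. It assumes the sigmoid as the activation $g$ (the text only requires $g$ to be nonlinear), and the weights are hypothetical values chosen purely for illustration.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) activation: one common choice for g (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_unit(theta, x):
    """Compute h_theta(x) = g(theta^T x) for inputs x_1..x_n.

    theta[0] is the bias weight; the constant input x_0 = 1 is prepended here.
    """
    x_with_bias = [1.0] + list(x)
    z = sum(t * xi for t, xi in zip(theta, x_with_bias))
    return sigmoid(z)

# Hypothetical weights, for illustration only.
theta = [-1.0, 2.0, 0.5, -0.5]
h = logistic_unit(theta, [1.0, 2.0, 3.0])  # pre-activation z = 0.5
```

Dropping `sigmoid` from `logistic_unit` would leave only the weighted sum, which is exactly the linear-regression case the text warns about.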

3. Neural Network

Input Layer -> Hidden Layer -> Output Layer
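[Figure: network diagram; image not available]

(Biases are not drawn in the figure.) The superscript denotes the layer a neuron belongs to. In principle every neuron could use a different activation function, but in practice the same one is used throughout, since there is little reason to mix them.

Mathematical formulation:
Notation:
[Figure: notation definitions; image not available]

With this notation, the weight matrix $\boldsymbol{W}$ has 4 rows and 3 columns: 4 is the input dimension and 3 is the number of neurons.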
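
The shape convention can be sketched as a single-layer forward pass. This is only an illustration under stated assumptions: the sigmoid activation, the concrete numbers, and the choice of treating the first input as the constant bias input are not taken from the original figure, and `layer_forward` is a hypothetical helper name.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, x):
    """Return the activations of one layer.

    W has shape (input_dim, num_neurons), 4 x 3 here, so column j holds
    the weights of neuron j. x has length input_dim.
    """
    n_in, n_out = len(W), len(W[0])
    assert len(x) == n_in, "x must have one entry per row of W"
    return [
        sigmoid(sum(W[i][j] * x[i] for i in range(n_in)))
        for j in range(n_out)
    ]

# Illustrative values only; x[0] = 1 is assumed to be the bias input.
W = [
    [0.0, 1.0, -1.0],
    [0.5, -0.5, 0.0],
    [0.0, 0.0, 2.0],
    [-1.0, 1.0, 0.0],
]
x = [1.0, 2.0, 0.5, 1.0]
a = layer_forward(W, x)  # one activation per neuron, so len(a) == 3
```

Stacking two such calls, with the output of the first fed into the second, gives the Input -> Hidden -> Output structure described above.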

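Examples: AND, OR, NOT
[Figure not available]
XNOR outputs 1 when its two inputs are the same and 0 when they differ. A single-layer perceptron cannot implement it, but a multi-layer network can: since $A\odot B = AB + A'B'$, it can be built from 3 logic units.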
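
The three-unit construction can be sketched as follows. The weights below are a commonly used textbook choice and are an assumption here, since the original figure is unavailable; any weights that push the sigmoid far into saturation would work equally well.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unit(bias, w1, w2, a, b):
    """One logistic unit with two inputs and a bias."""
    return sigmoid(bias + w1 * a + w2 * b)

def xnor(a, b):
    """XNOR built from three logistic units: A XNOR B = AB + A'B'."""
    and_ab = unit(-30, 20, 20, a, b)          # hidden unit 1: A AND B
    nor_ab = unit(10, -20, -20, a, b)         # hidden unit 2: (NOT A) AND (NOT B)
    return unit(-10, 20, 20, and_ab, nor_ab)  # output unit: OR of the two

# Rounding the sigmoid output recovers the Boolean truth table.
table = {(a, b): round(xnor(a, b)) for a in (0, 1) for b in (0, 1)}
```

The hidden layer computes the two product terms $AB$ and $A'B'$, and the output unit ORs them, which is why one layer alone is not enough.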

For classification:
A final softmax layer normalizes the outputs into probabilities.
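
As a rough sketch of what that final layer computes, here is softmax in plain Python. Subtracting the maximum before exponentiating is a standard stability trick added here, not something the original text mentions; it does not change the result.

```python
import math

def softmax(z):
    """Map raw scores z to probabilities that sum to 1."""
    m = max(z)                           # shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative scores for a 3-class problem.
probs = softmax([2.0, 1.0, 0.1])
```

The predicted class is the index of the largest probability, which is also the index of the largest raw score.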

4. Back Propagation

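Backpropagation is used to compute the gradients that gradient descent needs in a NN.

Chain rule for multivariable functions (source: Baidu):
[Figure not available]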
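
For reference, the multivariable chain rule in its standard two-variable form is written out here; the original figure may have used different notation. Assuming $z=f(x,y)$ with $x=x(t)$ and $y=y(t)$:

$$\frac{dz}{dt}=\frac{\partial z}{\partial x}\frac{dx}{dt}+\frac{\partial z}{\partial y}\frac{dy}{dt}$$

In a network, the same rule says that the gradient of the loss with respect to a weight is the sum, over every path from that weight to the loss, of the products of the local derivatives along the path.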

4.1 Softmax

nn.Softmax(dim: Union[int, NoneType] = None) -> None
Docstring:
Applies the Softmax function to an n-dimensional input Tensor
rescaling them so that the elements of the n-dimensional output Tensor
lie in the range [0,1] and sum to 1.

Softmax is defined as:

$\text{Softmax}(x_{i}) = \frac{e^{x_i}}{\sum_j e^{x_j}}$

When the input Tensor is a sparse tensor then the unspecified
values are treated as -inf.

Shape:
- Input: (*) where * means, any number of additional
dimensions
- Output: (*), same shape as the input

Returns:
a Tensor of the same dimension and shape as the input with
values in the range [0, 1]

Arguments:
dim (int): A dimension along which Softmax will be computed (so every slice
along dim will sum to 1).

.. note::
This module doesn’t work directly with NLLLoss,
which expects the Log to be computed between the Softmax and itself.
Use LogSoftmax instead (it’s faster and has better numerical properties).
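
The note about numerical properties can be illustrated without PyTorch. The sketch below is a plain-Python stand-in, not the library's implementation: `log_softmax` and `nll_loss` are hypothetical helpers that mirror the idea of computing the log directly via the log-sum-exp trick.

```python
import math

def log_softmax(z):
    """log(softmax(z)) computed directly, without forming softmax first."""
    m = max(z)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - log_sum_exp for v in z]

def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class, given log-probabilities."""
    return -log_probs[target]

# Extreme scores: softmax would give class 1 a probability that underflows
# to 0, so taking log(softmax(z)[1]) naively would fail.
logits = [1000.0, 0.0, -1000.0]
loss = nll_loss(log_softmax(logits), target=1)
```

Working in log space keeps the loss finite here, whereas applying `math.log` to the underflowed softmax output would raise a domain error.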

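Derivative of Softmax + CrossEntropy:

Only the result is stated here. Suppose a layer outputs $z_1,z_2,z_3$, which softmax turns into probabilities $P_1,P_2,P_3$, and the loss is the cross-entropy
$-\sum_{i=1}^m{y_i\log(P_i)}$
denoted Loss (here $m=3$). Let the label be one-hot, with the $k$-th entry equal to 1 and all others 0,
i.e. $y_k=1,\ y_i=0\ (i\neq k)$. Then:
when $i = k$, $\frac{\partial{Loss}}{\partial{z_i}}=P_i-1$,
when $i\neq k$, $\frac{\partial{Loss}}{\partial{z_i}}=P_i$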
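
The stated result can be checked numerically. This sketch assumes the loss is the negative log-likelihood $-\log P_k$ for a one-hot label whose $k$-th entry is 1, and compares the closed-form gradient against central finite differences; the scores and the step size `eps` are arbitrary illustrative choices.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(z, k):
    """Loss = -log(P_k) for a one-hot label with a 1 in position k."""
    return -math.log(softmax(z)[k])

def analytic_grad(z, k):
    """Closed form: P_i - 1 when i == k, P_i otherwise."""
    p = softmax(z)
    return [p_i - (1.0 if i == k else 0.0) for i, p_i in enumerate(p)]

def numeric_grad(z, k, eps=1e-6):
    """Central finite differences of the loss with respect to each z_i."""
    grad = []
    for i in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        grad.append((cross_entropy(z_plus, k) - cross_entropy(z_minus, k)) / (2 * eps))
    return grad

z, k = [0.3, -1.2, 2.0], 2
ga = analytic_grad(z, k)
gn = numeric_grad(z, k)
max_diff = max(abs(a - n) for a, n in zip(ga, gn))
```

If the two gradients agree to within the finite-difference error, the closed form is consistent with the loss as defined in this sketch.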

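Derivation: first differentiate the CE loss with respect to each softmax output, then differentiate softmax case by case (same index vs. different index) and split the result into terms, and finally multiply the pieces together via the chain rule.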
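
Those steps can be written out as follows, as a sketch that assumes the loss carries a leading minus sign, $Loss=-\sum_j y_j\log(P_j)$, and that $y$ is a one-hot label with $y_k=1$ (so $\sum_j y_j=1$). The index $j$ and the Kronecker delta $\delta_{ij}$ are notation introduced here.

$$\frac{\partial Loss}{\partial P_j}=-\frac{y_j}{P_j},\qquad \frac{\partial P_j}{\partial z_i}=P_j(\delta_{ij}-P_i)$$

$$\frac{\partial Loss}{\partial z_i}=\sum_j\frac{\partial Loss}{\partial P_j}\frac{\partial P_j}{\partial z_i}=-\sum_j y_j(\delta_{ij}-P_i)=-y_i+P_i\sum_j y_j=P_i-y_i$$

Setting $y_i=1$ for $i=k$ and $y_i=0$ otherwise gives $P_i-1$ and $P_i$ respectively.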
