Neural Network

1 Model

Forward Propagation Model

Suppose there are $n$ input neurons (features), $K$ output neurons (classes), and $L$ layers of neurons (excluding the input layer), with $s_{l}$ neurons in layer $l$. Let the parameter matrix from layer $l-1$ to layer $l$ be $W(l) \in \mathbb{R}^{s_{l-1} \times s_{l}}$ and the bias be $b(l) \in \mathbb{R}^{1 \times s_{l}}$; write $\Theta = (W, b)$.
$$
\begin{aligned}
\text{Layer } 0:\ & a(0) = x \\
\text{Layer } 1:\ & z(1) = a^{T}(0)\,W(1) + b(1), \quad a(1) = g_{1}\left(z(1)\right) \\
& \ \ \vdots \\
\text{Layer } l:\ & z(l) = a^{T}(l-1)\,W(l) + b(l), \quad a(l) = g_{l}\left(z(l)\right) \\
& \ \ \vdots \\
\text{Layer } L:\ & z(L) = a^{T}(L-1)\,W(L) + b(L), \quad \hat{y} = a(L) = g_{L}\left(z(L)\right)
\end{aligned}
$$
where
$$
\begin{aligned}
x &= \begin{bmatrix} x^{(1)} \\ \vdots \\ x^{(n)} \end{bmatrix}, \quad
a(l) = \begin{bmatrix} a^{(1)}(l) \\ \vdots \\ a^{(s_{l})}(l) \end{bmatrix} \\
W(l) &= \begin{bmatrix} w_{11}(l) & \cdots & w_{1,s_{l}}(l) \\ \vdots & & \vdots \\ w_{s_{l-1},1}(l) & \cdots & w_{s_{l-1},s_{l}}(l) \end{bmatrix}, \quad l = 1, \cdots, L \\
b(l) &= \begin{bmatrix} b_{1}(l) & \cdots & b_{s_{l}}(l) \end{bmatrix}, \quad l = 1, \cdots, L
\end{aligned}
$$
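A minimal NumPy sketch of this forward pass; the names `forward`, `Ws`, and `bs` are illustrative, and the choice of ReLU hidden layers with a sigmoid output follows the discussion below:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Ws, bs):
    """Forward propagation: x is a (1, n) row vector, so that
    a^T(l-1) W(l) + b(l) becomes a plain matrix product.

    Ws[l-1] has shape (s_{l-1}, s_l); bs[l-1] has shape (1, s_l).
    """
    a = x                                    # a(0) = x
    L = len(Ws)
    for l in range(L):
        z = a @ Ws[l] + bs[l]                # z(l) = a^T(l-1) W(l) + b(l)
        g = sigmoid if l == L - 1 else relu  # hidden layers: ReLU; output: sigmoid
        a = g(z)                             # a(l) = g_l(z(l))
    return a                                 # y_hat = a(L)
```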
$g$ is called the activation function. Common activation functions include ReLU, tanh, sigmoid, and softmax; they are listed below, with a NumPy sketch after the list. Typically the hidden layers use $g(z) = \mathrm{ReLU}(z)$; for binary classification the output layer uses $g(z) = \mathrm{sigmoid}(z)$, and for multi-class classification the output layer uses $g(z) = \mathrm{softmax}(z)$.

  • ReLU

$$g(z) = \max\{0, z\}, \quad g'(z) = \begin{cases} 0, & z \leq 0 \\ 1, & z > 0 \end{cases}$$

  • Leaky ReLU

$$g(z) = \max\{0.01z,\, z\}, \quad g'(z) = \begin{cases} 0.01, & z \leq 0 \\ 1, & z > 0 \end{cases}$$

  • tanh

$$g(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}, \quad g'(z) = 1 - g^{2}(z)$$

  • sigmoid (binary classification; $z$ may be a scalar)

$$g(z) = \frac{1}{1 + e^{-z}}, \quad g'(z) = g(z)\left(1 - g(z)\right)$$
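A small sketch of these activations and their derivatives in NumPy, with softmax included for the multi-class output layer mentioned above (its formula is not derived here); the finite-difference check at the end is purely illustrative:

```python
import numpy as np

# Activations and derivatives, matching the formulas above.
def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    return np.where(z > 0, 1.0, 0.0)

def leaky_relu(z):
    return np.maximum(0.01 * z, z)

def leaky_relu_grad(z):
    return np.where(z > 0, 1.0, 0.01)

def tanh(z):
    return np.tanh(z)

def tanh_grad(z):
    return 1.0 - np.tanh(z) ** 2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def softmax(z):
    # Multi-class output layer; subtracting max(z) improves numerical stability.
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

# Finite-difference sanity check of each g'(z) at a sample point.
z0, h = 0.7, 1e-6
for g, dg in [(relu, relu_grad), (leaky_relu, leaky_relu_grad),
              (tanh, tanh_grad), (sigmoid, sigmoid_grad)]:
    numeric = (g(z0 + h) - g(z0 - h)) / (2 * h)
    assert abs(numeric - dg(z0)) < 1e-4
```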
