Shallow Neural Network Week 3

Single Sample

Symbols

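$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_{n_x} \end{pmatrix}, \quad Y = \begin{pmatrix} y_1 \\ \vdots \\ y_{n_y} \end{pmatrix},$$
$$Z^{[l]} = \begin{pmatrix} z_1^{[l]} \\ \vdots \\ z_{n_l}^{[l]} \end{pmatrix}, \quad 1 \le l \le L$$
$$A^{[l]} = \begin{pmatrix} a_1^{[l]} \\ \vdots \\ a_{n_l}^{[l]} \end{pmatrix}, \quad \tilde{A}^{[l]} = \begin{pmatrix} a_0^{[l]} \\ a_1^{[l]} \\ \vdots \\ a_{n_l}^{[l]} \end{pmatrix} = \begin{pmatrix} 1 \\ A^{[l]} \end{pmatrix}, \quad 0 \le l \le L$$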
$$W^{[l]} = \left( w_{ij}^{[l]} \right)_{n_l \times n_{l-1}}, \quad w'^{[l]} = \begin{pmatrix} w_{1,0}^{[l]} \\ \vdots \\ w_{n_l,0}^{[l]} \end{pmatrix}, \quad \tilde{W}^{[l]} = \begin{pmatrix} w'^{[l]} & W^{[l]} \end{pmatrix}, \quad 1 \le l \le L$$

Neural Network Architecture

$$X = A^{[0]} \to Z^{[1]} \to A^{[1]} \to \cdots \to Z^{[L]} \to A^{[L]} = \hat{Y}$$

Loss Function

$$z_i^{[l]} = \sum_{j=0}^{n_{l-1}} w_{ij}^{[l]} \, \tilde{a}_j^{[l-1]}, \quad 1 \le i \le n_l, \ 1 \le l \le L$$
$$Z^{[l]} = \tilde{W}^{[l]} \tilde{A}^{[l-1]}, \quad 1 \le l \le L$$
$$a_i^{[l]} = g\left(z_i^{[l]}\right), \quad 1 \le i \le n_l, \ 1 \le l \le L$$
$$A^{[l]} = g\left(Z^{[l]}\right), \quad 1 \le l \le L$$
$$\operatorname{loss}(X, Y) = -\sum_{i=1}^{n_y} \left[ y_i \ln \hat{y}_i + (1 - y_i) \ln (1 - \hat{y}_i) \right]$$
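The forward recurrences and the loss above can be sketched in NumPy as follows. This is a minimal illustration, not the course's reference code: it assumes `g` is the sigmoid at every layer, and the names `forward`, `loss`, and `W_tilde` (a list of augmented weight matrices whose column 0 holds the biases) are chosen here for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W_tilde):
    """Forward pass for a single sample.

    x        : array of shape (n_0,)
    W_tilde  : list of arrays; W_tilde[l-1] has shape (n_l, n_{l-1} + 1),
               with column 0 holding the bias weights w_{i,0}.
    Assumption: sigmoid activation at every layer.
    """
    a = x
    for W in W_tilde:
        a_tilde = np.concatenate(([1.0], a))  # prepend a_0 = 1
        z = W @ a_tilde                       # Z = W~ A~
        a = sigmoid(z)                        # A = g(Z)
    return a

def loss(y_hat, y):
    """Cross-entropy loss summed over the output units."""
    return -np.sum(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

# Hypothetical layer sizes n_0 = 3, n_1 = 4, n_2 = 2
rng = np.random.default_rng(0)
sizes = [3, 4, 2]
W_tilde = [rng.normal(size=(sizes[l], sizes[l - 1] + 1))
           for l in range(1, len(sizes))]
x = rng.normal(size=3)
y = np.array([1.0, 0.0])

y_hat = forward(x, W_tilde)
print(y_hat, loss(y_hat, y))
```

Folding the bias into column 0 of each augmented matrix keeps every layer a single matrix-vector product, which mirrors the sum starting at j = 0.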

Formulas

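With the sigmoid activation, $g'(z) = g(z)\,(1 - g(z))$, so $g'(z_i^{[L]}) = \hat{y}_i (1 - \hat{y}_i)$:

$$\begin{aligned}
\frac{\partial}{\partial z_i^{[L]}} \operatorname{loss}(X, Y)
&= \frac{\mathrm{d} \hat{y}_i}{\mathrm{d} z_i^{[L]}} \cdot \frac{\partial}{\partial \hat{y}_i} \operatorname{loss}(X, Y) \\
&= -g'\left(z_i^{[L]}\right) \left[ y_i \cdot \frac{1}{\hat{y}_i} - (1 - y_i) \cdot \frac{1}{1 - \hat{y}_i} \right] \\
&= -\hat{y}_i (1 - \hat{y}_i) \left[ y_i \cdot \frac{1}{\hat{y}_i} - (1 - y_i) \cdot \frac{1}{1 - \hat{y}_i} \right] \\
&= (1 - y_i)\, \hat{y}_i - y_i (1 - \hat{y}_i) \\
&= \hat{y}_i - y_i, \quad 1 \le i \le n_L
\end{aligned}$$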
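The result of this derivation can be checked numerically. The sketch below assumes a sigmoid output layer and compares the closed form (output minus target) against a central finite difference of the loss with respect to each output pre-activation; the values of `z` and `y` are arbitrary test data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_from_z(z, y):
    """Cross-entropy loss as a function of the output pre-activations z."""
    y_hat = sigmoid(z)
    return -np.sum(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

z = np.array([0.3, -1.2, 2.0])   # arbitrary pre-activations of the output layer
y = np.array([1.0, 0.0, 1.0])    # arbitrary targets

analytic = sigmoid(z) - y        # closed form: y_hat - y

eps = 1e-6
numeric = np.zeros_like(z)
for i in range(z.size):
    step = np.zeros_like(z)
    step[i] = eps
    # central difference approximation of d loss / d z_i
    numeric[i] = (loss_from_z(z + step, y) - loss_from_z(z - step, y)) / (2 * eps)

max_abs_diff = np.max(np.abs(analytic - numeric))
print(max_abs_diff)
```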

$$\begin{aligned}
\frac{\partial}{\partial z_j^{[l]}} \operatorname{loss}(X, Y)
&= \sum_{i=1}^{n_{l+1}} \frac{\partial z_i^{[l+1]}}{\partial z_j^{[l]}} \cdot \frac{\partial}{\partial z_i^{[l+1]}} \operatorname{loss}(X, Y) \\
&= \sum_{i=1}^{n_{l+1}} g'\left(z_j^{[l]}\right) w_{ij}^{[l+1]} \cdot \frac{\partial}{\partial z_i^{[l+1]}} \operatorname{loss}(X, Y) \\
&= g'\left(z_j^{[l]}\right) \sum_{i=1}^{n_{l+1}} w_{ij}^{[l+1]} \cdot \frac{\partial}{\partial z_i^{[l+1]}} \operatorname{loss}(X, Y), \quad 1 \le j \le n_l, \ 1 \le l < L
\end{aligned}$$
Therefore,
$$\frac{\partial}{\partial Z^{[l]}} \operatorname{loss}(X, Y) = \begin{cases} A^{[L]} - Y, & l = L \\ g'\left(Z^{[l]}\right) \mathbin{.*} \left( \left(W^{[l+1]}\right)^{\intercal} \dfrac{\partial}{\partial Z^{[l+1]}} \operatorname{loss}(X, Y) \right), & 1 \le l < L \end{cases}$$
where $.*$ denotes the element-wise product.

$$\frac{\partial}{\partial w_{ij}^{[l]}} \operatorname{loss}(X, Y) = \frac{\partial}{\partial z_i^{[l]}} \operatorname{loss}(X, Y) \cdot \tilde{a}_j^{[l-1]}, \quad 1 \le i \le n_l, \ 0 \le j \le n_{l-1}, \ 1 \le l \le L$$
Therefore,
$$\frac{\partial}{\partial \tilde{W}^{[l]}} \operatorname{loss}(X, Y) = \frac{\partial}{\partial Z^{[l]}} \operatorname{loss}(X, Y) \cdot \left(\tilde{A}^{[l-1]}\right)^{\intercal}, \quad 1 \le l \le L$$
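The single-sample recurrences above can be put together into one backward pass. The following is a sketch under stated assumptions: sigmoid activations at every layer, and a hypothetical function name `backprop_single` that takes the augmented weight matrices as a Python list (column 0 of each matrix holds the biases).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_single(x, y, W_tilde):
    """Return (loss, grads) for one sample.

    W_tilde[k] is the augmented matrix of layer k + 1, shape (n_{k+1}, n_k + 1).
    grads[k] has the same shape as W_tilde[k].
    Assumption: sigmoid activation at every layer.
    """
    # Forward pass, caching A~^{[l-1]} and A^{[l]} for each layer.
    a = x
    a_tildes, activations = [], []
    for W in W_tilde:
        a_tilde = np.concatenate(([1.0], a))
        a = sigmoid(W @ a_tilde)
        a_tildes.append(a_tilde)
        activations.append(a)

    y_hat = activations[-1]
    loss = -np.sum(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

    # Backward pass.
    grads = [None] * len(W_tilde)
    dz = y_hat - y                              # output layer: A^{[L]} - Y
    for k in range(len(W_tilde) - 1, -1, -1):
        grads[k] = np.outer(dz, a_tildes[k])    # dZ times (A~ of previous layer)^T
        if k > 0:
            W_next = W_tilde[k][:, 1:]          # drop the bias column
            a_prev = activations[k - 1]
            # g'(Z) .* (W^T dZ), with g'(Z) = A .* (1 - A) for sigmoid
            dz = a_prev * (1.0 - a_prev) * (W_next.T @ dz)
    return loss, grads

rng = np.random.default_rng(0)
W_tilde = [rng.normal(size=(4, 4)), rng.normal(size=(2, 5))]  # n = 3, 4, 2
x = rng.normal(size=3)
y = np.array([1.0, 0.0])
loss_value, grads = backprop_single(x, y, W_tilde)
print(loss_value, [g.shape for g in grads])
```

Note that the bias column is dropped when propagating the error backwards, because the constant input 1 does not depend on the previous layer.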

Multiple Samples

Symbols

$$X = \left( X^{(1)}, \cdots, X^{(m)} \right),$$
$$Y = \left( Y^{(1)}, \cdots, Y^{(m)} \right),$$
$$Z^{[l]} = \left( Z^{[l](1)}, \cdots, Z^{[l](m)} \right), \quad 1 \le l \le L$$
$$A^{[l]} = \left( A^{[l](1)}, \cdots, A^{[l](m)} \right), \quad 0 \le l \le L$$
$$\tilde{A}^{[l]} = \left( \tilde{A}^{[l](1)}, \cdots, \tilde{A}^{[l](m)} \right), \quad 0 \le l \le L$$
$$\partial Z^{[l]} = \left( \frac{\partial}{\partial Z^{[l]}} \operatorname{loss}\left(X^{(1)}, Y^{(1)}\right), \cdots, \frac{\partial}{\partial Z^{[l]}} \operatorname{loss}\left(X^{(m)}, Y^{(m)}\right) \right)_{n_l \times m}, \quad 1 \le l \le L$$

Cost Function

$$\operatorname{cost}(X, Y) = \frac{1}{m} \sum_{i=1}^{m} \operatorname{loss}\left(X^{(i)}, Y^{(i)}\right)$$
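As a small illustration of this definition, the cost can be computed from a matrix of predictions whose columns are samples. The function name `cost` and the arrays `Y_hat` and `Y` below are assumptions made for the example; predictions are taken as given rather than computed by a network.

```python
import numpy as np

def cost(Y_hat, Y):
    """Mean over samples (columns) of the per-sample cross-entropy loss."""
    m = Y.shape[1]
    # Sum over output units (rows) gives one loss value per sample.
    per_sample = -np.sum(Y * np.log(Y_hat) + (1.0 - Y) * np.log(1.0 - Y_hat), axis=0)
    return per_sample.sum() / m

# Three samples, one output unit, all predictions equal to 0.5
Y = np.array([[1.0, 0.0, 1.0]])
Y_hat = np.array([[0.5, 0.5, 0.5]])
c = cost(Y_hat, Y)
print(c)
```

With every prediction at 0.5, each sample contributes a loss of ln 2, so the mean is also ln 2.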

Formulas

$$Z^{[l]} = \tilde{W}^{[l]} \tilde{A}^{[l-1]}, \quad 1 \le l \le L$$
$$A^{[l]} = g\left(Z^{[l]}\right), \quad 1 \le l \le L$$
$$g'\left(Z^{[l]}\right) = A^{[l]} \mathbin{.*} \left( 1_{n_l \times m} - A^{[l]} \right), \quad 1 \le l \le L$$

$$\partial Z^{[l]} = \begin{cases} A^{[L]} - Y, & l = L \\ g'\left(Z^{[l]}\right) \mathbin{.*} \left( \left(W^{[l+1]}\right)^{\intercal} \cdot \partial Z^{[l+1]} \right), & 1 \le l < L \end{cases}$$
$$\frac{\partial}{\partial \tilde{W}^{[l]}} \operatorname{cost}(X, Y) = \frac{1}{m} \, \partial Z^{[l]} \cdot \left(\tilde{A}^{[l-1]}\right)^{\intercal}, \quad 1 \le l \le L$$
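The vectorized formulas above translate almost line by line into NumPy. The sketch below assumes sigmoid activations at every layer and stores samples as columns; the names `cost_and_grads` and `add_ones_row` are illustrative and not part of any library.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_ones_row(A):
    """Build A~ by stacking a row of ones (a_0 = 1) on top of A."""
    return np.vstack([np.ones((1, A.shape[1])), A])

def cost_and_grads(X, Y, W_tilde):
    """Vectorized forward and backward pass over m samples.

    X: shape (n_0, m), Y: shape (n_L, m).
    W_tilde[k]: augmented weights of layer k + 1, shape (n_{k+1}, n_k + 1).
    Returns the cost and a list of gradients of the cost, one per W_tilde[k].
    Assumption: sigmoid activation at every layer.
    """
    m = X.shape[1]
    A = X
    A_tildes, As = [], []
    for W in W_tilde:
        A_t = add_ones_row(A)
        A = sigmoid(W @ A_t)          # Z = W~ A~, A = g(Z)
        A_tildes.append(A_t)
        As.append(A)

    cost = -np.sum(Y * np.log(A) + (1.0 - Y) * np.log(1.0 - A)) / m

    grads = [None] * len(W_tilde)
    dZ = A - Y                                        # layer L
    for k in range(len(W_tilde) - 1, -1, -1):
        grads[k] = dZ @ A_tildes[k].T / m             # (1/m) dZ A~^T
        if k > 0:
            W_next = W_tilde[k][:, 1:]                # weights without the bias column
            dZ = As[k - 1] * (1.0 - As[k - 1]) * (W_next.T @ dZ)
    return cost, grads

rng = np.random.default_rng(1)
W_tilde = [rng.normal(size=(4, 4)), rng.normal(size=(1, 5))]  # n = 3, 4, 1
X = rng.normal(size=(3, 6))                                   # m = 6 samples
Y = rng.integers(0, 2, size=(1, 6)).astype(float)
c, grads = cost_and_grads(X, Y, W_tilde)
print(c, [g.shape for g in grads])
```

A gradient-descent step would then subtract a learning rate times each entry of `grads` from the matching matrix in `W_tilde`; the learning rate itself is not specified in these notes.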
