【Course1】3 One hidden layer Neural Network

Neural Network Representation

Computing a Neural Network’s Output

Neuron $i$ of layer $l$ (single example):

  • Parameters: $w_i^{[l]}=\begin{bmatrix}w_1^{[l]} \\ w_2^{[l]} \\ \vdots \\ w_{n^{[l-1]}}^{[l]}\end{bmatrix},\ b_i^{[l]}$
  • Input: $a^{[l-1]}$, shape $(n^{[l-1]}, 1)$
  • Two computation steps:
    1. $z_i^{[l]} = w_i^{[l]T}a^{[l-1]}+b_i^{[l]}$
    2. $a_i^{[l]} = \sigma(z_i^{[l]})$
  • Output: $a_i^{[l]}$, a scalar (a minimal NumPy sketch of these two steps follows below)
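A minimal NumPy sketch of this two-step, single-neuron computation (the sizes and variable names here are illustrative assumptions, not course-provided code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_prev = 4                               # n^[l-1], size of the previous layer (assumed)
a_prev = np.random.randn(n_prev, 1)      # a^[l-1], shape (n^[l-1], 1)
w_i = np.random.randn(n_prev, 1)         # w_i^[l], this neuron's weight column vector
b_i = 0.1                                # b_i^[l], a scalar

z_i = (w_i.T @ a_prev).item() + b_i      # step 1: z_i^[l] = w_i^[l]T a^[l-1] + b_i^[l]
a_i = sigmoid(z_i)                       # step 2: a_i^[l] = sigma(z_i^[l]), a scalar
```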

Layer $l$ (single example)

Non-vectorized

$$\begin{aligned}
z_1^{[l]} &= w_1^{[l]T}a^{[l-1]}+b_1^{[l]}, & a_1^{[l]} &= \sigma(z_1^{[l]})\\
z_2^{[l]} &= w_2^{[l]T}a^{[l-1]}+b_2^{[l]}, & a_2^{[l]} &= \sigma(z_2^{[l]})\\
&\ \vdots\\
z_{n^{[l]}}^{[l]} &= w_{n^{[l]}}^{[l]T}a^{[l-1]}+b_{n^{[l]}}^{[l]}, & a_{n^{[l]}}^{[l]} &= \sigma(z_{n^{[l]}}^{[l]})
\end{aligned}$$

Each subscript picks out a different neuron in the layer: these equations repeat the same computation for every neuron, with $i$ running from $1$ to $n^{[l]}$, i.e., from the first to the $n^{[l]}$-th neuron of the layer. A non-vectorized sketch of this per-neuron loop is given below.
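A non-vectorized sketch of that per-neuron loop, assuming the layer's weights are kept as a list of column vectors `w_list` and its biases as a list of scalars `b_list` (both names are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward_loop(w_list, b_list, a_prev):
    """Compute layer l neuron by neuron for a single example.

    w_list: n^[l] column vectors, each of shape (n^[l-1], 1)
    b_list: n^[l] scalars
    a_prev: a^[l-1], shape (n^[l-1], 1)
    """
    n_l = len(w_list)
    a = np.zeros((n_l, 1))
    for i in range(n_l):                                  # i runs over the n^[l] neurons
        z_i = (w_list[i].T @ a_prev).item() + b_list[i]   # z_i^[l]
        a[i, 0] = sigmoid(z_i)                            # a_i^[l]
    return a
```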


Vectorized

The goal of this vectorization step is to let all neurons of a layer compute simultaneously, i.e., to merge the $n^{[l]}$ equations above into a single one. This is vectorization across the "layer" dimension.

Vectorization method: stack the layer-related quantities $w$ and $b$ row by row (stack by row), so that each $w_i^{[l]T}$ becomes one row of $W^{[l]}$ and each $b_i^{[l]}$ one entry of the column vector $b^{[l]}$ (a small sketch follows the equation below):
$$W^{[l]} = \begin{bmatrix} --w_1^{[l]T}--\\ --w_2^{[l]T}--\\ \vdots \\ --w_{n^{[l]}}^{[l]T}-- \end{bmatrix},\quad b^{[l]} = \begin{bmatrix} b_1^{[l]} \\ b_2^{[l]} \\ \vdots \\ b_{n^{[l]}}^{[l]} \end{bmatrix}$$
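In code, this row-by-row stacking amounts to placing each $w_i^{[l]T}$ as a row of $W^{[l]}$ and each $b_i^{[l]}$ as an entry of the column vector $b^{[l]}$; a small sketch under the same assumed `w_list`/`b_list` layout as above:

```python
import numpy as np

def stack_layer_params(w_list, b_list):
    """Stack per-neuron parameters into W^[l] (one row per neuron) and b^[l] (a column vector)."""
    W = np.concatenate([w.T for w in w_list], axis=0)    # shape (n^[l], n^[l-1]), row i is w_i^[l]T
    b = np.array(b_list, dtype=float).reshape(-1, 1)     # shape (n^[l], 1)
    return W, b
```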

Notation

Sample-related quantities $x$, $z$, $a$ are stacked column by column (stack by column), one column per training example (a shape check follows the equation below):
$$\begin{aligned}
X = A^{[0]} &= \begin{bmatrix} | & | & & | \\ x^{(1)} & x^{(2)} & \dots & x^{(m)} \\ | & | & & | \end{bmatrix}\\
Z^{[l]} &= \begin{bmatrix} | & | & & | \\ z^{[l](1)} & z^{[l](2)} & \dots & z^{[l](m)}\\ | & | & & | \end{bmatrix}\\
A^{[l]} &= \begin{bmatrix} | & | & & | \\ a^{[l](1)} & a^{[l](2)} & \dots & a^{[l](m)} \\ | & | & & | \end{bmatrix}
\end{aligned}$$
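A quick shape check of this column-wise stacking (the input size and number of examples are assumptions):

```python
import numpy as np

n_x, m = 3, 5                                             # assumed input size and number of examples
examples = [np.random.randn(n_x, 1) for _ in range(m)]    # x^(1), ..., x^(m), each of shape (n_x, 1)

X = np.concatenate(examples, axis=1)                      # column j is x^(j)
A0 = X
print(X.shape)                                            # (3, 5); Z^[l] and A^[l] will have shape (n^[l], m)
```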

Forward propagation for layer $l$

$$\begin{aligned}
Z^{[l]} &= W^{[l]}A^{[l-1]}+b^{[l]}\\
&= \begin{bmatrix} --w_1^{[l]T}--\\ --w_2^{[l]T}--\\ \vdots \\ --w_{n^{[l]}}^{[l]T}-- \end{bmatrix}
\begin{bmatrix} | & | & & | \\ a^{[l-1](1)} & a^{[l-1](2)} & \dots & a^{[l-1](m)} \\ | & | & & | \end{bmatrix}
+\begin{bmatrix} b_1^{[l]} \\ b_2^{[l]} \\ \vdots \\ b_{n^{[l]}}^{[l]} \end{bmatrix}\\
&= \begin{bmatrix} w_1^{[l]T}a^{[l-1](1)} & \dots & w_1^{[l]T}a^{[l-1](m)} \\ w_2^{[l]T}a^{[l-1](1)} & \dots & w_2^{[l]T}a^{[l-1](m)}\\ \vdots & & \vdots\\ w_{n^{[l]}}^{[l]T}a^{[l-1](1)} & \dots & w_{n^{[l]}}^{[l]T}a^{[l-1](m)} \end{bmatrix}
+\begin{bmatrix} b_1^{[l]} & \dots & b_1^{[l]}\\ b_2^{[l]} & \dots & b_2^{[l]}\\ \vdots & & \vdots\\ b_{n^{[l]}}^{[l]} & \dots & b_{n^{[l]}}^{[l]} \end{bmatrix}\\
&= \begin{bmatrix} w_1^{[l]T}a^{[l-1](1)}+b_1^{[l]} & \dots & w_1^{[l]T}a^{[l-1](m)}+b_1^{[l]}\\ w_2^{[l]T}a^{[l-1](1)}+b_2^{[l]} & \dots & w_2^{[l]T}a^{[l-1](m)}+b_2^{[l]}\\ \vdots & & \vdots\\ w_{n^{[l]}}^{[l]T}a^{[l-1](1)}+b_{n^{[l]}}^{[l]} & \dots & w_{n^{[l]}}^{[l]T}a^{[l-1](m)}+b_{n^{[l]}}^{[l]} \end{bmatrix}\\
&= \begin{bmatrix} z_1^{[l](1)} & \dots & z_1^{[l](m)}\\ z_2^{[l](1)} & \dots & z_2^{[l](m)}\\ \vdots & & \vdots\\ z_{n^{[l]}}^{[l](1)} & \dots & z_{n^{[l]}}^{[l](m)} \end{bmatrix}\\
&= \begin{bmatrix} | & | & & | \\ z^{[l](1)} & z^{[l](2)} & \dots & z^{[l](m)}\\ | & | & & | \end{bmatrix}
\end{aligned}$$
$$\begin{aligned}
A^{[l]} &= \sigma(Z^{[l]})\\
&= \sigma\left(\begin{bmatrix} | & | & & | \\ z^{[l](1)} & z^{[l](2)} & \dots & z^{[l](m)}\\ | & | & & | \end{bmatrix}\right)\\
&= \begin{bmatrix} | & | & & | \\ \sigma(z^{[l](1)}) & \sigma(z^{[l](2)}) & \dots & \sigma(z^{[l](m)})\\ | & | & & | \end{bmatrix}\\
&= \begin{bmatrix} | & | & & | \\ a^{[l](1)} & a^{[l](2)} & \dots & a^{[l](m)} \\ | & | & & | \end{bmatrix}
\end{aligned}$$
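A vectorized sketch of this per-layer step; note that adding $b^{[l]}$ of shape $(n^{[l]}, 1)$ to $W^{[l]}A^{[l-1]}$ of shape $(n^{[l]}, m)$ relies on NumPy broadcasting, which replicates $b^{[l]}$ across the $m$ columns exactly as in the expanded matrix above (the function name is an assumption):

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def layer_forward(W, b, A_prev):
    """Vectorized forward step of layer l over all m examples.

    W: (n^[l], n^[l-1]), b: (n^[l], 1), A_prev: (n^[l-1], m)
    Returns Z and A, both of shape (n^[l], m).
    """
    Z = W @ A_prev + b        # broadcasting copies b^[l] across the m columns
    A = sigmoid(Z)            # element-wise activation
    return Z, A
```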

Forward propagation for the whole network

$$\begin{aligned}
A^{[0]} = X &= \begin{bmatrix} | & | & & | \\ x^{(1)} & x^{(2)} & \dots & x^{(m)} \\ | & | & & | \end{bmatrix} \\
Z^{[1]} &= W^{[1]}A^{[0]}+b^{[1]}, & A^{[1]} &= \sigma(Z^{[1]}) \\
Z^{[2]} &= W^{[2]}A^{[1]}+b^{[2]}, & A^{[2]} &= \sigma(Z^{[2]}) \\
&\ \vdots \\
Z^{[l]} &= W^{[l]}A^{[l-1]}+b^{[l]}, & A^{[l]} &= \sigma(Z^{[l]})
\end{aligned}$$
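Chaining the per-layer step over all layers gives the whole forward pass; a minimal sketch assuming the parameters are stored in lists `Ws` and `bs` (hypothetical names) and the sigmoid activation is used in every layer, as in the equations above:

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward_propagation(X, Ws, bs):
    """Full forward pass over all layers.

    X:  (n^[0], m) input matrix, one example per column
    Ws: [W^[1], ..., W^[L]], W^[l] of shape (n^[l], n^[l-1])
    bs: [b^[1], ..., b^[L]], b^[l] of shape (n^[l], 1)
    Returns A^[L] of shape (n^[L], m).
    """
    A = X                          # A^[0] = X
    for W, b in zip(Ws, bs):       # layers l = 1, ..., L
        Z = W @ A + b              # Z^[l] = W^[l] A^[l-1] + b^[l]
        A = sigmoid(Z)             # A^[l] = sigma(Z^[l])
    return A
```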
