VGG Convolutional Neural Networks Practical (5): The Theory of Back-propagation

feiyy404

Categories: matconvnet, matlab, CNN. Tags: cnn

Article link: https://blog.csdn.net/Enjolras_fuu/article/details/53739612

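Part 2: Back-propagation and derivatives
Each MatConvNet building block can run in a forward mode, which computes the output, and in a backward mode, which computes projected derivatives when an extra projection tensor is passed in. For example, this is how it looks for the convolution operator: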

y = vl_nnconv(x,w,b) ; % forward mode (get output)
p = randn(size(y), 'single') ; % projection tensor (arbitrary)
[dx,dw,db] = vl_nnconv(x,w,b,p) ; % backward mode (get projected derivatives)

This is how it looks for the ReLU operator:

y = vl_nnrelu(x) ;
p = randn(size(y), 'single') ;
dx = vl_nnrelu(x,p) ;
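
To make the two modes concrete, here is a minimal plain-Python sketch of a ReLU with a forward and a backward function. It is not MatConvNet code: the names relu_forward and relu_backward are hypothetical, and the sketch assumes the derivative at exactly x = 0 is taken to be 0. The backward function returns the projected derivative, that is, p passed through wherever the input was positive.

```python
# Plain-Python sketch of forward and backward ReLU (illustrative names, not MatConvNet).

def relu_forward(x):
    """Forward mode: y = max(x, 0), element by element."""
    return [max(v, 0.0) for v in x]

def relu_backward(x, p):
    """Backward mode: dz/dx equals p where x > 0 and 0 elsewhere (assumed convention at x = 0)."""
    return [pv if xv > 0 else 0.0 for xv, pv in zip(x, p)]

x = [-2.0, -0.5, 1.0, 3.0]
p = [0.1, 0.2, 0.3, 0.4]   # stands in for the projection tensor dz/dy

y = relu_forward(x)
dx = relu_backward(x, p)
print(y)    # [0.0, 0.0, 1.0, 3.0]
print(dx)   # [0.0, 0.0, 0.3, 0.4]
```

Note that the backward function needs the original input x, not just p, which is why the backward call above also receives x.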

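Part 2.1: Using back-propagation in practice
To see how back-propagation is used in practice, focus on a computational block f followed by a function g:
[Figure: block f takes the input x and the parameters w and produces y; g maps y to the scalar z]
Here g lumps together the rest of the network, from y to the final scalar output z. The goal is to compute the derivatives ∂z/∂x and ∂z/∂w, given the derivative p = ∂z/∂y of the rest of the network g.

Let us put this into practice by letting f be a convolutional layer and filling p = ∂z/∂y with random values: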

% Read an example image
x = im2single(imread('peppers.png')) ;

% Create a bank of linear filters and apply them to the image
w = randn(5,5,3,10,'single') ;
y = vl_nnconv(x, w, []) ;

% Create the derivative dz/dy
dzdy = randn(size(y), 'single') ;

% Back-propagation
[dzdx, dzdw] = vl_nnconv(x, w, [], dzdy) ;

Task: Run the code above and check the dimensions of dzdx and dzdy. Do they match your expectations?

>> size(dzdy)

ans =

   380   508    10

>> size(dzdx)

ans =

   384   512     3
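
The sizes printed above can be reproduced with a little arithmetic. The sketch below assumes the convolution uses stride 1 and no padding, which is consistent with the output shown; valid_conv_size is a hypothetical helper, not a MatConvNet function. The derivative dzdy has the size of y, while dzdx has the size of x.

```python
# Size arithmetic for an unpadded, stride-1 convolution (assumed settings).

def valid_conv_size(height, width, fh, fw, num_filters):
    """Output size of a convolution with stride 1 and no padding."""
    return (height - fh + 1, width - fw + 1, num_filters)

x_size = (384, 512, 3)    # size of the image, as printed for dzdx
w_size = (5, 5, 3, 10)    # filter bank: 5x5 filters, 3 input channels, 10 filters

y_size = valid_conv_size(x_size[0], x_size[1], w_size[0], w_size[1], w_size[3])
print(y_size)   # (380, 508, 10), the size of y and therefore of dzdy
```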

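The advantage of this modular view is that new building blocks can be coded and added to the architecture in a simple manner. However, it is easy to make mistakes when computing complicated derivatives, so it is a good idea to verify the results numerically. Consider the following code: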

% Check the derivative numerically
ex = randn(size(x), 'single') ;
eta = 0.0001 ;
xp = x + eta * ex  ;
yp = vl_nnconv(xp, w, []) ;

dzdx_empirical = sum(dzdy(:) .* (yp(:) - y(:)) / eta) ;
dzdx_computed = sum(dzdx(:) .* ex(:)) ;

fprintf(...
  'der: empirical: %f, computed: %f, error: %.2f %%\n', ...
  dzdx_empirical, dzdx_computed, ...
  abs(1 - dzdx_empirical/dzdx_computed)*100) ;

Result:

der: empirical: 7773.693359, computed: 7774.529785, error: 0.01 %
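
The same kind of check can be sketched without MatConvNet on a toy problem. The example below is an illustration under stated assumptions, not the library's implementation: conv1d is a hypothetical 1-D unpadded correlation, and conv1d_backward_x is its hand-written backward pass. The idea is that perturbing x by eta times a random direction changes z = <dzdy, y> by roughly eta times <dzdx, direction>, so both sides can be compared.

```python
import random

def conv1d(x, w):
    """Unpadded 1-D correlation: y[i] = sum_k w[k] * x[i + k]."""
    n = len(x) - len(w) + 1
    return [sum(w[k] * x[i + k] for k in range(len(w))) for i in range(n)]

def conv1d_backward_x(x, w, dzdy):
    """Projected derivative dz/dx for z = <dzdy, conv1d(x, w)>."""
    dzdx = [0.0] * len(x)
    for i, g in enumerate(dzdy):
        for k, wk in enumerate(w):
            dzdx[i + k] += g * wk
    return dzdx

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(u * v for u, v in zip(a, b))

random.seed(0)
x = [random.gauss(0, 1) for _ in range(20)]
w = [random.gauss(0, 1) for _ in range(5)]
y = conv1d(x, w)
dzdy = [random.gauss(0, 1) for _ in y]      # random projection, as in the text
dzdx = conv1d_backward_x(x, w, dzdy)

# Perturb x by a small step along a random direction and re-run the forward pass.
direction = [random.gauss(0, 1) for _ in x]
eta = 1e-4
xp = [xv + eta * dv for xv, dv in zip(x, direction)]
yp = conv1d(xp, w)

empirical = dot(dzdy, [(a - b) / eta for a, b in zip(yp, y)])
computed = dot(dzdx, direction)
print(empirical, computed)
```

Because this toy block is linear in x and runs in double precision, the two numbers agree almost exactly; a small discrepancy like the one in the MATLAB output above is to be expected with single-precision arithmetic.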

Questions:
What is the meaning of ex in the code above?
What are the derivatives dzdx_empirical and dzdx_computed?

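Tasks:
Run the code and convince yourself that the vl_nnconv derivatives are (probably) correct.
Create a new version of this code to test the derivative calculation with respect to w.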

We are now ready to build our first elementary CNN, composed of just two layers, and to compute its derivatives:

% Parameters of the CNN
w1 = randn(5,5,3,10,'single') ;
rho2 = 10 ;

% Run the CNN forward
x1 = im2single(imread('peppers.png')) ;
x2 = vl_nnconv(x1, w1, []) ;
x3 = vl_nnpool(x2, rho2) ;

% Create the derivative dz/dx3
dzdx3 = randn(size(x3), 'single') ;

% Run the CNN backward
dzdx2 = vl_nnpool(x2, rho2, dzdx3) ;
[dzdx1, dzdw1] = vl_nnconv(x1, w1, [], dzdx2) ;
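
The backward pass above visits the layers in reverse order, feeding each block the derivative produced by the block after it. The following plain-Python sketch mirrors that pattern on a tiny 1-D example small enough to check by hand. All function names are hypothetical, and the pooling uses stride 1, which is my understanding of the vl_nnpool default rather than something stated in this article.

```python
def conv1d(x, w):
    """Unpadded 1-D correlation: y[i] = sum_k w[k] * x[i + k]."""
    n = len(x) - len(w) + 1
    return [sum(w[k] * x[i + k] for k in range(len(w))) for i in range(n)]

def conv1d_backward(x, w, dzdy):
    """Return (dz/dx, dz/dw) given dz/dy."""
    dzdx = [0.0] * len(x)
    dzdw = [0.0] * len(w)
    for i, g in enumerate(dzdy):
        for k in range(len(w)):
            dzdx[i + k] += g * w[k]
            dzdw[k] += g * x[i + k]
    return dzdx, dzdw

def maxpool1d(x, size):
    """Max over sliding windows of length `size`, stride 1 (assumed)."""
    return [max(x[i:i + size]) for i in range(len(x) - size + 1)]

def maxpool1d_backward(x, size, dzdy):
    """Route each output derivative to the position of its window maximum."""
    dzdx = [0.0] * len(x)
    for i, g in enumerate(dzdy):
        window = x[i:i + size]
        dzdx[i + window.index(max(window))] += g
    return dzdx

# Forward: x1 -> conv -> x2 -> max pooling -> x3
x1 = [1.0, 3.0, 2.0, 5.0, 4.0]
w1 = [1.0, 0.5]
x2 = conv1d(x1, w1)          # [2.5, 4.0, 4.5, 7.0]
x3 = maxpool1d(x2, 2)        # [4.0, 4.5, 7.0]

# Backward: start from dz/dx3 and walk the layers in reverse
dzdx3 = [1.0, -2.0, 0.5]
dzdx2 = maxpool1d_backward(x2, 2, dzdx3)        # [0.0, 1.0, -2.0, 0.5]
dzdx1, dzdw1 = conv1d_backward(x1, w1, dzdx2)
print(dzdx1)   # [0.0, 1.0, -1.5, -0.5, 0.25]
print(dzdw1)   # [1.5, -6.0]
```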

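Question: Note that the last derivative in the CNN is dzdx3. Here, for the sake of the example, this derivative is initialized randomly. In a practical application, what would this derivative represent?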

We can now use the same technique as before to check that the derivatives computed through back-propagation are correct.

% Check the derivative numerically
ew1 = randn(size(w1), 'single') ;
eta = 0.0001 ;
w1p = w1 + eta * ew1  ;

x1p = x1 ;
x2p = vl_nnconv(x1p, w1p, []) ;
x3p = vl_nnpool(x2p, rho2) ;

dzdw1_empirical = sum(dzdx3(:) .* (x3p(:) - x3(:)) / eta) ;
dzdw1_computed = sum(dzdw1(:) .* ew1(:)) ;

fprintf(...
  'der: empirical: %f, computed: %f, error: %.2f %%\n', ...
  dzdw1_empirical, dzdw1_computed, ...
  abs(1 - dzdw1_empirical/dzdw1_computed)*100) ;

Result:

der: empirical: 11502.692383, computed: 11512.068359, error: 0.08 %
