CNN (Convolutional Neural Network): Residual Computation

Latest recommended article published 2024-08-01 07:45:00

zhongkeli



Category: Deep Learning. Tags: neural networks, cnn

Article link: https://blog.csdn.net/zhongkeli/article/details/51849619


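This article walks through the residual computation for the convolutional and subsampling layers of a convolutional neural network (CNN). It covers the convolution formula and the residual formula, and uses a worked example to show how residuals pass between a convolutional layer and a subsampling layer, which helps in understanding how a CNN works internally.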


Preface

This article mainly explains the formulas in the paper Notes on Convolutional Neural Networks. It draws on the derivation at http://blog.csdn.net/lu597203933/article/details/46575871 and borrows code from https://github.com/BigPeng/JavaCNN.

CNN

Each layer of a CNN outputs several feature maps, and each feature map is made up of many neurons. If a feature map has shape m*n, it contains m*n neurons.

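Convolutional layer

Convolution computation

Let the current layer l be a convolutional layer and the next layer l+1 be a subsampling layer.
The output feature map of convolutional layer l is:
$X_j^l=f(\sum_{i\in M_j}X_i^{l-1}\ast k_{ij}^l +b_j^l)$
Here $\ast$ denotes convolution, $M_j$ is the set of input feature maps connected to output map j, $k_{ij}^l$ is the kernel linking input map i to output map j, and $b_j^l$ is the bias of output map j.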
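The forward formula above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names (`ConvForwardSketch`, `convValid`, `forward`) are hypothetical and are not taken from JavaCNN, the activation f is assumed to be the logistic sigmoid, and the convolution is taken in "valid" mode with the kernel flipped in both directions. Implementations that slide the kernel without flipping it compute a cross-correlation instead.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names here are hypothetical. */
final class ConvForwardSketch
{
    /** Logistic sigmoid, assumed here as the activation f. */
    static double sigmoid(double x)
    {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** "Valid" 2-D convolution of one input map with one kernel (kernel flipped in both axes). */
    static double[][] convValid(double[][] in, double[][] k)
    {
        int kh = k.length, kw = k[0].length;
        int oh = in.length - kh + 1, ow = in[0].length - kw + 1;
        double[][] out = new double[oh][ow];
        for (int i = 0; i < oh; i++)
            for (int j = 0; j < ow; j++)
                for (int a = 0; a < kh; a++)
                    for (int b = 0; b < kw; b++)
                        out[i][j] += in[i + a][j + b] * k[kh - 1 - a][kw - 1 - b];
        return out;
    }

    /** One output map j: f(sum over input maps of (input convolved with its kernel) + bias). */
    static double[][] forward(double[][][] inputs, double[][][] kernels, double bias)
    {
        double[][] acc = convValid(inputs[0], kernels[0]);
        for (int i = 1; i < inputs.length; i++)
        {
            double[][] c = convValid(inputs[i], kernels[i]);
            for (int r = 0; r < acc.length; r++)
                for (int s = 0; s < acc[0].length; s++)
                    acc[r][s] += c[r][s];
        }
        // Add the shared bias and apply the activation element-wise.
        for (int r = 0; r < acc.length; r++)
            for (int s = 0; s < acc[0].length; s++)
                acc[r][s] = sigmoid(acc[r][s] + bias);
        return acc;
    }

    public static void main(String[] args)
    {
        double[][] in = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        double[][] k = {{1, 0}, {0, 0}};
        double[][] out = forward(new double[][][]{in}, new double[][][]{k}, -5.0);
        System.out.println(Arrays.deepToString(out));
    }
}
```

With a 3*3 input and a 2*2 kernel, the output map is 2*2, since a valid convolution shrinks each dimension by the kernel size minus one.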

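Residual computation

Let the current layer l be a convolutional layer and the next layer l+1 be a subsampling layer.
The residual of the j-th feature map of layer l is:

$$\delta_j^l = \beta_j^{l+1}\left(f'(\mu_j^l)\circ up(\delta_j^{l+1})\right) \tag{1}$$

Here $\circ$ denotes element-wise multiplication and $up(\cdot)$ upsamples the layer l+1 residual map back to the size of the layer l map. The activation function is
$$f(x)=\frac{1}{1+e^{-x}} \tag{2}$$
and its derivative is
$$f'(x)=f(x)\,(1-f(x))$$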
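As a rough illustration of equation (1), the sketch below upsamples the residual of layer l+1 by copying each entry over a block, then multiplies it element-wise by the sigmoid derivative, using the identity $f'(x)=f(x)(1-f(x))$ so that only the convolutional layer's output is needed. The names `ConvResidualSketch`, `up`, and `convResidual` are hypothetical, and treating `sx` as the row scale and `sy` as the column scale is an assumption of this sketch. It follows equation (1) literally, with $\beta$ passed in as a plain number.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names here are hypothetical. */
final class ConvResidualSketch
{
    /** up(): copies every entry of delta over an sx-by-sy block (Kronecker product with a block of ones). */
    static double[][] up(double[][] delta, int sx, int sy)
    {
        double[][] out = new double[delta.length * sx][delta[0].length * sy];
        for (int i = 0; i < out.length; i++)
            for (int j = 0; j < out[0].length; j++)
                out[i][j] = delta[i / sx][j / sy];
        return out;
    }

    /**
     * Residual of a convolutional map per equation (1):
     * beta * (f'(u) element-wise-times up(nextDelta)), where f'(u) = out * (1 - out)
     * for a sigmoid activation whose output is convOut.
     */
    static double[][] convResidual(double[][] convOut, double[][] nextDelta,
                                   double beta, int sx, int sy)
    {
        double[][] expanded = up(nextDelta, sx, sy);
        if (expanded.length != convOut.length || expanded[0].length != convOut[0].length)
            throw new IllegalArgumentException("upsampled residual does not match the convolutional map");
        double[][] delta = new double[convOut.length][convOut[0].length];
        for (int i = 0; i < delta.length; i++)
            for (int j = 0; j < delta[0].length; j++)
            {
                double fPrime = convOut[i][j] * (1.0 - convOut[i][j]);
                delta[i][j] = beta * fPrime * expanded[i][j];
            }
        return delta;
    }

    public static void main(String[] args)
    {
        double[][] convOut = new double[4][4];
        for (double[] row : convOut)
            Arrays.fill(row, 0.5);
        double[][] nextDelta = {{1, 2}, {3, 4}};
        System.out.println(Arrays.deepToString(convResidual(convOut, nextDelta, 1.0, 2, 2)));
    }
}
```

Every entry of a 2*2 block in the convolutional map receives the same upsampled residual, so the entries within a block differ only through their own derivative term.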

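To prepare for the derivation that follows, first consider the subsample process, which is fairly simple. Assume the subsampling layer takes the mean over blocks of the convolutional layer's output. For example, if the convolutional layer's output feature map ( $f(\mu_j^l)$ ) is
$$\begin{bmatrix}1&2&3&4\\5&6&7&8\\9&10&11&12\\13&14&15&16\end{bmatrix}$$
(feature map of the convolutional layer)
then the result of subsampling with a 2*2 scale is:
$$\begin{bmatrix}3.5&5.5\\11.5&13.5\end{bmatrix}$$
(feature map of the subsampling layer)
The subsample process is as follows: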

```java
import java.util.Arrays;

/**
 * Created by keliz on 7/7/16.
 */
public class test
{
    /**
     * Size of a convolution kernel or of a subsampling-layer scale; the two dimensions may differ.
     */
    public static class Size
    {
        public final int x;
        public final int y;

        public Size(int x, int y)
        {
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Shrinks a matrix by replacing each scale.x-by-scale.y block with its mean.
     *
     * @param matrix input matrix
     * @param scale  block size; x spans rows and y spans columns
     * @return the mean-pooled matrix, with matrix.length / scale.x rows and matrix[0].length / scale.y columns
     */
    public static double[][] scaleMatrix(final double[][] matrix, final Size scale)
    {
        int m = matrix.length;
        int n = matrix[0].length;
        final int sm = m / scale.x;
        final int sn = n / scale.y;
        // Reject inputs that the scale does not divide evenly.
        if (sm * scale.x != m || sn * scale.y != n)
            throw new RuntimeException("scale does not evenly divide matrix");
        final double[][] outMatrix = new double[sm][sn];
        final int size = scale.x * scale.y;
        for (int i = 0; i < sm; i++)
        {
            for (int j = 0; j < sn; j++)
            {
                // Sum the block that maps to output cell (i, j), then take its mean.
                double sum = 0.0;
                for (int si = i * scale.x; si < (i + 1) * scale.x; si++)
                {
                    for (int sj = j * scale.y; sj < (j + 1) * scale.y; sj++)
                    {
                        sum += matrix[si][sj];
                    }
                }
                outMatrix[i][j] = sum / size;
            }
        }
        return outMatrix;
    }

    public static void main(String[] args)
    {
        int row = 4;
        int column = 4;
        int k = 0;
        double[][] matrix = new double[row][column];
        Size s = new Size(2, 2);
        // Fill the 4*4 matrix with 1..16 in row-major order.
        for (int i = 0; i < row; ++i)
            for (int j = 0; j < column; ++j)
                matrix[i][j] = ++k;
        double[][] result = scaleMatrix(matrix, s);
        // Print the input and the pooled result, one matrix row per line.
        System.out.println(Arrays.deepToString(matrix).replaceAll("],", "]," + System.getProperty("line.separator")));

        System.out.println(Arrays.deepToString(result).replaceAll("],", "]," + System.getProperty("line.separator")));
    }
}
```

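Here 3.5 = (1+2+5+6)/(2*2) and 5.5 = (3+4+7+8)/(2*2).
It follows that the nodes (neurons) with values 1, 2, 5, and 6 in the convolutional layer's output feature map are connected to the node with value 3.5 in the subsample layer, and the nodes with values 3, 4, 7, and 8 are connected to the node with value 5.5. From the result derived in the section on the BP algorithm, we know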