Understanding and Improving Convolutional Neural Networks via CReLU

弓如霹雳弦惊

Article link: https://blog.csdn.net/Dilusense/article/details/55101360

Pairing phenomenon

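The authors ran an interesting experiment on an AlexNet model and found that, in the lower convolutional layers, some filters have a counterpart filter with which they are strongly negatively correlated, and that this effect becomes weaker the higher the layer. The authors call this the pairing phenomenon.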

Let the unit-normalized filters of one convolutional layer be ${\vec \phi _1},{\vec \phi _2}, \cdots ,{\vec \phi _n}$. The pairing filter of ${\vec \phi _i}$ is defined as ${\vec \phi _i^*} = \arg {\min _{{{\vec \phi }_j}}}\left\langle {{{\vec \phi _i }},{{\vec \phi }_j}} \right\rangle$, where $j = 1,2, \cdots ,n$ and $n$ is the number of filters in the layer. The cosine similarity between ${\vec \phi _i}$ and ${\vec \phi _i^*}$ is denoted $\mu _i^\phi = \left\langle {{{\vec \phi }_i},\vec \phi _i^*} \right\rangle$.
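
As a small worked illustration of the definition (the three vectors are made up for this example and do not come from the paper), take $n = 3$ unit filters in the plane: ${\vec \phi _1} = (1, 0)$, ${\vec \phi _2} = (-1, 0)$, ${\vec \phi _3} = (0.6, 0.8)$. Their pairwise inner products are

$$\left\langle {{{\vec \phi }_1},{{\vec \phi }_2}} \right\rangle = -1, \quad \left\langle {{{\vec \phi }_1},{{\vec \phi }_3}} \right\rangle = 0.6, \quad \left\langle {{{\vec \phi }_2},{{\vec \phi }_3}} \right\rangle = -0.6$$

so $\vec \phi _1^* = {\vec \phi _2}$, $\vec \phi _2^* = {\vec \phi _1}$ and $\vec \phi _3^* = {\vec \phi _2}$, which gives $\mu _1^\phi = \mu _2^\phi = -1$ and $\mu _3^\phi = -0.6$. A value of $\mu _i^\phi$ close to $-1$ means that filter $i$ has an almost exactly opposite partner in the same layer.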

I ran a similar experiment with the following code:

# coding=utf-8
'''
2017-02-14
'''
import os
import numpy as np
import matplotlib.pyplot as plt
import caffe

def plotPairFilterCosineHist(net):
    # Collect the weight blobs of all convolutional layers.
    filtersList = []
    for item in net.params:
        if item.startswith('conv'):
            filtersList.append(net.params[item][0].data)
    layerNum = len(filtersList)
    for k, filters in enumerate(filtersList):
        filterNum = filters.shape[0]
        # Flatten each filter into a vector and normalize it to unit length.
        filters = filters.reshape(filterNum, -1)
        filters = filters / np.linalg.norm(filters, axis=1, keepdims=True)
        # np.float was removed from recent NumPy releases; the builtin float is equivalent.
        pairFilterCosine = np.empty(filterNum, dtype=float)
        for i in range(filterNum):
            # Smallest cosine similarity between filter i and any filter in the layer.
            minCosine = 1
            for j in range(filterNum):
                minCosine = min(minCosine, np.dot(filters[i], filters[j]))
            pairFilterCosine[i] = minCosine
        # One histogram panel per convolutional layer.
        plt.subplot(1, layerNum, k + 1)
        plt.title('conv{}'.format(k + 1))
        plt.hist(pairFilterCosine, 10)

if __name__ == '__main__':
    caffe.set_mode_gpu()
    modelRoot = r'D:\caffe-master\models\bvlc_reference_caffenet'
    deployPrototxt = os.path.join(modelRoot, 'deploy.prototxt')
    modelFile = os.path.join(modelRoot, 'bvlc_reference_caffenet.caffemodel')
    net = caffe.Net(deployPrototxt, modelFile, caffe.TEST)
    plotPairFilterCosineHist(net)
    # Display the figure; without this call nothing is shown when run as a script.
    plt.show()

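The output is shown below:
[Figure: histograms of the pairing-filter cosine similarity, one panel per convolutional layer (conv1, conv2, ...)]
The histograms lead to a conclusion similar to the paper's: the mass lies mainly on the negative half-axis, and the histogram shifts to the right as the convolutional layers get deeper, i.e. the negative correlation weakens.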
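
For readers without a Caffe installation, the same statistic can be sketched in plain NumPy. This is not the code used for the figure above, only a minimal vectorized variant: the function name `pairing_cosines` is hypothetical, and the random Gaussian array merely stands in for a real weight blob (it is not trained data).

```python
import numpy as np

def pairing_cosines(filters):
    """Return mu_i = min_j <phi_i, phi_j> for every filter of one layer.

    `filters` has shape (n, ...); each filter is flattened and normalized
    to unit length before the inner products are taken.
    """
    n = filters.shape[0]
    phi = filters.reshape(n, -1).astype(np.float64)  # astype returns a copy
    phi /= np.linalg.norm(phi, axis=1, keepdims=True)
    gram = phi @ phi.T       # gram[i, j] = cosine similarity of filters i and j
    return gram.min(axis=1)  # the diagonal equals 1, so it never wins the min

# Hypothetical stand-in data: random Gaussian "filters" shaped like a
# first-layer weight blob (96 filters of size 3x11x11), not trained weights.
rng = np.random.default_rng(0)
random_filters = rng.standard_normal((96, 3, 11, 11))
mu = pairing_cosines(random_filters)
print(mu.shape, mu.min(), mu.max())
```

To apply it to the model above, one would pass `net.params[item][0].data` for each convolutional layer and hand the result to `plt.hist`. The Gram-matrix form replaces the double Python loop with one matrix product and yields the same values, because the loop version also includes `j == i` in the minimum.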

CReLU

Based on this observation, the authors put forward a hypothesis:

we hypothesize that despite ReLU erasing negative linear responses, the first few convolution layers of a deep CNN manage to capture both negative and positive phase information through learning pairs or groups of negatively correlated filters. This conjecture implies that there exists a redundancy among the filters

To remove the redundancy induced by ReLU, the authors propose a new activation scheme called Concatenated Rectified Linear Units, or CReLU for short. Let ${\left[ \cdot \right]_ + } = \max \left( { \cdot ,0} \right)$; CReLU is then defined as:

${\mathop{\rm CReLU}\nolimits} \left( x \right) = \left( {{{\left[ x \right]}_ + },{{\left[ { - x} \right]}_ + }} \right)$

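Note how this differs from an ordinary activation function: CReLU maps each scalar input to a two-dimensional output, whereas an ordinary activation function produces a one-dimensional output. Hence the paper states: "CReLU is based on an activation scheme rather than a function, which fundamentally differentiates itself from Leaky ReLU or other variants." That said, I think it does no harm to view CReLU as an activation function with a one-dimensional input and a two-dimensional output.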
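
The definition can be sketched in a few lines of NumPy. Two things here are assumptions rather than something this post specifies: the helper name `crelu` is hypothetical, and the two halves are concatenated along the channel axis (axis 1 of a batch, channels, height, width array).

```python
import numpy as np

def crelu(x, axis=1):
    """Concatenate [x]_+ and [-x]_+ along `axis` (assumed to be the channel axis)."""
    return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)], axis=axis)

# Toy pre-activation of shape (batch, channels, height, width) = (1, 2, 1, 2).
x = np.array([[[[1.5, -2.0]], [[-0.5, 3.0]]]])
y = crelu(x)
print(y.shape)  # the channel count doubles: (1, 4, 1, 2)

# Unlike ReLU, no information is lost: x = [x]_+ - [-x]_+.
pos, neg = np.split(y, 2, axis=1)
print(np.array_equal(pos - neg, x))
```

Because the output has twice as many channels as the input, the layer that follows a CReLU has to accept the doubled channel count.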

Experiments and analysis

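Finally, the authors validate the effectiveness of CReLU on three datasets: CIFAR-10, CIFAR-100, and ImageNet. The paper also gives a qualitative discussion of why CReLU works, from the perspectives of regularization and reconstruction.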
