ResNeXt

Copyright notice: this is the blogger's original article and may not be reproduced without permission.

论文:https://arxiv.org/abs/1611.05431

Published at CVPR 2017.


 

Key ideas:

It introduces "cardinality"; in the paper's words, "increasing cardinality is more effective than going deeper or wider".

It departs from the pure stacking idea of VGG and ResNet by borrowing the split-transform-merge strategy of the Inception family, turning a single-path convolution into many parallel convolutional branches. Because all branches share the same topology, there are far fewer hyper-parameters to design and the block is easy to transfer to other settings.

 


Walking through the paper:

The paper first points to VGG (whose strategy ResNet inherits): "The VGG-nets [36] exhibit a simple yet effective strategy of constructing very deep networks: stacking building blocks of the same shape." It then notes that this stacking rule may avoid over-fitting the hyper-parameters to particular data, i.e. it extends well to new settings; the paper explains that it "may reduce the risk of over-adapting the hyperparameters to a specific dataset."

Next the paper turns to the Inception family and its split-transform-merge strategy, but points out a problem: those networks require careful hand design (filter numbers, filter sizes, and so on), so they do not extend easily. As the paper puts it, "Despite good accuracy, the realization of Inception models has been accompanied with a series of complicating factors — the filter numbers and sizes are tailored for each individual transformation."

The authors therefore propose the ResNeXt network, which combines the VGG/ResNet stacking idea with Inception's split-transform-merge idea: it "adopts VGG/ResNets' strategy of repeating layers, while exploiting the split-transform-merge strategy". The paper describes it this way: "A module in our network performs a set of transformations, each on a low-dimensional embedding, whose outputs are aggregated by summation. We pursuit a simple realization of this idea — the transformations to be aggregated are all of the same topology."

As for the results, the authors report gains in accuracy while keeping model complexity essentially unchanged or even lower, and they introduce the new term "cardinality", stating that "Experiments demonstrate that increasing cardinality is a more effective way of gaining accuracy than going deeper or wider, especially when depth and width starts to give diminishing returns for existing models." See Fig. 1 below: the block on the right uses cardinality = 32, and every aggregated topology is identical (fewer hyper-parameters to design, less tuning burden).

[Fig. 1: left, a block of ResNet; right, a block of ResNeXt with cardinality = 32 and roughly the same complexity.]

With the introduction done, the authors review related work:

1. Multi-branch convolutional networks.

2. Grouped convolutions.

3. Compressing convolutional networks.

4. Ensembling (I am not entirely clear on this one; if anyone understands it well, please explain in the comments). The paper puts it this way: "Averaging a set of independently trained networks is an effective solution to improving accuracy [24], widely adopted in recognition competitions [33]. Veit et al. [40] interpret a single ResNet as an ensemble of shallower networks, which results from ResNet's additive behaviors [15]. Our method harnesses additions to aggregate a set of transformations. But we argue that it is imprecise to view our method as ensembling, because the members to be aggregated are trained jointly, not independently."


Architecture:

[Architecture table/diagram from the paper comparing ResNet-50 with ResNeXt-50 (32×4d).]

Two design rules (a short numeric illustration follows them):

(i) if producing spatial maps of the same size, the blocks share the same hyper-parameters (width and filter sizes);

(ii) each time when the spatial map is downsampled by a factor of 2, the width of the blocks is multiplied by a factor of 2.
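A minimal sketch of what these two rules imply in practice, using the ResNeXt-50 (32×4d) stage settings from the paper as the assumed numbers (this is only an illustration, not the paper's table):

```python
# Each entry: (output spatial size, cardinality C, per-group width d,
#              block output channels, number of blocks in the stage).
# Rule (ii): every time the spatial size halves, the widths double.
resnext50_stages = [
    ("56x56", 32, 4,   256, 3),
    ("28x28", 32, 8,   512, 4),   # spatial /2 -> width x2
    ("14x14", 32, 16, 1024, 6),
    ("7x7",   32, 32, 2048, 3),
]
for size, C, d, out_ch, n in resnext50_stages:
    print(f"{size}: {n} blocks, grouped 3x3 width = {C*d}, block output = {out_ch}")
```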


ResNeXt block:

 

The authors start from the fully-connected layer and its inner product; in the paper's words, "Inner product can be thought of as a form of aggregating transformation:"

$\sum_{i=1}^{D} w_i x_i$

The authors then replace the elementary term w_i x_i with a more general function; as the paper says, "we consider replacing the elementary transformation (w_i x_i) with a more generic function, which in itself can also be a network."

$\mathcal{F}(\mathbf{x}) = \sum_{i=1}^{C} \mathcal{T}_i(\mathbf{x})$

Here C is the cardinality, and the transformations T_i all share the same topology.
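Purely as an illustration of this aggregated transformation (not the authors' code), here is a sketch in PyTorch of the explicit multi-branch form of Fig. 3(a): C = 32 branches, each a 1x1-3x3-1x1 bottleneck on a 4-dimensional embedding, aggregated by summation together with the shortcut. The class name and the omission of batch norm are my own simplifications.

```python
import torch
import torch.nn as nn

class AggregatedBlock(nn.Module):
    """F(x) = sum_{i=1}^{C} T_i(x) + x with C identical-topology branches."""
    def __init__(self, channels=256, cardinality=32, d=4):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, d, 1, bias=False),      # embed to d = 4 channels
                nn.ReLU(inplace=True),
                nn.Conv2d(d, d, 3, padding=1, bias=False),  # transform
                nn.ReLU(inplace=True),
                nn.Conv2d(d, channels, 1, bias=False),      # expand back
            )
            for _ in range(cardinality)
        ])

    def forward(self, x):
        # aggregate the branch outputs by summation, then add the shortcut
        return sum(branch(x) for branch in self.branches) + x

x = torch.randn(2, 256, 56, 56)
print(AggregatedBlock()(x).shape)   # torch.Size([2, 256, 56, 56])
```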

Finally the authors show three equivalent forms of the ResNeXt block. Fig. 3(a) is the original multi-branch structure. Fig. 3(b) is somewhat like Inception-ResNet, except that every branch has the same topology. Fig. 3(c) uses grouped convolutions, which go back at least to AlexNet and which reduce computation; here 32 groups are used, and each group has 4 input and 4 output channels.
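Again only as a hedged sketch (assuming PyTorch; this is not the paper's reference implementation), the grouped-convolution form of Fig. 3(c) can be written with a single 3x3 convolution whose groups argument equals the cardinality; with C = 32 and d = 4 the grouped layer has 32 × 4 = 128 channels:

```python
import torch
import torch.nn as nn

class ResNeXtBottleneck(nn.Module):
    """Fig. 3(c)-style block: 1x1 reduce, grouped 3x3, 1x1 expand, residual add."""
    def __init__(self, in_channels=256, cardinality=32, d=4):
        super().__init__()
        width = cardinality * d                      # 128 for the 32x4d setting
        self.reduce = nn.Sequential(
            nn.Conv2d(in_channels, width, 1, bias=False),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True))
        self.grouped = nn.Sequential(
            # groups=cardinality plays the role of the 32 parallel branches
            nn.Conv2d(width, width, 3, padding=1, groups=cardinality, bias=False),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True))
        self.expand = nn.Sequential(
            nn.Conv2d(width, in_channels, 1, bias=False),
            nn.BatchNorm2d(in_channels))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.expand(self.grouped(self.reduce(x)))
        return self.relu(out + x)                    # residual connection

x = torch.randn(2, 256, 56, 56)
print(ResNeXtBottleneck()(x).shape)   # torch.Size([2, 256, 56, 56])
```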


Results:


The table below shows how increasing cardinality affects accuracy when complexity is held constant (a rough parameter count after this paragraph shows why the 32×4d setting keeps complexity unchanged).
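A back-of-the-envelope check of the "same complexity" claim, counting only the weights of the three convolutions in one block (the counting rule is the one used in the paper; the arithmetic here is my own):

```python
# Original ResNet bottleneck: 256 -> 64 (1x1) -> 64 (3x3) -> 256 (1x1)
resnet_params  = 256*64 + 3*3*64*64 + 64*256      # = 69,632  (~70k)

# ResNeXt block with cardinality C and per-branch width d:
# C * (256*d + 3*3*d*d + d*256)
C, d = 32, 4
resnext_params = C * (256*d + 3*3*d*d + d*256)    # = 70,144  (~70k)

print(resnet_params, resnext_params)
```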

 

The next table focuses on the difference between increasing cardinality and increasing depth or width.

The last table shows both that the residual connection is effective and that the aggregated transformations are effective.

 
