READING NOTE: Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

TITLE: Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

AUTHOR: Zifeng Wu, Chunhua Shen, Anton van den Hengel

ASSOCIATION: The University of Adelaide

FROM: arXiv:1611.10080

CONTRIBUTIONS

  1. A further-developed intuitive view of ResNets is introduced, which helps to understand their behaviour and to find possible directions for further improvement.
  2. A group of relatively shallow convolutional networks is proposed based on this new understanding. Some of them achieve state-of-the-art results on the ImageNet classification dataset.
  3. The impact of different networks on the performance of semantic image segmentation is evaluated; used as pre-trained features, these networks substantially boost existing algorithms.

SUMMARY

For residual unit i, let y_{i-1} be its input, and let f_i(·) be its trainable non-linear mapping, also named Block i. The output of unit i is defined recursively as

y_i = f_i(y_{i-1}, ω_i) + y_{i-1}

where ω_i denotes the trainable parameters, and f_i(·) is often two or three stacked convolution stages in a ResNet building block. The top-left network in the paper's figure can then be unrolled as

y_2 = y_1 + f_2(y_1, ω_2)

    = y_0 + f_1(y_0, ω_1) + f_2(y_0 + f_1(y_0, ω_1), ω_2)
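The recursion and its unrolled form can be checked with a minimal sketch. This is an assumption-laden toy: f is a single matrix multiply plus ReLU standing in for the two or three stacked convolution stages of a real block, and the helper names (`f`, `resnet_forward`) are illustrative, not from the paper.

```python
import numpy as np

def f(y, w):
    # Toy non-linear residual branch: one "convolution" (here a plain
    # matrix multiply) followed by ReLU. A stand-in for the stacked
    # convolution stages of an actual ResNet block.
    return np.maximum(w @ y, 0.0)

def resnet_forward(y0, weights):
    # Apply the recursion y_i = f_i(y_{i-1}, w_i) + y_{i-1}.
    y = y0
    for w in weights:
        y = f(y, w) + y
    return y

rng = np.random.default_rng(0)
y0 = rng.standard_normal(4)
weights = [rng.standard_normal((4, 4)) for _ in range(2)]
y2 = resnet_forward(y0, weights)

# Unrolling two units by hand matches the recursion, as in the
# formula above: y2 = y1 + f2(y1), with y1 = y0 + f1(y0).
y1 = f(y0, weights[0]) + y0
y2_unrolled = f(y1, weights[1]) + y1
assert np.allclose(y2, y2_unrolled)
```

Running the sketch confirms that the recursive and unrolled forms compute the same output, which is the identity the gradient derivation below relies on.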

Thus, in an SGD iteration, the backward gradients are:

Δω_2 = (df_2/dω_2) Δy_2

Δy_1 = Δy_2 + (df_2/dy_1) Δy_2

Δω_1 = (df_1/dω_1) Δy_2 + (df_1/dω_1)(df_2/dy_1) Δy_2

Ideally, when the effective depth l ≥ 2, both terms of Δω_1 are non-zero, as the bottom-left case illustrates. However, when the effective depth l = 1, the second term goes to zero, as illustrated by the bottom-right case. When this happens, we say the ResNet is over-deepened, and that it cannot be trained in a fully end-to-end manner, even with the shortcut connections.
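A scalar toy makes the two terms of Δω_1 concrete. Assuming (purely for illustration) a linear branch f_i(y, ω) = ω·y, the first term of Δω_1 flows through the shortcut around block 2, while the second flows through f_2 itself and is the one that vanishes when the effective depth drops to 1. All names and values here are hypothetical.

```python
# Toy scalar network: y1 = y0 + w1*y0, y2 = y1 + w2*y1,
# assuming f_i(y, w) = w * y as a linear stand-in for the block.
y0, w1, w2 = 0.5, 0.8, -0.3

def forward(w1, w2):
    y1 = y0 + w1 * y0          # y1 = y0 + f1(y0, w1)
    y2 = y1 + w2 * y1          # y2 = y1 + f2(y1, w2)
    return y2

# The two paths of dy2/dw1, matching the two terms of the Δω_1 formula:
term_shortcut = y0             # (df1/dw1) via the shortcut past block 2
term_through_f2 = w2 * y0      # (df1/dw1)(df2/dy1), the depth-sensitive term
analytic = term_shortcut + term_through_f2

# Finite-difference check that the two terms together give dy2/dw1:
eps = 1e-6
numeric = (forward(w1 + eps, w2) - forward(w1 - eps, w2)) / (2 * eps)
assert abs(analytic - numeric) < 1e-6
```

If the gradient contribution through f_2 is cut off (the l = 1 case), only `term_shortcut` survives, so w1 is no longer updated with any information that passed through block 2.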

To summarize, shortcut connections enable us to train wider and deeper networks. As networks grow to some point, we face a dilemma between width and depth. From that point on, going deeper actually yields a wider network with extra features that are not completely end-to-end trained; going wider literally yields a wider network, without changing its end-to-end characteristic.

The authors designed three kinds of network structures, as illustrated in the following figure,

and the classification performance on the ImageNet validation set is shown below.
