WAGE Email Content

Issue 1: How to understand the scale factor $\alpha$

In your paper, you propose a scale factor $\alpha$ that replaces the batch-calculated scaling parameters of the original Batch Normalization. I have two questions about the use of $\alpha$.

  • $\alpha$ always equals 1

In your paper, $\alpha$ is calculated by the functions below:

$$\alpha = \max(\mathrm{Shift}(L_{\min}/L),\ 1) \tag{1}$$

$$\mathrm{Shift}(x) = 2^{\mathrm{round}(\log_{2} x)} \tag{2}$$

$$L_{\min} = \beta\sigma \tag{3}$$

where $\beta > 1$ and $\sigma(k) = 2^{1-k},\ k \in \mathbb{N}_{+}$, and

$$L = \max\!\left(\sqrt{6/n_{in}},\ L_{\min}\right) \tag{4}$$

Since (4) guarantees $L \geq L_{\min}$, we can get that

$$L_{\min}/L \leq 1 \tag{5}$$

$$\mathrm{Shift}(L_{\min}/L) \leq 1 \tag{6}$$

and therefore

$$\alpha = \max(\mathrm{Shift}(L_{\min}/L),\ 1) \equiv 1 \tag{7}$$

Obviously, $\alpha$ should not always equal 1; otherwise

$$a_{q} = Q_{A}(a) = Q(a/\alpha,\ k_{A}) \tag{8}$$

will never actually scale anything.

  • Why use $\alpha$

According to (3), (4), and (7), $\alpha$ does not depend on the current batch of data. So it is not straightforward to understand why $\alpha$ can take the place of the variance, which is highly dependent on the current batch.
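The collapse to $\alpha \equiv 1$ described above can be checked numerically. The sketch below implements Eqs. (1)–(4) directly; the bit width `k = 8` and `beta = 1.5` are illustrative assumptions, not values taken from the text above:

```python
import math

def shift(x):
    # Shift(x) = 2^round(log2(x)), Eq. (2)
    return 2.0 ** round(math.log2(x))

def alpha(n_in, k=8, beta=1.5):
    # L_min = beta * sigma(k), with sigma(k) = 2^(1-k), Eq. (3)
    l_min = beta * 2.0 ** (1 - k)
    # L = max(sqrt(6/n_in), L_min), Eq. (4) -- so L >= L_min always
    l = max(math.sqrt(6.0 / n_in), l_min)
    # alpha = max(Shift(L_min / L), 1), Eq. (1)
    return max(shift(l_min / l), 1.0)

# Because L_min/L <= 1, Shift(L_min/L) <= 1 and alpha collapses to 1
# for every fan-in:
for n_in in (64, 256, 4096, 10**6):
    print(n_in, alpha(n_in))  # alpha is 1.0 in every case
```

Whatever the fan-in, the `max(..., 1)` in Eq. (1) absorbs the sub-unit `Shift` output, which is exactly the concern raised above.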

Issue 2: Why the mean of the activations can be hypothesized to be 0

It’s written in the paper that:

Besides, we hypothesize that batch outputs of each hidden layer approximately have zero-mean, then …

But it seems that there is no further explanation of this hypothesis.

Issue 3: How to shift the curve

[Figure: the blue curve and the red (shifted) curve]
Clearly, Shift(·) can change the mean of the blue curve. But it is not straightforward why the red curve keeps exactly the same shape as the blue curve.
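One way to see this: dividing every sample by the same power-of-two constant (as produced by Shift(·)) is a pure linear rescaling, so on a logarithmic axis the distribution only translates, and its standardized shape is unchanged. A minimal sketch with synthetic Gaussian samples; numpy and the scale $2^{-3}$ are assumptions for illustration, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=4.0, size=100_000)  # samples of the "blue curve"

s = 2.0 ** -3  # a power-of-two factor, as Shift(.) always produces
b = a * s      # samples of the "red curve"

# Multiplying by a constant scales every sample (and hence the std)
# linearly, so the curve is only compressed/translated, never reshaped:
print(np.isclose(b.std(), a.std() * s))          # True
print(np.allclose(np.sort(b), np.sort(a) * s))   # True: pointwise rescale
```

Because multiplication by an exact power of two only shifts the floating-point exponent, the red samples are the blue samples moved rigidly along a log2 axis, which is why the two curves have identical shape.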
