Log-Sum-Exp Pooling
Papers
- From Image-level to Pixel-level Labeling with Convolutional Networks
- ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
LSE Pooling
Before reading these two papers, the pooling operations I was most familiar with were Max Pooling and Average Pooling. Both papers instead use Log-Sum-Exp (LSE) Pooling, defined as:
$$x_p=\frac{1}{r}\cdot \log\left[\frac{1}{S}\cdot \sum_{(i,j)\in\mathbf{S}}\exp(r\cdot x_{ij})\right]$$
where $x_{ij}$ is the activation value at location $(i,j)$, $(i,j)$ is a point in the pooling region $\mathbf{S}$, $S=s\times s$ is the total number of points in $\mathbf{S}$, and $r$ is a hyper-parameter.
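As a concrete reference, here is a minimal NumPy sketch of this definition; the function name and the max-shift trick for numerical stability are my own additions, not from either paper (the shift cancels algebraically, so the value is unchanged).

```python
import numpy as np

def lse_pool(x, r):
    """Log-Sum-Exp pool over all activations x_ij in a pooling region.

    x: array of activations in the region S (e.g. an s-by-s patch).
    r: hyper-parameter controlling how close the result is to the max.
    """
    x = np.asarray(x, dtype=float)
    # Subtract the max before exponentiating to avoid overflow for large r;
    # algebraically this shift cancels, so the pooled value is identical.
    m = x.max()
    return m + (1.0 / r) * np.log(np.exp(r * (x - m)).mean())
```

For a 2×2 region `[[1, 2], [3, 4]]`, a large `r` pools to a value near the max (4) and a tiny `r` pools to a value near the average (2.5).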
In the first paper, the authors describe the role of LSE Pooling as:
The hyper-parameter r controls how smooth one wants the approximation to be: high r values implies having an effect similar to the max, very low values will have an effect similar to the score averaging. The advantage of this aggregation is that pixels having similar scores will have a similar weight in the training procedure, r controlling this notion of “similarity”.
In the second paper, the authors describe it as:
By controlling the hyper-parameter, r, the pooled value ranges from the maximum in S (when $r\to\infty$) to average ($r\to0$).
For an intuitive understanding, see the figure below:
Mathematical Proof
As a rigorous student, I certainly won't stop at an intuitive understanding, so let's get on with the mathematical proof!
Before the proof, let's simplify the expression a little:
$$x_p=\frac{1}{r}\cdot \log\left[\frac{1}{n}\cdot \sum_{i=1}^{n}\exp(r\cdot x_i)\right]$$
Proof that $r\to0$ is equivalent to Average Pooling
First, we need the AM–GM (arithmetic mean–geometric mean) inequality:
$$\frac{a_1+a_2+\dots+a_n}{n}\ge\sqrt[n]{a_1\cdot a_2\cdots a_n}$$
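A sketch of the step that presumably follows: applying this inequality with $a_i=\exp(r\cdot x_i)$ (all positive, and taking $r>0$) gives

$$\frac{1}{n}\sum_{i=1}^{n}\exp(r\cdot x_i)\ \ge\ \sqrt[n]{\prod_{i=1}^{n}\exp(r\cdot x_i)}\ =\ \exp\!\left(\frac{r}{n}\sum_{i=1}^{n}x_i\right)$$

Taking $\frac{1}{r}\log$ of both sides yields $x_p\ge\frac{1}{n}\sum_{i=1}^{n}x_i$, i.e. LSE pooling is bounded below by Average Pooling, with equality approached as $r\to0$.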