Paper: https://arxiv.org/pdf/1404.0736.pdf
Code: https://cs.nyu.edu/~denton/compress_conv.zip
Contribution.
- A collection of generic methods to exploit the redundancy inherent in deep CNNs.
- Showing empirical speedups on convolutional layers by a factor of 2-3x and a reduction of parameters in fully connected layers by a factor of 5-10x.
Monochromatic Convolution Approximation.
Let $W \in \mathbb{R}^{C\times X \times Y \times F}$ be the convolutional weight tensor. In the running example its shape is listed as (96, 7, 7, 3), i.e. $F=96$, $X=Y=7$, $C=3$.
For every output feature $f$, consider the matrix $W_f \in \mathbb{R}^{C\times (XY)}$.
Find the SVD $W_f = U_f S_f V_f^{T}$, where $U_f \in \mathbb{R}^{C\times C}$ (3 × 3), $S_f \in \mathbb{R}^{C\times XY}$ (3 × 49, since 7 × 7 = 49), and $V_f \in \mathbb{R}^{XY\times XY}$ (49 × 49).
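The shapes above can be checked with a small NumPy sketch. This is only an illustration, not the authors' released code: the function name `per_feature_svd` is hypothetical, the weights are random stand-ins, and the tensor is assumed to be laid out as (C, X, Y, F) with C = 3, X = Y = 7, F = 96.

```python
import numpy as np

def per_feature_svd(W):
    """For a weight tensor W of shape (C, X, Y, F), return a list of
    (U_f, S_f, V_f) with W_f = U_f @ S_f @ V_f.T for every output feature f.
    Hypothetical helper, written only to illustrate the shapes."""
    C, X, Y, F = W.shape
    factors = []
    for f in range(F):
        # Flatten the spatial dimensions: W_f has shape (C, X*Y).
        W_f = W[:, :, :, f].reshape(C, X * Y)
        # full_matrices=True gives U: (C, C) and Vt: (X*Y, X*Y).
        U, s, Vt = np.linalg.svd(W_f, full_matrices=True)
        # Embed the singular values in a rectangular (C, X*Y) matrix.
        S = np.zeros((C, X * Y))
        k = len(s)
        S[:k, :k] = np.diag(s)
        factors.append((U, S, Vt.T))
    return factors

# Random stand-in weights (assumption: real weights would come from a trained net).
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 7, 7, 96))

factors = per_feature_svd(W)
U_0, S_0, V_0 = factors[0]
print(U_0.shape, S_0.shape, V_0.shape)  # (3, 3) (3, 49) (49, 49)
```

Multiplying the three factors back together, `U_0 @ S_0 @ V_0.T`, should reproduce `W[:, :, :, 0].reshape(3, 49)` up to floating-point error.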