1. Is Second-order Information Helpful for Large-scale Visual Recognition?
Title | Is Second-order Information Helpful for Large-scale Visual Recognition? |
---|---|
Paper | https://arxiv.org/pdf/1703.08050 |
Project | http://www.peihuali.org/MPN-COV/ |
The main challenges involved are robust covariance estimation provided only with a small sample of large-dimensional features and usage of the manifold structure of the covariance matrices.
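In other words, the covariance has to be estimated robustly from only a small number of high-dimensional features, and the manifold structure of covariance matrices has to be respected. (Put simply: a naive covariance estimate performs poorly in this high-dimension, small-sample setting.)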
Our main contributions are summarized as follows. Firstly, we are among the first who attempt to exploit statistics higher than first-order for improving the large-scale classification. We propose matrix power normalized covariance method for more discriminative representations, and develop the forward and backward propagation formulas for the nonlinear matrix functions, achieving end-to-end MPN-COV networks. Secondly, we provide interpretations of MPN-COV from statistical, geometric and computational points of view, explaining the underlying mechanism that MPN-COV can address the aforementioned challenges. Thirdly, on the ImageNet 2012 dataset, we thoroughly evaluate MPN-COV for validating our mathematical derivation and understandings, obtaining competitive improvements over its first-order counterparts under a variety of ConvNet architectures.
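Main contributions, in short: (1) one of the first attempts to use statistics higher than first-order for large-scale classification, via matrix power normalized covariance (MPN-COV) with forward and backward propagation formulas for the nonlinear matrix functions, so the network trains end to end; (2) statistical, geometric, and computational interpretations of why MPN-COV addresses the challenges above; (3) a thorough evaluation on ImageNet 2012, showing competitive improvements over first-order counterparts across a variety of ConvNet architectures.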
Main method
A covariance pooling layer is inserted between the last convolutional layer and the FC layer.
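To make the covariance layer concrete, here is a minimal NumPy sketch of the forward pass only: it computes the sample covariance of the last conv features and applies a matrix power through eigendecomposition. The function name `mpn_cov`, the default exponent `alpha=0.5`, the feature sizes, and the upper-triangle flattening are my assumptions for illustration, not the authors' code, and the backward pass is not shown.

```python
import numpy as np

def mpn_cov(features, alpha=0.5):
    """Matrix power normalized covariance of a (d, N) feature matrix.

    d = number of channels, N = H * W spatial positions.
    alpha is the matrix power (0.5 assumed here, i.e. a matrix square root).
    """
    d, n = features.shape
    # Centering matrix: cov = X @ i_bar @ X.T is the (biased) sample covariance.
    i_bar = (np.eye(n) - np.ones((n, n)) / n) / n
    cov = features @ i_bar @ features.T
    # cov is symmetric positive semi-definite, so eigh applies.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    # U diag(lambda^alpha) U^T
    return (eigvecs * eigvals**alpha) @ eigvecs.T

# Hypothetical sizes: 8 channels on a 7x7 feature map.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 49))
p = mpn_cov(x)

# The result is symmetric, so only the upper triangle needs to reach the FC layer.
fc_input = p[np.triu_indices(p.shape[0])]
```

With `alpha=0.5`, multiplying `p` by itself recovers the plain covariance, which is a quick sanity check on the normalization.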
2. Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
Title | Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization |
---|---|
Paper | https://arxiv.org/abs/1712.01034v2 |
Project | http://www.peihuali.org/iSQRT-COV |
However, existing methods depend heavily on eigendecomposition (EIG) or singular value decomposition (SVD), suffering from inefficient training due to limited support of EIG and SVD on GPU.
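In short, EIG and SVD are poorly supported on GPU, which makes training the existing methods slow. This paper is mainly about improving efficiency, so I am skipping the details O.o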
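For reference, the iterative alternative named in the title can be sketched with a coupled Newton-Schulz iteration, which needs only matrix multiplications and therefore avoids EIG and SVD. This is a forward-only NumPy sketch under my own assumptions (function name, trace-based pre-normalization and post-compensation, iteration count, and test matrix are illustrative), not the authors' implementation.

```python
import numpy as np

def isqrt_newton_schulz(a, num_iters=5):
    """Approximate the square root of a symmetric positive definite matrix.

    Uses the coupled Newton-Schulz iteration; num_iters is an assumed default.
    """
    d = a.shape[0]
    eye = np.eye(d)
    trace = np.trace(a)
    # Pre-normalization: dividing by the trace puts all eigenvalues in (0, 1],
    # which keeps the iteration inside its convergence region.
    y = a / trace
    z = eye.copy()
    for _ in range(num_iters):
        t = 0.5 * (3.0 * eye - z @ y)
        y = y @ t   # y converges to sqrt(a / trace)
        z = t @ z   # z converges to the inverse of that square root
    # Post-compensation: undo the trace scaling.
    return np.sqrt(trace) * y

# Hypothetical covariance-like input: 8 channels, 64 samples.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 64))
a = feats @ feats.T / 64
root = isqrt_newton_schulz(a, num_iters=20)
```

Fewer iterations give a coarser approximation; the 20 iterations in the usage line are chosen only so that `root @ root` matches `a` closely.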
3. Global Second-order Pooling Convolutional Networks
Title | Global Second-order Pooling Convolutional Networks |
---|---|
Paper | https://arxiv.org/abs/1811.12006 |
Project | None |
This paper tries adding second-order pooling to the earlier layers of the model.
Our main contributions are threefold. (1) Distinct from the existing methods which can only exploit second-order statistics at network end, we are among the first who introduce this modeling into intermediate layers for making use of holistic image information in earlier stages of deep ConvNets. By modeling the correlations of the holistic tensor, the proposed blocks can capture long-range statistical dependency [34], making full use of the contextual information in the image. (2) We design a simple yet effective GSoP block, which is highly modular with low memory and computational complexity. The GSoP block, which is able to capture global second-order statistics along channel dimension or position dimension, can be conveniently plugged into existing network architectures, further improving their performance with small overhead. (3) On ImageNet benchmark, we perform a thorough ablation study of the proposed networks, analyzing the characteristics and behaviors of the proposed GSoP block. Extensive comparison with the counterparts has shown the competitiveness of our networks.
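Main contributions, in short: (1) second-order statistics are moved from the network end into intermediate layers, so holistic image information and long-range statistical dependency are available in earlier stages of the ConvNet; (2) the GSoP block is simple and modular, has low memory and computational cost, captures global second-order statistics along either the channel or the position dimension, and can be plugged into existing architectures with small overhead; (3) a thorough ablation study on ImageNet analyzes the block's behavior, and comparisons with counterparts show competitive results.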
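Two different structures are used, with the second-order pooling placed either in the middle of the model or at its end.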
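As a rough illustration of the mid-network variant along the channel dimension, the sketch below computes a channel covariance from an intermediate feature map and turns it into per-channel gates that rescale the input. It is heavily simplified: the function name, the three weight matrices (random placeholders standing in for learned layers), and all sizes are my assumptions, and the exact layer layout of the paper's block may differ.

```python
import numpy as np

def gsop_channel_gate(x, w_reduce, w_row, w_expand):
    """Rescale a (C, H, W) feature map with gates derived from channel covariance.

    w_reduce: (c, C) placeholder for a 1x1 conv that reduces C channels to c.
    w_row:    (c, c) placeholder weights, one vector per covariance row.
    w_expand: (C, c) placeholder map from the c-dim summary back to C gates.
    """
    channels, height, width = x.shape
    n = height * width
    flat = x.reshape(channels, n)
    reduced = w_reduce @ flat                                # (c, N)
    centered = reduced - reduced.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / n                          # (c, c) channel covariance
    row_summary = np.sum(cov * w_row, axis=1)                # one scalar per row
    gate = 1.0 / (1.0 + np.exp(-(w_expand @ row_summary)))   # sigmoid, values in (0, 1)
    return x * gate[:, None, None]                           # channel-wise rescaling

# Hypothetical sizes: 16 channels reduced to 4, on a 6x6 feature map.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 6, 6))
w_reduce = rng.standard_normal((4, 16)) * 0.1
w_row = rng.standard_normal((4, 4))
w_expand = rng.standard_normal((16, 4))
out = gsop_channel_gate(x, w_reduce, w_row, w_expand)
```

The output keeps the input shape, which is what lets a block like this be dropped between existing layers.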
Experimental results on ImageNet: