Title
GhostNet: More Features from Cheap Operations
Year / Authors / Venue
2020 / Han, Kai and Wang, Yunhe and Tian, Qi and Guo, Jianyuan and Xu, Chunjing and Xu, Chang / CVPR 2020
Citation
@inproceedings{han2020ghostnet,
  title={Ghostnet: More features from cheap operations},
  author={Han, Kai and Wang, Yunhe and Tian, Qi and Guo, Jianyuan and Xu, Chunjing and Xu, Chang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1580--1589},
  year={2020}
}
Summary
- There is some similarity between the Encoding Layer and the attention block: both aim to retain the more meaningful features while ignoring disturbing ones.
- There may be room for innovation in building a more efficient layer for this idea while also bringing multi-size training into consideration.
- Enlarging its width may be an interesting direction.
- The Ghost bottleneck stacks two Ghost modules: the first increases the channel dimension (expansion), and the second decreases it to match the channel dimension required by the shortcut.
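As a sketch of that channel flow through a Ghost bottleneck (the layer sizes below are illustrative, not taken from the paper):

```python
def ghost_bottleneck_dims(in_c, exp_c, out_c, stride=1):
    """Channel counts after each stage of a Ghost bottleneck (sketch).

    The first Ghost module expands in_c -> exp_c; the second projects
    exp_c -> out_c so the result can be added to the shortcut branch.
    A stride-2 variant inserts a depthwise conv between the two modules.
    """
    stages = [("input", in_c),
              ("ghost module 1 (expansion)", exp_c)]
    if stride == 2:
        stages.append(("depthwise conv, stride 2", exp_c))
    stages.append(("ghost module 2 (projection)", out_c))
    return stages

# stride-1 block: output channels match the input, enabling the residual add
stages = ghost_bottleneck_dims(in_c=16, exp_c=48, out_c=16)
```

For a stride-1 block with `in_c == out_c`, the identity shortcut can be added directly; otherwise the shortcut path itself must adjust the channel count.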
My Points
- The Ghost bottleneck uses the squeeze-and-excitation (SE) module as a submodule, so fusing the SE module and the Ghost module into one block could be considered for performance improvement.
- The Ghost module uses $s \in [1, 2]$, so more values of $s$ could be considered.
- The placement of the SE module is not guided by any stated principle.
- Other low-cost linear operations could be used to construct the Ghost module, such as affine transformations and wavelet transformations.
- Multi-size convolutions could be adopted for the ghost operation.
Research Objective(s)
Fig. 1. Visualization of some feature maps generated by the first residual group in ResNet-50, where three similar feature map pair examples are annotated with boxes of the same color. One feature map in the pair can be approximately obtained by transforming the other one through cheap operations (denoted by spanners).
Contributions
In this paper, we introduce a novel Ghost module to generate more features by using fewer parameters. Specifically, an ordinary convolutional layer in deep neural networks will be split into two parts. The first part involves ordinary convolutions but their total number will be rigorously controlled. Given the intrinsic feature maps from the first part, a series of simple linear operations are then applied for generating more feature maps. Without changing the size of output feature map, the overall required number of parameters and computational complexities in this Ghost module have been decreased, compared with those in vanilla convolutional neural networks. Based on Ghost module, we establish an efficient neural architecture, namely, GhostNet.
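The saving can be made concrete with a rough parameter count, comparing an ordinary convolution with a Ghost module (a sketch with illustrative layer sizes; `d` denotes the kernel size of the cheap depthwise operation, following the paper's notation):

```python
def conv_params(c_in, c_out, k):
    # parameters of an ordinary k x k convolution (bias omitted)
    return c_out * c_in * k * k

def ghost_params(c_in, c_out, k, s, d):
    # primary conv produces c_out / s intrinsic maps; then s - 1 cheap
    # d x d per-map (depthwise) operations generate the remaining maps
    m = c_out // s
    primary = conv_params(c_in, m, k)
    cheap = (s - 1) * m * d * d
    return primary + cheap

# illustrative sizes: 256 -> 256 channels, 3x3 kernels, s = 2, d = 3
ordinary = conv_params(256, 256, 3)            # 589,824 parameters
ghost = ghost_params(256, 256, 3, s=2, d=3)    # 296,064 parameters
ratio = ordinary / ghost                       # close to s when c*k*k >> d*d
```

The compression ratio approaches $s$ because the cheap per-map operations contribute a negligible number of parameters compared with the primary convolution.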
Interesting Point(s)
The Ghost module uses cheap linear transformations to produce additional (redundant) feature maps at low computational cost.
- Ghost module.
- Linear transformations.
Background / Problem Statement
- Traditional hand-crafted methods, such as FV and SIFT, have the advantage of accepting arbitrary input image sizes, and they pose no issues when transferring features across different domains, since low-level features are generic. However, these methods are composed of stacked self-contained algorithmic components (feature extraction, dictionary learning, encoding, classifier training) as visualized in Figure 1 (left, center). Consequently, the features and the encoders are fixed once built, so feature learning (CNNs and dictionaries) does not benefit from labeled data. The authors present a new approach (Figure 1, right) where the entire pipeline is learned in an end-to-end manner.
- Deep learning is well known for end-to-end learning of hierarchical features. This paper transfers that method to texture recognition, which needs a spatially invariant representation describing feature distributions instead of concatenation.
- The challenge is to make the loss function differentiable with respect to the inputs and layer parameters.
In one sentence: the goal is to better transfer deep-learning methods to texture recognition, since models in this field are usually pretrained on a large dataset (such as ImageNet).
Method(s)
Figure 2: An illustration of the convolutional layer and the proposed Ghost module outputting the same number of feature maps. $\Phi$ represents the cheap operation.
Ghost Module
We point out that it is unnecessary to generate these redundant feature maps one by one with a large number of FLOPs and parameters.
Suppose that the output feature maps are "ghosts" of a handful of intrinsic feature maps obtained with some cheap transformations. These intrinsic feature maps are often of smaller size and produced by ordinary convolution filters. Specifically, $m$ intrinsic feature maps $Y' \in \mathbb{R}^{h' \times w' \times m}$ are generated using a primary convolution:

$$Y' = X * f', \qquad (2)$$

where $f' \in \mathbb{R}^{c \times k \times k \times m}$ is the utilized filter set, $m \le n$, and the bias term is omitted for simplicity. The hyper-parameters such as filter size, stride, and padding are the same as those in the ordinary convolution (Eq. 1) to keep the spatial size (i.e. $h'$ and $w'$) of the output feature maps consistent. To further obtain the desired $n$ feature maps, we propose to apply a series of cheap linear operations to each intrinsic feature map in $Y'$ to generate $s$ ghost features according to the following function:
$$y_{ij} = \Phi_{i,j}(y'_i), \qquad \forall\, i = 1, \ldots, m,\quad j = 1, \ldots, s, \qquad (3)$$
where $y'_i$ is the $i$-th intrinsic feature map in $Y'$.
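Eqs. (2)-(3) can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: the primary convolution is reduced to a 1×1 (pointwise) convolution, and a per-map affine transform stands in for the cheap operation $\Phi$ (the paper typically uses small depthwise convolutions, but names affine transforms among the low-cost alternatives):

```python
import numpy as np

def ghost_module(X, W_primary, s, rng):
    """Sketch of Eqs. (2)-(3).

    X: input of shape (h, w, c); W_primary: (c, m) pointwise filters.
    Returns n = m * s output feature maps of shape (h, w, m * s).
    """
    # Eq. (2): primary convolution producing m intrinsic feature maps
    Y_prime = np.tensordot(X, W_primary, axes=([2], [0]))  # (h, w, m)
    outputs = []
    for i in range(Y_prime.shape[2]):
        y_i = Y_prime[:, :, i]
        outputs.append(y_i)  # identity mapping keeps the intrinsic map itself
        # Eq. (3): s - 1 cheap linear operations Phi_{i,j} per intrinsic map
        for _ in range(s - 1):
            a, b = rng.standard_normal(2)
            outputs.append(a * y_i + b)  # illustrative affine "ghost"
    return np.stack(outputs, axis=2)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8, 4))    # h = w = 8, c = 4
W = rng.standard_normal((4, 3))       # m = 3 intrinsic maps
Y = ghost_module(X, W, s=2, rng=rng)  # n = m * s = 6 output maps
```

With $s = 2$, each intrinsic map yields itself plus one cheap ghost, doubling the channel count at negligible extra cost.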