Reading Note: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

TITLE: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

AUTHOR: Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam

ASSOCIATION: Google

FROM: arXiv:1704.04861

CONTRIBUTIONS

  1. A class of efficient models called MobileNets is proposed for mobile and embedded vision applications, based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks.
  2. Two simple global hyper-parameters that efficiently trade off between latency and accuracy are introduced.

MobileNet Architecture

The core building block of MobileNet is the depthwise separable filter, used to form the Depthwise Separable Convolution layer. The overall network structure is another factor in its performance. Finally, a width multiplier and a resolution multiplier can be tuned to trade off latency against accuracy.

Depthwise Separable Convolution

Depthwise separable convolutions are a form of factorized convolution that splits a standard convolution into a depthwise convolution and a 1×1 convolution called a pointwise convolution. In MobileNet, the depthwise convolution applies a single filter to each input channel; the pointwise convolution then applies a 1×1 convolution to combine the outputs of the depthwise convolution. The following figure illustrates the difference between a standard convolution and a depthwise separable convolution.

Figure: difference between standard convolution and depthwise separable convolution
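
As a concrete sketch, one depthwise separable block can be written in Keras as below. The input shape and filter count are illustrative assumptions, and the BatchNorm/ReLU placement after each of the two convolutions follows the paper's description of the block.

```python
import tensorflow as tf

# A minimal sketch of one depthwise separable block (sizes are assumed):
# a 3x3 depthwise conv (one filter per input channel), then a 1x1 pointwise
# conv that combines channels, each followed by BatchNorm and ReLU.
inputs = tf.keras.Input(shape=(112, 112, 32))

x = tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding="same")(inputs)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.ReLU()(x)

x = tf.keras.layers.Conv2D(filters=64, kernel_size=1)(x)  # pointwise 1x1
x = tf.keras.layers.BatchNormalization()(x)
outputs = tf.keras.layers.ReLU()(x)

block = tf.keras.Model(inputs, outputs)
block.summary()  # far fewer parameters than a full 3x3x32x64 convolution
```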

The standard convolution has a computational cost of

$D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F$

while the depthwise separable convolution costs

$D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$

where $D_K$ is the kernel size, $M$ and $N$ are the numbers of input and output channels, and $D_F$ is the spatial width/height of the (square) feature map.
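
Dividing the second cost by the first gives a reduction of $\frac{1}{N} + \frac{1}{D_K^2}$, so with 3×3 kernels the depthwise separable form uses roughly 8 to 9 times less computation. A small numeric sketch, using illustrative (assumed) layer sizes:

```python
# Worked example of the two cost formulas above; these layer sizes are an
# assumption for demonstration, not values fixed by the note.
D_K, M, N, D_F = 3, 512, 512, 14  # 3x3 kernel, 512->512 channels, 14x14 map

standard  = D_K * D_K * M * N * D_F * D_F
separable = D_K * D_K * M * D_F * D_F + M * N * D_F * D_F

print(f"standard:  {standard:,}")   # 462,422,016 multiply-adds
print(f"separable: {separable:,}")  # 52,283,392 multiply-adds
print(f"ratio:     {separable / standard:.3f}")  # 0.113 = 1/N + 1/D_K**2
```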

MobileNet Structure

The following table shows the structure of MobileNet.

Table: MobileNet structure

Width and Resolution Multiplier

The width multiplier thins the network uniformly, scaling the number of channels in every layer. The resolution multiplier reduces the input image resolution of the network (and with it every internal feature map).
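
Both knobs appear in common library implementations. As a sketch, Keras's MobileNet application exposes the width multiplier as `alpha` and the resolution implicitly through `input_shape`; the particular values below are assumptions for illustration.

```python
import tensorflow as tf

# Width multiplier alpha = 0.5 halves the channel count of every layer;
# a 128x128 input corresponds to a resolution multiplier of 128/224.
model = tf.keras.applications.MobileNet(
    input_shape=(128, 128, 3),  # reduced input resolution
    alpha=0.5,                  # width multiplier
    weights=None,               # pretrained weights exist only for certain
)                               # (alpha, resolution) pairs, so train fresh
model.summary()
```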

Comparison


Reproducing MobileNet in Code

MobileNets are a family of lightweight deep learning models developed by Google for mobile and embedded vision applications, designed to deliver good accuracy while consuming few computational resources. They are built on depthwise separable convolutions, which split a standard convolution into two steps: a depthwise convolution (convolving each input channel separately) followed by a pointwise 1×1 convolution (combining the channel outputs). This structure simplifies the network and lowers model complexity.

To reproduce a MobileNet model, Python with either TensorFlow or PyTorch works well. A simplified outline:

1. **Install dependencies**: for TensorFlow, install `tensorflow`; for PyTorch, install `torch` and `torchvision`.

```shell
pip install tensorflow
# or
pip install torch torchvision
```

2. **Import libraries**:

```python
# TensorFlow example
import tensorflow as tf
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input
# or, PyTorch example
import torch
from torchvision.models import mobilenet_v2
```

3. **Load a pretrained model**:

```python
# TensorFlow example
model = MobileNetV2(weights='imagenet')

# PyTorch example
model = mobilenet_v2(pretrained=True)
```

4. **Run a forward pass**:

```python
# TensorFlow example: preprocess an image batch and classify it
images = ...  # image array of shape (batch, 224, 224, 3)
predictions = model(preprocess_input(images))  # ImageNet class scores

# PyTorch example: channels-first layout, inference mode
inputs = ...  # normalized tensor of shape (batch, 3, 224, 224)
model.eval()
with torch.no_grad():
    predictions = model(inputs)
```

5. **Customize or fine-tune the model**: to adapt it to a new task, freeze the pretrained backbone, replace the classification head, and train only the new layers (selected backbone layers can be unfrozen later).

```python
# TensorFlow example: freeze the backbone and attach a new head
base = MobileNetV2(weights='imagenet', include_top=False, pooling='avg')
base.trainable = False  # freeze all pretrained layers

# num_classes is a placeholder for the size of the new task's label set
outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(base.output)
new_model = tf.keras.Model(base.input, outputs)

new_model.compile(optimizer='adam', loss='categorical_crossentropy')
new_model.fit(...)  # supply the new task's training data here
```