Implementing MobileNet v2 with Python 3 & Keras

Latest recommended article published on 2023-10-01 00:21:13

weixin_39595537

Tags: python keras

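MobileNet is a classification network for mobile devices proposed by Google. V1 uses depthwise separable convolution and introduces two hyperparameters to control network capacity. The assumption behind this convolution is that cross-channel correlations and spatial correlations can be decoupled. Depthwise separable convolution reduces the number of parameters, so the network reaches fairly high accuracy while keeping model complexity acceptable for mobile devices. V2 introduces a new unit, the inverted residual with linear bottleneck. The main changes are a linear output (no non-linear activation) for the bottleneck and moving the residual network's skip connection to the low-dimensional bottleneck layers.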
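To make the parameter saving concrete, here is a minimal sketch that counts the weights of a standard convolution and of a depthwise separable one. The function names and the example sizes (3 x 3 kernel, 32 input channels, 64 output channels) are illustrative assumptions, not values from the paper, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    k, c_in, c_out = 3, 32, 64  # illustrative sizes (assumption)
    std = standard_conv_params(k, c_in, c_out)
    sep = separable_conv_params(k, c_in, c_out)
    print("standard:", std, "separable:", sep, "ratio:", round(std / sep, 2))
```

With a 3 x 3 kernel the ratio approaches 9 as the number of output channels grows, because the pointwise term dominates the separable count.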

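Network Structure

The overall structure of MobileNetV2 is shown in the figure below. Each row describes a sequence of one or more identical (apart from stride) layers, and each bottleneck is repeated n times. All layers in the same sequence have the same number of output channels. The first layer of each sequence uses stride s; all other layers use stride 1. All spatial convolutions use 3 * 3 kernels. The expansion factor t is always applied to the input size: if the tensor entering a layer has k channels, the layer applies k * t filters.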

net.jpg
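The rules above can be traced by hand. The sketch below walks through the (t, c, n, s) settings used by the implementation in this article and reports, for every bottleneck layer, the expanded channel count k * t, the output channels, and the spatial size. The helper name trace_stages is hypothetical, and the size calculation assumes 'same' padding, so each stride-s layer produces ceil(size / s).

```python
import math

# (t, c, n, s) for each sequence, as used by the implementation in this article
STAGES = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def trace_stages(input_size=224, stem_channels=32, stem_stride=2):
    """Return (expanded_channels, out_channels, out_size) per bottleneck layer."""
    size = math.ceil(input_size / stem_stride)  # after the first 3 x 3 convolution
    channels = stem_channels
    rows = []
    for t, c, n, s in STAGES:
        for i in range(n):
            stride = s if i == 0 else 1   # only the first layer of a sequence uses s
            expanded = channels * t       # t is applied to the input channel count
            size = math.ceil(size / stride)
            rows.append((expanded, c, size))
            channels = c
    return rows


if __name__ == "__main__":
    for expanded, out_channels, size in trace_stages():
        print(expanded, out_channels, size)
```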

The structure of the bottleneck is shown below. Whether the skip connection is used depends on the stride.

stru.jpg
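As a small sketch of that decision: an element-wise addition needs the input and output tensors to have the same shape, so a skip connection is only possible when the stride is 1 and the channel counts match. The helper skip_flags below is a hypothetical name; it lists, for one sequence of n bottlenecks, which layers can carry a skip connection.

```python
def skip_flags(in_channels, out_channels, n, s):
    """For a sequence of n bottlenecks, list which layers can use a skip connection."""
    flags = []
    channels = in_channels
    for i in range(n):
        stride = s if i == 0 else 1  # only the first layer uses stride s
        # The add is shape-compatible only with stride 1 and matching channels.
        flags.append(stride == 1 and channels == out_channels)
        channels = out_channels
    return flags


if __name__ == "__main__":
    print(skip_flags(64, 96, n=3, s=1))
    print(skip_flags(24, 32, n=3, s=2))
```

In both calls the first layer changes the channel count, so only the repeated layers qualify, which matches how the implementation in this article passes the residual flag.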

Environment

OpenCV 3.4

Python 3.5

Tensorflow-gpu 1.2.0

Keras 2.1.3

Implementation

Based on the parameters given in the paper, I implemented the network structure with Keras 2, as shown below:

from keras.models import Model
from keras.layers import Input, Conv2D, GlobalAveragePooling2D, Dropout
from keras.layers import Activation, BatchNormalization, add, Reshape
# relu6 and DepthwiseConv2D are imported from this module in the Keras 2.1.x
# release listed under Environment; later releases may place them elsewhere.
from keras.applications.mobilenet import relu6, DepthwiseConv2D
from keras.utils.vis_utils import plot_model
from keras import backend as K

def _conv_block(inputs, filters, kernel, strides):
    """Convolution Block
    This function defines a 2D convolution operation with BN and relu6.

    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the width and height.
            Can be a single integer to specify the same value for
            all spatial dimensions.

    # Returns
        Output tensor.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1

    x = Conv2D(filters, kernel, padding='same', strides=strides)(inputs)
    x = BatchNormalization(axis=channel_axis)(x)
    return Activation(relu6)(x)

def _bottleneck(inputs, filters, kernel, t, s, r=False):
    """Bottleneck
    This function defines a basic bottleneck structure.

    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        t: Integer, expansion factor.
            t is always applied to the input size.
        s: Integer, stride of the depthwise convolution, applied to
            both spatial dimensions.
        r: Boolean, whether to use the residual connection.

    # Returns
        Output tensor.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    # Expanded channel count: input channels * t.
    tchannel = K.int_shape(inputs)[channel_axis] * t

    # 1x1 expansion, 3x3 depthwise, then 1x1 linear projection (no activation).
    x = _conv_block(inputs, tchannel, (1, 1), (1, 1))

    x = DepthwiseConv2D(kernel, strides=(s, s), depth_multiplier=1, padding='same')(x)
    x = BatchNormalization(axis=channel_axis)(x)
    x = Activation(relu6)(x)

    x = Conv2D(filters, (1, 1), strides=(1, 1), padding='same')(x)
    x = BatchNormalization(axis=channel_axis)(x)

    if r:
        x = add([x, inputs])
    return x

def _inverted_residual_block(inputs, filters, kernel, t, strides, n):
    """Inverted Residual Block
    This function defines a sequence of 1 or more identical layers.

    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        t: Integer, expansion factor.
            t is always applied to the input size.
        strides: Integer, stride of the first layer in the sequence.
            All following layers use stride 1.
        n: Integer, layer repeat times.

    # Returns
        Output tensor.
    """
    # The first layer may change resolution and channels, so it has no residual.
    x = _bottleneck(inputs, filters, kernel, t, strides)

    for i in range(1, n):
        x = _bottleneck(x, filters, kernel, t, 1, True)

    return x

def MobileNetv2(input_shape, k):
    """MobileNetv2
    This function defines a MobileNetv2 architecture.

    # Arguments
        input_shape: A tuple/list of 3 integers, shape
            of input tensor.
        k: Integer, number of output classes.

    # Returns
        MobileNetv2 model.
    """
    inputs = Input(shape=input_shape)
    x = _conv_block(inputs, 32, (3, 3), strides=(2, 2))

    x = _inverted_residual_block(x, 16, (3, 3), t=1, strides=1, n=1)
    x = _inverted_residual_block(x, 24, (3, 3), t=6, strides=2, n=2)
    x = _inverted_residual_block(x, 32, (3, 3), t=6, strides=2, n=3)
    x = _inverted_residual_block(x, 64, (3, 3), t=6, strides=2, n=4)
    x = _inverted_residual_block(x, 96, (3, 3), t=6, strides=1, n=3)
    x = _inverted_residual_block(x, 160, (3, 3), t=6, strides=2, n=3)
    x = _inverted_residual_block(x, 320, (3, 3), t=6, strides=1, n=1)

    x = _conv_block(x, 1280, (1, 1), strides=(1, 1))
    x = GlobalAveragePooling2D()(x)
    x = Reshape((1, 1, 1280))(x)
    x = Dropout(0.3, name='Dropout')(x)
    x = Conv2D(k, (1, 1), padding='same')(x)

    x = Activation('softmax', name='softmax')(x)
    output = Reshape((k,))(x)

    model = Model(inputs, output)
    # plot_model needs pydot and graphviz, and the images/ directory must exist.
    plot_model(model, to_file='images/MobileNetv2.png', show_shapes=True)

    return model

if __name__ == '__main__':
    MobileNetv2((224, 224, 3), 1000)

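Training

The input size recommended in the paper is 224 * 224, so the training set should preferably use the same size. The file data\convert.py provides an example of upscaling the cifar-100 data to 224.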
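The following is not the content of data\convert.py, which is not reproduced in this article. It is only a dependency-free sketch of the idea: cifar-100 images are 32 * 32, and 32 * 7 = 224, so the simplest upscaling repeats every pixel 7 times in each direction (nearest-neighbour interpolation). The function name upscale_nearest is hypothetical, and a real conversion script would more likely call an image library such as OpenCV.

```python
def upscale_nearest(image, factor):
    """Upscale a 2-D grid (a list of rows) by an integer factor, nearest-neighbour."""
    out = []
    for row in image:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


if __name__ == "__main__":
    small = [[0] * 32 for _ in range(32)]  # stand-in for one 32 x 32 channel
    big = upscale_nearest(small, 7)
    print(len(big), len(big[0]))
```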

The training dataset should be organized in the following format:

| - data/
    | - train/
        | - class 0/
            | - image.jpg
            ....
        | - class 1/
        ....
        | - class n/
    | - validation/
        | - class 0/
        | - class 1/
        ....
        | - class n/
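Before training it can help to verify that the folders really follow this layout. The sketch below is an assumption-laden helper, not part of the original project: the names summarize_split and check_dataset are hypothetical, and it only checks that train and validation contain the same class folders and counts the files in each.

```python
from pathlib import Path


def summarize_split(split_dir):
    """Map each class folder name to the number of files it contains."""
    return {
        d.name: sum(1 for f in d.iterdir() if f.is_file())
        for d in sorted(Path(split_dir).iterdir())
        if d.is_dir()
    }


def check_dataset(root):
    """Return per-class file counts for train and validation, or raise on mismatch."""
    train = summarize_split(Path(root) / "train")
    validation = summarize_split(Path(root) / "validation")
    if set(train) != set(validation):
        raise ValueError("train and validation must contain the same class folders")
    return train, validation


if __name__ == "__main__":
    train, validation = check_dataset("data")
    print(len(train), "classes,", sum(train.values()), "training files")
```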

Run the following command to train the model:

python train.py --classes num_classes --batch batch_size --epochs epochs --size image_size

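The trained .h5 weight file is saved in the model folder. To fine-tune an existing model, use the command below. Note that only the number of output classes in the last layer can be changed; the structure of all other layers must stay the same.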

python train.py --classes num_classes --batch batch_size --epochs epochs --size image_size --weights weights_path --tclasses pre_classes

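Parameters

--classes, number of classes in the current training set.

--size, image size.

--batch, batch size.

--epochs, number of epochs.

--weights, the model to fine-tune.

--tclasses, number of output classes in the pretrained model.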
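The train.py script itself is not shown in this article. As a rough sketch of how these flags could be parsed, the example below uses argparse from the standard library. The flag names come from the list above; the function name build_parser, the types, the defaults, and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags listed above (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Train MobileNet v2 (sketch).")
    parser.add_argument("--classes", type=int, required=True,
                        help="number of classes in the current training set")
    parser.add_argument("--size", type=int, default=224, help="image size")
    parser.add_argument("--batch", type=int, default=32, help="batch size")
    parser.add_argument("--epochs", type=int, default=10, help="number of epochs")
    parser.add_argument("--weights", default=None,
                        help="path to the model to fine-tune")
    parser.add_argument("--tclasses", type=int, default=0,
                        help="number of output classes in the pretrained model")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    mode = "fine-tuning" if args.weights else "training from scratch"
    print(mode, "with", args.classes, "classes")
```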

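Experiment

Due to limited resources, we ran the experiment on the cifar-100 dataset for a limited number of epochs.

device: Tesla K80

dataset: cifar-100

optimizer: Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)

batch_size: 128

The experiment details are as follows. Although the network has not fully converged, it still achieves good accuracy.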

Metrics      Loss     Top-1 Accuracy    Top-5 Accuracy
cifar-100    0.195    94.42%            99.82%

eva.png
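For readers unfamiliar with the two accuracy metrics: a prediction counts as a top-k hit when the true label is among the k highest-scoring classes. The sketch below computes this on a tiny made-up example; the function name top_k_accuracy and the scores are illustrative and unrelated to the reported results.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by score, highest first, truncated to k entries.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]  # made-up outputs
    labels = [1, 1, 0]
    print(top_k_accuracy(scores, labels, 1), top_k_accuracy(scores, labels, 2))
```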
