Implementing MobileNet v2 with Python 3 & Keras

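MobileNet is a classification network for mobile devices proposed by Google. V1 applied depthwise separable convolution and introduced two hyperparameters to control network capacity. The assumption behind this kind of convolution is that cross-channel correlations and spatial correlations can be decoupled. Depthwise separable convolution reduces the parameter count, so the network reaches fairly high accuracy while keeping model complexity acceptable for mobile devices. V2 introduces a new unit, the inverted residual with linear bottleneck. The main changes are giving the bottleneck a linear (activation-free) output and moving the residual skip-connection onto the low-dimensional bottleneck layers.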
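To make the parameter saving concrete, here is a minimal sketch that counts the weights of a standard convolution against a depthwise separable one. The function names and the 32-to-64-channel example are illustrative assumptions, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (bias ignored)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
print(standard, separable)  # 18432 2336
```

For this layer the separable version needs roughly one eighth of the weights, which is where the savings on mobile hardware come from.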

Network structure

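The overall structure of MobileNetV2 is shown in the figure below. Each row describes a sequence of one or more identical (modulo stride) layers, and each bottleneck is repeated n times. All layers in the same sequence have the same number of output channels. The first layer of each sequence uses stride s; all other layers use stride 1. All spatial convolutions use 3 * 3 kernels. The expansion factor t is always applied to the input size: if the tensor entering a layer has k channels, the number of filters applied in that layer is k * t.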
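The rules above can be sketched as a small table expansion. The (t, c, n, s) rows are copied from the arguments of the _inverted_residual_block calls in the implementation further down; the helper name expand_stages and the tuple layout are assumptions made for illustration.

```python
# One row per sequence: (expansion factor t, output channels c, repeats n, stride s).
STAGES = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def expand_stages(stages, in_channels=32):
    """Return (in_channels, expanded_channels, out_channels, stride) per bottleneck layer."""
    layers = []
    for t, c, n, s in stages:
        for i in range(n):
            stride = s if i == 0 else 1  # only the first layer of a sequence uses stride s
            layers.append((in_channels, in_channels * t, c, stride))
            in_channels = c              # the next layer sees this layer's output channels
    return layers


layers = expand_stages(STAGES)
for layer in layers[:4]:
    print(layer)
```

The second entry, for example, takes 16 input channels, expands them to 16 * 6 = 96, and projects back down to 24 output channels with stride 2.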

net.jpg

The structure of the bottleneck is shown below. Whether the skip-connection is used depends on the stride.
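As a rough sketch of that decision: the shortcut is only possible when the block keeps the spatial size (stride 1), and the element-wise add also needs the input and output channel counts to match. The function name use_residual is an assumption for illustration, not part of the implementation in this post, which passes an explicit flag instead.

```python
def use_residual(stride, in_channels, out_channels):
    """True when a skip-connection can be added around a bottleneck.

    The add requires identical tensor shapes, so the block must not
    downsample (stride 1) and must not change the channel count.
    """
    return stride == 1 and in_channels == out_channels


print(use_residual(1, 24, 24))  # True: same resolution, same channels
print(use_residual(2, 24, 24))  # False: the block downsamples
print(use_residual(1, 16, 24))  # False: channel counts differ
```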

stru.jpg

Environment

OpenCV 3.4

Python 3.5

Tensorflow-gpu 1.2.0

Keras 2.1.3

Implementation

Based on the parameters given in the paper, I implemented the network structure with Keras 2, as shown below:

from keras.models import Model
from keras.layers import Input, Conv2D, GlobalAveragePooling2D, Dropout
from keras.layers import Activation, BatchNormalization, add, Reshape
from keras.applications.mobilenet import relu6, DepthwiseConv2D
from keras.utils.vis_utils import plot_model
from keras import backend as K

def _conv_block(inputs, filters, kernel, strides):
    """Convolution Block

    This function defines a 2D convolution operation with BN and relu6.

    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the width and height.
            Can be a single integer to specify the same value for
            all spatial dimensions.

    # Returns
        Output tensor.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1

    x = Conv2D(filters, kernel, padding='same', strides=strides)(inputs)
    x = BatchNormalization(axis=channel_axis)(x)
    return Activation(relu6)(x)

def _bottleneck(inputs, filters, kernel, t, s, r=False):
    """Bottleneck

    This function defines a basic bottleneck structure.

    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        t: Integer, expansion factor.
            t is always applied to the input size.
        s: An integer, specifying the stride of the depthwise convolution
            along both the width and height.
        r: Boolean, whether to use the residual connection.

    # Returns
        Output tensor.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    # Expanded channel count: input channels * expansion factor.
    tchannel = K.int_shape(inputs)[channel_axis] * t

    # 1x1 expansion convolution with BN and relu6.
    x = _conv_block(inputs, tchannel, (1, 1), (1, 1))

    # Depthwise spatial convolution with BN and relu6.
    x = DepthwiseConv2D(kernel, strides=(s, s), depth_multiplier=1, padding='same')(x)
    x = BatchNormalization(axis=channel_axis)(x)
    x = Activation(relu6)(x)

    # 1x1 projection with no activation afterwards: the linear bottleneck.
    x = Conv2D(filters, (1, 1), strides=(1, 1), padding='same')(x)
    x = BatchNormalization(axis=channel_axis)(x)

    if r:
        x = add([x, inputs])
    return x

def _inverted_residual_block(inputs, filters, kernel, t, strides, n):
    """Inverted Residual Block

    This function defines a sequence of 1 or more identical layers.

    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        t: Integer, expansion factor.
            t is always applied to the input size.
        strides: An integer, specifying the stride of the first layer
            in the sequence along both the width and height.
        n: Integer, layer repeat times.

    # Returns
        Output tensor.
    """
    # The first layer may downsample and change the channel count, so no residual.
    x = _bottleneck(inputs, filters, kernel, t, strides)

    # The remaining n - 1 layers use stride 1 and a residual connection.
    for i in range(1, n):
        x = _bottleneck(x, filters, kernel, t, 1, True)

    return x

def MobileNetv2(input_shape, k):
    """MobileNetv2

    This function defines a MobileNetv2 architecture.

    # Arguments
        input_shape: A tuple/list of 3 integers, shape
            of input tensor.
        k: Integer, number of output classes.

    # Returns
        MobileNetv2 model.
    """
    inputs = Input(shape=input_shape)
    x = _conv_block(inputs, 32, (3, 3), strides=(2, 2))

    x = _inverted_residual_block(x, 16, (3, 3), t=1, strides=1, n=1)
    x = _inverted_residual_block(x, 24, (3, 3), t=6, strides=2, n=2)
    x = _inverted_residual_block(x, 32, (3, 3), t=6, strides=2, n=3)
    x = _inverted_residual_block(x, 64, (3, 3), t=6, strides=2, n=4)
    x = _inverted_residual_block(x, 96, (3, 3), t=6, strides=1, n=3)
    x = _inverted_residual_block(x, 160, (3, 3), t=6, strides=2, n=3)
    x = _inverted_residual_block(x, 320, (3, 3), t=6, strides=1, n=1)

    x = _conv_block(x, 1280, (1, 1), strides=(1, 1))
    x = GlobalAveragePooling2D()(x)
    x = Reshape((1, 1, 1280))(x)
    x = Dropout(0.3, name='Dropout')(x)
    # 1x1 convolution acting as the classifier over k classes.
    x = Conv2D(k, (1, 1), padding='same')(x)

    x = Activation('softmax', name='softmax')(x)
    output = Reshape((k,))(x)

    model = Model(inputs, output)
    # Needs pydot and graphviz installed, and the images/ directory must exist.
    plot_model(model, to_file='images/MobileNetv2.png', show_shapes=True)

    return model

if __name__ == '__main__':
    MobileNetv2((224, 224, 3), 1000)
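To sanity-check the layer stack without building the Keras model, the following sketch traces the feature-map size through the network for a 224 * 224 input. It relies on the fact that a convolution with padding='same' produces an output of size ceil(input / stride). The step list mirrors the calls in MobileNetv2 above; the names STEPS and trace_shapes are assumptions for illustration.

```python
import math

# (description, output channels, stride of the first layer in the step)
STEPS = [
    ("conv 3x3", 32, 2),
    ("bottleneck x1", 16, 1),
    ("bottleneck x2", 24, 2),
    ("bottleneck x3", 32, 2),
    ("bottleneck x4", 64, 2),
    ("bottleneck x3", 96, 1),
    ("bottleneck x3", 160, 2),
    ("bottleneck x1", 320, 1),
    ("conv 1x1", 1280, 1),
]


def trace_shapes(input_size=224):
    """Return (description, height, width, channels) after each step."""
    size = input_size
    shapes = []
    for name, channels, stride in STEPS:
        size = math.ceil(size / stride)  # 'same' padding: output = ceil(input / stride)
        shapes.append((name, size, size, channels))
    return shapes


shapes = trace_shapes(224)
for shape in shapes:
    print(shape)
```

The last entry is a 7 * 7 * 1280 feature map, which global average pooling reduces to the 1280-dimensional vector fed to the classifier.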

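Training

The input size recommended in the paper is 224 * 224, so the training set should preferably use the same size. The file data\convert.py provides an example of upscaling the cifar-100 data to 224.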
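The contents of data\convert.py are not reproduced in this post, so the following is only a stand-in sketch, not the author's script. It assumes NumPy is available and uses the fact that cifar images are 32 * 32 and 224 = 32 * 7, so repeating every pixel 7 times along each axis gives a nearest-neighbour upscale. The function name upscale_nearest and the random test image are assumptions.

```python
import numpy as np


def upscale_nearest(image, factor=7):
    """Nearest-neighbour upscaling of an H x W x C image by an integer factor."""
    rows = np.repeat(image, factor, axis=0)  # repeat every row `factor` times
    return np.repeat(rows, factor, axis=1)   # then repeat every column


# A random stand-in for one 32 x 32 RGB cifar image.
small = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
large = upscale_nearest(small)
print(large.shape)  # (224, 224, 3)
```

An interpolating resize (for example with OpenCV, which is listed in the environment) would give smoother images; the repeat-based version is shown only because it has no extra dependencies.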

The training dataset should be organized in the following format:

| - data/
    | - train/
        | - class 0/
            | - image.jpg
            ....
        | - class 1/
            ....
        | - class n/
    | - validation/
        | - class 0/
        | - class 1/
            ....
        | - class n/
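Before starting a long training run it can help to verify that the folders really follow this layout. The sketch below uses only the standard library; the function names are hypothetical and are not part of the repository described in this post.

```python
from pathlib import Path


def count_images_per_class(split_dir):
    """Map each class sub-folder of split_dir to the number of files it contains."""
    split_dir = Path(split_dir)
    return {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in sorted(split_dir.iterdir())
        if class_dir.is_dir()
    }


def check_layout(data_dir):
    """Return per-class file counts for train/ and validation/.

    Raises ValueError when the two splits do not contain the same class folders.
    """
    data_dir = Path(data_dir)
    train = count_images_per_class(data_dir / "train")
    validation = count_images_per_class(data_dir / "validation")
    if set(train) != set(validation):
        raise ValueError("train/ and validation/ contain different class folders")
    return train, validation
```

Calling check_layout("data") returns two dictionaries keyed by class folder name, and the number of keys is the value to pass as the class count when training.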

Run the following command to train the model:

python train.py --classes num_classes --batch batch_size --epochs epochs --size image_size

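The trained .h5 weight file is saved in the model folder. To fine-tune an existing model, use the command below. Note that only the number of output classes of the last layer can be changed; the structure of all other layers must stay the same.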

python train.py --classes num_classes --batch batch_size --epochs epochs --size image_size --weights weights_path --tclasses pre_classes

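Parameters

--classes, number of classes in the current training set.
--size, image size.
--batch, batch size.
--epochs, number of epochs.
--weights, path to the model weights to fine-tune.
--tclasses, number of output classes in the pretrained model.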
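The source of train.py is not shown in this post, so the following is only a hypothetical sketch of how such a command line could be parsed with argparse. The flag names come from the list above; the types and default values are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags listed above (defaults are assumed, not the author's)."""
    parser = argparse.ArgumentParser(description="Train MobileNetv2 (illustrative CLI sketch)")
    parser.add_argument("--classes", type=int, required=True,
                        help="number of classes in the current training set")
    parser.add_argument("--size", type=int, default=224, help="image size")
    parser.add_argument("--batch", type=int, default=32, help="batch size")
    parser.add_argument("--epochs", type=int, default=10, help="number of epochs")
    parser.add_argument("--weights", default=None,
                        help="path to the model weights to fine-tune")
    parser.add_argument("--tclasses", type=int, default=0,
                        help="number of output classes in the pretrained model")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(
    ["--classes", "100", "--size", "224", "--batch", "128", "--epochs", "50"]
)
print(args.classes, args.size, args.batch, args.epochs, args.weights)
```

Leaving --weights unset corresponds to training from scratch, while passing both --weights and --tclasses corresponds to the fine-tuning command shown earlier.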

Experiments

Due to resource constraints, we ran the experiments on the cifar-100 dataset for a limited number of epochs.

device: Tesla K80

dataset: cifar-100

optimizer: Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)

batch_size: 128

The experiment details are as follows. Although the network has not fully converged, it still reaches good accuracy.

Metrics      Loss     Top-1 Accuracy   Top-5 Accuracy
cifar-100    0.195    94.42%           99.82%
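For readers unfamiliar with the two accuracy columns, here is a minimal pure-Python sketch of how top-k accuracy is defined: a prediction counts as correct when the true label is among the k highest-scoring classes. The function name and the toy scores are assumptions; they are not the evaluation code or data behind the numbers above.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with 3 samples and 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]

top1 = top_k_accuracy(scores, labels, 1)  # only the first sample is right
top2 = top_k_accuracy(scores, labels, 2)  # the first two samples are right
print(top1, top2)
```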

eva.png
