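Task Description
Task for this level: read the background material below and understand the network architecture of YOLO V1.
Background
Network Design
The network architecture is inspired by GoogLeNet. It has 24 convolutional layers and 2 fully connected layers. (In place of GoogLeNet's inception modules, it uses 1×1 reduction layers followed by 3×3 convolutional layers.)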
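To make the benefit of a 1×1 reduction layer concrete, the sketch below compares parameter counts for a direct 3×3 convolution and for a 1×1 reduction followed by a 3×3 convolution. The channel sizes (512 → 256 → 512) are the ones that appear in this exercise's expected output; the helper name `conv_params` is made up for illustration.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of one 2-D convolution layer."""
    return kernel * kernel * in_ch * out_ch + out_ch

# A 3x3 convolution applied directly to a 512-channel feature map.
direct = conv_params(3, 512, 512)

# A 1x1 reduction to 256 channels, followed by the 3x3 convolution.
reduced = conv_params(1, 512, 256) + conv_params(3, 256, 512)

print(direct)   # 2359808
print(reduced)  # 1311488
```

Under these assumptions the reduced pair needs a little over half the parameters of the direct 3×3 layer.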
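Network Structure
The YOLO detection network contains 24 convolutional layers (which extract features) and 2 fully connected layers (which predict bounding-box positions and class probabilities). It makes heavy use of 1×1 convolutions to reduce the feature space coming from the preceding layers. The paper also gives the Fast YOLO architecture, which has 9 convolutional layers and 2 fully connected layers. On a Titan X GPU, Fast YOLO reaches a detection speed of 155 fps. Its mAP drops from YOLO's 63.4% to 52.7%, but that is still far higher than the mAP of earlier real-time object detection methods (DPM).
(At first I could not work out why there are 24 convolutional layers. Look closely at the figure and you will notice some blocks marked "×4" and "×2"; once those repetitions are counted, the total really is 24.)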
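The count can be checked with a few lines of arithmetic. The sketch below follows the grouping in the paper's architecture figure, which also matches the reference code in this exercise: each entry is (convolutions per repetition, number of repetitions). The variable names are illustrative only.

```python
# Each stage is a list of (convolutions per repetition, repetitions).
stages = [
    [(1, 1)],          # stage 1: a single convolution
    [(1, 1)],          # stage 2: a single convolution
    [(4, 1)],          # stage 3: four convolutions
    [(2, 4), (2, 1)],  # stage 4: a pair repeated x4, then two more
    [(2, 2), (2, 1)],  # stage 5: a pair repeated x2, then two more
    [(2, 1)],          # stage 6: two convolutions
]

total = sum(n * rep for stage in stages for n, rep in stage)
print(total)  # 24
```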
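The final 7×7×30 tensor is the network output: 49 grid cells in total, each holding 30 values. Twenty of them are class probabilities, that is, the probability that the object detected in that cell belongs to each class. The remaining 10 values split into two groups of 5, one for each of the cell's two bounding boxes (x, y, w, h and a confidence score per box). Take a single cell as an example: it holds 30 elements, as the accompanying figure shows.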
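As a rough illustration of how one cell's 30 values break down, here is a minimal sketch. The ordering (20 class probabilities first, then two boxes of 5 values each) is an assumption made for this example; the text above does not fix an order, and implementations differ. `split_cell` is a hypothetical helper.

```python
NUM_CLASSES = 20
BOXES_PER_CELL = 2
VALUES_PER_BOX = 5  # x, y, w, h, confidence

def split_cell(cell):
    """Split one cell's 30 values into class probabilities and boxes.

    ASSUMED layout: class probabilities first, then the boxes back to back.
    """
    assert len(cell) == NUM_CLASSES + BOXES_PER_CELL * VALUES_PER_BOX
    class_probs = cell[:NUM_CLASSES]
    rest = cell[NUM_CLASSES:]
    boxes = [rest[i * VALUES_PER_BOX:(i + 1) * VALUES_PER_BOX]
             for i in range(BOXES_PER_CELL)]
    return class_probs, boxes

# Dummy cell whose values are just their own indices, to make the split visible.
cell = list(range(30))
class_probs, boxes = split_cell(cell)
print(len(class_probs))  # 20
print(boxes)             # [[20, 21, 22, 23, 24], [25, 26, 27, 28, 29]]
```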
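Network parameter diagram:
Implementing the network structure with Keras:
import tensorflow as tf

model = tf.keras.Sequential([
    # Convolutional layer
    # With the default 'valid' padding, floor((227 - 6) / 4) + 1 = 56, so the output shape is (56, 56, 64)
    tf.keras.layers.Conv2D(64, (6, 6), strides=4, input_shape=(227, 227, 3), activation=tf.keras.layers.LeakyReLU(0.1)),
    # Flatten: the convolutional output must be flattened before a fully connected layer can accept it.
    # Output shape: (200704,), since 56 * 56 * 64 = 200704
    tf.keras.layers.Flatten(),
    # Fully connected layer
    # Output shape: (4096,)
    tf.keras.layers.Dense(4096, activation=tf.keras.layers.LeakyReLU(0.1)),
    # Dropout to guard against overfitting
    tf.keras.layers.Dropout(0.6),
    # Fully connected layer
    # Output shape: (30,)
    tf.keras.layers.Dense(30, activation=tf.keras.layers.LeakyReLU(0.1))
])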
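The output shapes of convolutional layers follow from a short formula. The sketch below is a plain-Python version of the rule Keras applies for `padding='valid'` and `padding='same'`; the function name `conv_output_size` is made up for this example.

```python
import math

def conv_output_size(size, kernel, stride, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        # 'same' padding: the size depends only on the stride.
        return math.ceil(size / stride)
    # 'valid' padding: no padding, the kernel must fit entirely inside the input.
    return (size - kernel) // stride + 1

# 227-pixel input, 6x6 kernel, stride 4, no padding.
print(conv_output_size(227, 6, 4))          # 56
# 448-pixel input, 3x3 kernel, stride 2, 'same' padding.
print(conv_output_size(448, 3, 2, "same"))  # 224
```

For the 227-pixel case, flattening a 56×56×64 feature map gives 56 · 56 · 64 = 200704 values.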
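Programming Requirements
Following the material on the left, complete the YOLO V1 network code in the editor on the right.
Test Description
The platform will test the code you write:
Test input: omitted
Expected output (the first line is the literal string printed by the program; it means "Network structure:"):
网络结构: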
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 224, 224, 64)      1792
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 112, 112, 64)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 112, 112, 192)     110784
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 56, 56, 192)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 56, 56, 128)       24704
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 56, 56, 256)       295168
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 56, 56, 256)       65792
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 56, 56, 512)       1180160
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 28, 28, 512)       0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 28, 28, 256)       131328
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 28, 28, 512)       1180160
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 28, 28, 256)       131328
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 28, 28, 512)       1180160
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 28, 28, 256)       131328
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 28, 28, 512)       1180160
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 28, 28, 256)       131328
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 28, 28, 512)       1180160
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 28, 28, 512)       262656
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 28, 28, 1024)      525312
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 14, 14, 1024)      0
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 14, 14, 512)       524800
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 14, 14, 1024)      4719616
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 14, 14, 512)       524800
_________________________________________________________________
conv2d_20 (Conv2D)           (None, 14, 14, 1024)      4719616
_________________________________________________________________
conv2d_21 (Conv2D)           (None, 14, 14, 1024)      9438208
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 7, 7, 1024)        9438208
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 7, 7, 1024)        9438208
_________________________________________________________________
conv2d_24 (Conv2D)           (None, 7, 7, 1024)        9438208
_________________________________________________________________
flatten_1 (Flatten)          (None, 50176)             0
_________________________________________________________________
dense_1 (Dense)              (None, 4096)              205524992
_________________________________________________________________
dropout_1 (Dropout)          (None, 4096)              0
_________________________________________________________________
dense_2 (Dense)              (None, 1470)              6022590
=================================================================
Total params: 267,501,566
Trainable params: 267,501,566
Non-trainable params: 0
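The parameter counts in the summary can be reproduced by hand. The sketch below walks through the same layer sequence in plain Python (no TensorFlow needed), assuming `'same'` padding for every convolution and 2×2 pooling with stride 2, as in the reference code. The tuple format `("conv", kernel, filters, stride)` is just a convention chosen for this example.

```python
# ("conv", kernel, filters, stride) or ("pool",) for each layer, in order.
layers = (
    [("conv", 3, 64, 2), ("pool",), ("conv", 3, 192, 1), ("pool",),
     ("conv", 1, 128, 1), ("conv", 3, 256, 1),
     ("conv", 1, 256, 1), ("conv", 3, 512, 1), ("pool",)]
    + [("conv", 1, 256, 1), ("conv", 3, 512, 1)] * 4
    + [("conv", 1, 512, 1), ("conv", 1, 1024, 1), ("pool",)]
    + [("conv", 1, 512, 1), ("conv", 3, 1024, 1)] * 2
    + [("conv", 3, 1024, 1), ("conv", 3, 1024, 2),
       ("conv", 3, 1024, 1), ("conv", 3, 1024, 1)]
)

size, channels, total = 448, 3, 0
for layer in layers:
    if layer[0] == "pool":
        size //= 2  # 2x2 max pooling with stride 2 halves the spatial size
    else:
        _, kernel, filters, stride = layer
        total += kernel * kernel * channels * filters + filters  # weights + biases
        channels = filters
        size = -(-size // stride)  # ceiling division, as 'same' padding gives

flat = size * size * channels
total += flat * 4096 + 4096   # first fully connected layer
total += 4096 * 1470 + 1470   # output layer, 7 * 7 * 30 = 1470 units

print(size, channels, flat)  # 7 1024 50176
print(total)                 # 267501566
```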
Start your task now. Good luck!
Reference code:
import tensorflow as tf


def createYOLO_v1_Model(tiny=True):
    # Only the tiny=True branch is provided in this exercise; fail clearly instead of
    # hitting an undefined variable when tiny is False.
    if not tiny:
        raise NotImplementedError('Only the tiny=True branch is provided.')
    # Sequential model: single input, single output, one straight path; layers only connect to their neighbours, with no skip connections.
    models = tf.keras.Sequential([
        # Convolutional layer parameters: 64 filters, 3*3 kernel, stride 2, 'same' padding, 448x448x3 input image, LeakyReLU activation with slope 0.1
        tf.keras.layers.Conv2D(64, (3, 3), strides=2, padding='same', input_shape=(448, 448, 3), activation=tf.keras.layers.LeakyReLU(0.1)),
        # Pooling layer parameters: 2*2 pool size, stride 2
        tf.keras.layers.MaxPooling2D((2, 2), strides=2),
        tf.keras.layers.Conv2D(192, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.MaxPooling2D((2, 2), strides=2),
        tf.keras.layers.Conv2D(128, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(256, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(256, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(512, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        # Add code here: following the diagram on the left, fill in the pooling layer
        ########## Begin ##########
        tf.keras.layers.MaxPooling2D((2, 2), strides=2),
        ########## End ##########
        tf.keras.layers.Conv2D(256, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(512, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(256, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(512, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(256, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(512, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(256, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(512, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(512, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(1024, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.MaxPooling2D((2, 2), strides=2),
        tf.keras.layers.Conv2D(512, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(1024, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(512, (1, 1), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(1024, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(1024, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        # Add code here: following the diagram on the left, fill in the convolutional layers
        ########## Begin ##########
        tf.keras.layers.Conv2D(1024, (3, 3), strides=2, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(1024, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        tf.keras.layers.Conv2D(1024, (3, 3), strides=1, padding='same', activation=tf.keras.layers.LeakyReLU(0.1)),
        ########## End ##########
        # Add code here: following the hints on the left, fill in the Flatten layer
        ########## Begin ##########
        tf.keras.layers.Flatten(),
        ########## End ##########
        # Fully connected layer
        tf.keras.layers.Dense(4096, activation=tf.keras.layers.LeakyReLU(0.1)),
        # Add code here: following the hints on the left, fill in the Dropout layer with rate 0.5, which helps mitigate overfitting
        ########## Begin ##########
        tf.keras.layers.Dropout(0.5),
        ########## End ##########
        # Outputs 7*7*30 = 1470 values, i.e. one 30-value vector per grid cell
        tf.keras.layers.Dense(7*7*30)])
    return models


model = createYOLO_v1_Model()
# Literal header line ("Network structure:") that the expected output starts with
print('网络结构:')
model.summary()
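The last Dense layer emits a flat vector of 7 · 7 · 30 = 1470 values rather than a 7×7×30 tensor. One way to address an individual grid cell is sketched below in plain Python. It assumes row-major ordering, which is what a plain reshape to (7, 7, 30) would give; the helper `cell_values` is hypothetical and not part of the exercise.

```python
GRID = 7    # cells per side
DEPTH = 30  # values per cell

def cell_values(flat, row, col):
    """Return the 30 values of grid cell (row, col), ASSUMING row-major order."""
    start = (row * GRID + col) * DEPTH
    return flat[start:start + DEPTH]

# Dummy prediction whose values are their own indices, to make the mapping visible.
flat = list(range(GRID * GRID * DEPTH))

first = cell_values(flat, 0, 0)
last = cell_values(flat, 6, 6)
print(first[0], first[-1])  # 0 29
print(last[0], last[-1])    # 1440 1469
```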