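1. Introduction to YOLOv1
- It is a one-stage network that is trained end to end.
- Purpose: object detection.
- Key feature: it is faster than the R-CNN family of detectors.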
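2. Network structure
YOLOv1 treats object detection as a regression problem. Its pipeline is as follows:
- Divide the input image into an SxS grid of cells;
- For each grid cell, predict (B * 5 + C) values describing its bounding boxes and classes.
- Meaning of B * 5 + C:
B is the number of bounding boxes predicted per grid cell.
5 is the number of values per bounding box: the center coordinates (x, y), the width and height, and a confidence score (in the paper, Pr(Object) multiplied by the IOU between the predicted box and the ground truth).
C is the number of classes; each grid cell predicts one probability per class.
The paper sets S=7, B=2, C=20, so the final output is 7x7x(2x5 + 20) = 7 x 7 x 30.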
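To make the B * 5 + C layout concrete, here is a minimal NumPy sketch that splits a 7x7x30 tensor into box values and class probabilities. The channel ordering (boxes first, classes last) and the random stand-in tensor are assumptions made for illustration; the text above does not fix an ordering.

```python
import numpy as np

S, B, C = 7, 2, 20  # grid size, boxes per cell, number of classes (values from the paper)

# Stand-in for a network output: random values, used only to illustrate shapes.
rng = np.random.default_rng(0)
output = rng.random((S, S, B * 5 + C))

# ASSUMPTION: the first B*5 channels hold the boxes, the last C hold class probabilities.
boxes = output[..., :B * 5].reshape(S, S, B, 5)  # (x, y, w, h, confidence) per box
class_probs = output[..., B * 5:]                # one probability per class

print(output.shape)       # (7, 7, 30)
print(boxes.shape)        # (7, 7, 2, 5)
print(class_probs.shape)  # (7, 7, 20)
```

Each grid cell therefore carries 2 x 5 = 10 box values plus 20 class probabilities, which is where the 30 in the output shape comes from.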
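The YOLOv1 network structure (see the architecture figure) is:
- Input shape: 448x448;
- In between are 24 convolutional layers. Four max-pooling layers and two stride-2 convolutions each halve the feature map (448 / 2^6 = 7), giving a 7x7 feature map, which is followed by two fully connected layers and a reshape to the output shape;
- Output shape: 7x7x30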
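The reduction from 448 to 7 can be checked with a few lines of arithmetic: reaching 7 takes six halvings. The sketch below assumes the six downsampling steps listed in the paper's layer table (two stride-2 convolutions and four stride-2 poolings); the helper name `same_padding_output_size` is hypothetical.

```python
import math

def same_padding_output_size(size, stride):
    """Spatial size after a 'same'-padded layer with the given stride."""
    return math.ceil(size / stride)

# Strides of the downsampling layers only; stride-1 layers leave the size unchanged.
# ASSUMPTION: 2 stride-2 convolutions and 4 stride-2 poolings, as in the paper's layer table.
# For the even sizes that occur here, a 2x2 stride-2 pooling gives the same result.
downsampling_strides = [2, 2, 2, 2, 2, 2]

size = 448
sizes = [size]
for stride in downsampling_strides:
    size = same_padding_output_size(size, stride)
    sizes.append(size)

print(sizes)  # [448, 224, 112, 56, 28, 14, 7]
```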
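3. Building the model in Keras
Some notes:
- The "s-2" label in the figure denotes a convolutional layer with stride=2.
- With padding='same', a stride-1 convolution leaves the feature map size unchanged; the size is halved only by the pooling layers and the stride-2 convolutions.
- The last layer uses a linear activation (that is, no activation at all); every other layer is followed by a leaky ReLU with slope 0.1.
- The convolutional layers are followed by two fully connected layers: one outputs 4096 dimensions (as shown in the figure), and the other should output 7x7x30 = 1470 dimensions, which are then reshaped to the output shape.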
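As a quick illustration of the leaky ReLU with slope 0.1 mentioned in the notes, here is a minimal NumPy sketch. The function name `leaky_relu` is hypothetical and this is not the Keras layer itself, only the same element-wise rule.

```python
import numpy as np

def leaky_relu(x, slope=0.1):
    """Leaky ReLU: x where x > 0, slope * x elsewhere."""
    return np.where(x > 0, x, slope * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(leaky_relu(x).tolist())  # [-0.2, -0.05, 0.0, 1.5]
```

Negative inputs are scaled by 0.1 instead of being zeroed, so they still pass a small gradient.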
The model definition code is as follows:
from keras.layers import Conv2D, Dense, Dropout, Flatten, LeakyReLU, MaxPooling2D, Reshape
from keras.models import Sequential

model = Sequential()
# Block 1: 448x448 -> 224x224 (stride-2 conv) -> 112x112 (pooling)
model.add(Conv2D(64, (7, 7), strides=(2, 2), padding='same', input_shape=(448, 448, 3)))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(2, 2))
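# Block 2: 112x112 -> 56x56
model.add(Conv2D(192, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(2, 2))
# Note: dropout after the pooling layers is a choice made in this implementation;
# the paper applies dropout (rate 0.5) only after the first fully connected layer.
model.add(Dropout(0.5))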
# Block 3: 56x56 -> 28x28
model.add(Conv2D(128, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(256, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(256, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(2, 2))
model.add(Dropout(0.5))
# Block 4: 28x28 -> 14x14
model.add(Conv2D(256, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(256, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(256, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(256, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(512, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(512, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(1024, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(2, 2))
model.add(Dropout(0.5))
# Block 5: 14x14 -> 7x7
model.add(Conv2D(512, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(1024, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(512, (1, 1), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(1024, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(1024, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
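# The paper downsamples here with a stride-2 convolution instead of a pooling layer (14x14 -> 7x7).
model.add(Conv2D(1024, (3, 3), strides=(2, 2), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Dropout(0.5))
# Block 6: two more 3x3 convolutions at 7x7, as in the paper's layer table
model.add(Conv2D(1024, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(Conv2D(1024, (3, 3), padding='same'))
model.add(LeakyReLU(alpha=0.1))
# Fully connected layers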
model.add(Flatten())
model.add(Dense(4096))
model.add(LeakyReLU(alpha=0.1))
# The final layer uses a linear activation, so no LeakyReLU follows it.
model.add(Dense(7*7*30))
model.add(Reshape((7, 7, 30)))
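As a rough sanity check on the fully connected head, the parameter counts of the two Dense layers can be worked out by hand. This sketch assumes the flattened input is a 7x7x1024 feature map, as produced by the last convolutional block above; the helper `dense_params` is hypothetical.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

flattened = 7 * 7 * 1024              # size of the flattened 7x7x1024 feature map
fc1 = dense_params(flattened, 4096)   # Dense(4096)
fc2 = dense_params(4096, 7 * 7 * 30)  # Dense(7*7*30)

print(flattened)  # 50176
print(fc1)        # 205524992
print(fc2)        # 6022590
```

The first fully connected layer alone holds roughly 205 million parameters, far more than the second.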
The end.