Object Detection: Focus in the YOLOv5 Backbone
flyfish
Version: YOLOv5
YOLOv5:v5
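The first v5 means the fifth version of YOLO. YOLOv5 itself is also being iterated on, and the second v5 is the fifth release of YOLOv5,
which still differs from YOLOv5:v4.
This post uses YOLOv5s as the example. s is the smallest variant; yolov5m, yolov5l, and yolov5x are progressively larger versions of the model.
The Backbone starts with Focus, and Focus contains a standard convolution.
Standard convolution = Conv2d + BatchNorm2d + SiLU
Written out in detail: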
(conv): Conv2d(12, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(32, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU()
The standard convolution code is in common.py
class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Conv, self).__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU() if act is True else (act if isinstance(act, nn.Module) else nn.Identity())

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def fuseforward(self, x):
        return self.act(self.conv(x))
Looking at Focus through the printed model configuration
     from  n  params  module               arguments
  0    -1  1    3520  models.common.Focus  [3, 32, 3]
  1    -1  1   18560  models.common.Conv   [32, 64, 3, 2]
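The params column can be checked by hand. The sketch below assumes each layer is a bias-free Conv2d with groups=1 followed by a BatchNorm2d whose learnable weight and bias are counted (the running statistics are buffers, not parameters), and that SiLU adds nothing. The helper name conv_bn_params is made up for this illustration.

```python
def conv_bn_params(c_in, c_out, k):
    """Learnable parameters of a bias-free k x k Conv2d followed by BatchNorm2d."""
    conv_weights = c_in * c_out * k * k  # bias=False, so weights only
    bn_affine = 2 * c_out                # BatchNorm2d weight and bias
    return conv_weights + bn_affine

# Focus [3, 32, 3]: the inner Conv sees 3 * 4 = 12 input channels
focus_params = conv_bn_params(3 * 4, 32, 3)
# Conv [32, 64, 3, 2]: the stride of 2 does not affect the parameter count
conv_params = conv_bn_params(32, 64, 3)

print(focus_params)  # 3520
print(conv_params)   # 18560
```

Both values match the table, which is consistent with Focus feeding 12 channels, not 3, into its convolution.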
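Inside Focus, the slice-and-concatenate step turns an input tensor of shape (b,c,w,h) into a tensor of shape (b,4c,w/2,h/2).
Take YOLOv5s and a single image with the default input b,c,w,h of 1x3x640x640, written here as 3x640x640 for short. Focus takes four strided slices of the image, each keeping every second pixel from a different starting offset, which gives four 3x320x320 tensors (conceptually, the image is copied four times and each copy is sliced). concat then joins the four slices along the depth (channel) dimension, giving 12x320x320. A convolution layer with 32 kernels turns this into a 32x320x320 output, and after BatchNorm2d and SiLU the result is passed to the next convolution layer.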
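To see which pixels land in which slice, here is a minimal pure-Python sketch on a 4x4 single-channel grid. Nested lists stand in for the tensor, and the helper strided is a made-up name that mimics x[..., row_start::2, col_start::2]; it is not part of YOLOv5.

```python
# A 4x4 single-channel "image"; each value encodes its position as row * 4 + col
img = [[r * 4 + c for c in range(4)] for r in range(4)]

def strided(image, row_start, col_start):
    """Keep every second row and column, starting at the given offsets."""
    return [row[col_start::2] for row in image[row_start::2]]

# Same order as the four slices in Focus.forward
slices = [
    strided(img, 0, 0),  # x[..., ::2, ::2]
    strided(img, 1, 0),  # x[..., 1::2, ::2]
    strided(img, 0, 1),  # x[..., ::2, 1::2]
    strided(img, 1, 1),  # x[..., 1::2, 1::2]
]

for s in slices:
    print(s)
# [[0, 2], [8, 10]]
# [[4, 6], [12, 14]]
# [[1, 3], [9, 11]]
# [[5, 7], [13, 15]]
```

Every one of the 16 pixels appears in exactly one 2x2 slice, so the step halves the width and height and quadruples the channel count without discarding any values.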
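Focus is configured with 64 output channels by default, but yolov5s uses a scaling factor (width_multiple) of 0.5, so the output has 32 channels. The make_divisible() function ensures that the output channel count is a multiple of 8.
The make_divisible code is in general.py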
import math

def make_divisible(x, divisor):
    # Returns x evenly divisible by divisor
    return math.ceil(x / divisor) * divisor
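As a quick check of the channel arithmetic, the sketch below applies make_divisible to the configured 64 channels. Only the 0.5 factor comes from the text above; the other factors are illustrative values chosen for this example, and the variable names are made up.

```python
import math

def make_divisible(x, divisor):
    # Same logic as the function quoted above
    return math.ceil(x / divisor) * divisor

configured_channels = 64
for width_multiple in (0.5, 0.75, 1.0, 1.25):
    print(width_multiple, make_divisible(configured_channels * width_multiple, 8))
# 0.5 32
# 0.75 48
# 1.0 64
# 1.25 80

# A value that is not already a multiple of 8 is rounded up
print(make_divisible(30, 8))  # 32
```

Because math.ceil is used, the function only ever rounds up, so a scaled channel count is never reduced below the requested value.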
The Focus code is in common.py
class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Focus, self).__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act)
        # self.contract = Contract(gain=2)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
        # return self.conv(self.contract(x))
Comparing tensor concatenation along different dimensions
import torch

x = torch.arange(1, 9)
x = x.reshape(2, 4)
print(x)
# tensor([[1, 2, 3, 4],
#         [5, 6, 7, 8]])

# Concatenating tensors along different dimensions
print(torch.cat((x, x, x), 0))  # concatenate along dim 0
# tensor([[1, 2, 3, 4],
#         [5, 6, 7, 8],
#         [1, 2, 3, 4],
#         [5, 6, 7, 8],
#         [1, 2, 3, 4],
#         [5, 6, 7, 8]])
print(torch.cat((x, x, x), 1))  # concatenate along dim 1
# tensor([[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
#         [5, 6, 7, 8, 5, 6, 7, 8, 5, 6, 7, 8]])