Preface
YOLOX: https://github.com/Megvii-BaseDetection/YOLOX
For a detailed walkthrough of YOLOX, see: https://jishuin.proginn.com/p/763bfbd628ce
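YOLOX network structure
YOLOX Neck network structure
As the structure shows, this YOLOX Neck is the same as the YOLOv3 neck: both are FPN structures. This is the neck used with the Darknet53 backbone, and the corresponding source code is in yolo_fpn.py.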
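Neck components
The figure above shows four kinds of components:
CBL: 1x1 convolution + BN + LeakyReLU; H and W are unchanged;
Upsample: H and W are each doubled;
Concat: concatenation along the channel dimension;
CBL*5: conv1x1, conv3x3, conv1x1, conv3x3, conv1x1, each followed by BN + LeakyReLU.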
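To see why the CBL layers leave H and W unchanged, here is a minimal sketch of the usual convolution output-size formula in plain Python. It assumes the padding is set to (ksize - 1) // 2, which is how BaseConv appears to pad its convolution; the helper name conv_out_size is made up for this illustration.

```python
def conv_out_size(size: int, ksize: int, stride: int = 1) -> int:
    """Output height (or width) of a convolution.

    Assumption: padding = (ksize - 1) // 2, i.e. "same"-style padding
    for odd kernel sizes.
    """
    pad = (ksize - 1) // 2
    return (size + 2 * pad - ksize) // stride + 1


# With stride 1, both the 1x1 and the 3x3 kernels keep the spatial size.
for ksize in (1, 3):
    print(ksize, conv_out_size(13, ksize))
```

With stride=1 the formula reduces to the input size for any odd kernel, so only the channel count changes inside the neck.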
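Neck component source code
CBL
# In __init__: 1x1 CBL that reduces 512 channels to 256.
self.out1_cbl = self._make_cbl(512, 256, 1)

def _make_cbl(self, _in, _out, ks):
    # Conv (kernel size ks, stride 1) + BN + LeakyReLU
    return BaseConv(_in, _out, ks, stride=1, act="lrelu")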
Upsample
self.upsample = nn.Upsample(scale_factor=2, mode="nearest")
Concat
x1_in = torch.cat([x1_in, x1], 1)
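The following is a small plain-Python sketch of what the upsample and concat steps do to a feature map. It is not the PyTorch implementation: feature maps are nested lists of shape (C, H, W) with the batch dimension left out, and the helper names are made up for this illustration.

```python
def upsample_nearest_2x(fmap):
    """Nearest-neighbour 2x upsampling: every value becomes a 2x2 block."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat along W
            rows.append(wide)
            rows.append(list(wide))                    # repeat along H
        out.append(rows)
    return out


def concat_channels(a, b):
    """Concatenate two (C, H, W) maps along C; H and W must match."""
    assert len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0])
    return a + b


def shape(fmap):
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


small = [[[1, 2], [3, 4]]]                               # 1 channel, 2x2
up = upsample_nearest_2x(small)                          # 1 channel, 4x4
skip = [[[0] * 4 for _ in range(4)] for _ in range(2)]   # 2 channels, 4x4
merged = concat_channels(up, skip)
print(shape(merged))
```

In the real code the tensors carry a leading batch dimension, which is why torch.cat is called with dimension 1 to select the channel axis.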
CBL*5
# In __init__: input has 512 + 256 channels after the concat.
self.out1 = self._make_embedding([256, 512], 512 + 256)

def _make_embedding(self, filters_list, in_filters):
    # filters_list: hidden-layer channel counts; in_filters: input channel count
    m = nn.Sequential(
        *[
            self._make_cbl(in_filters, filters_list[0], 1),
            self._make_cbl(filters_list[0], filters_list[1], 3),
            self._make_cbl(filters_list[1], filters_list[0], 1),
            self._make_cbl(filters_list[0], filters_list[1], 3),
            self._make_cbl(filters_list[1], filters_list[0], 1),
        ]
    )
    return m
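To make the channel flow through CBL*5 concrete, the sketch below lists the (in, out, kernel size) of each of the five layers for the call above and counts the convolution weights. It is plain Python and only mirrors the argument order of _make_embedding; embedding_plan and conv_weights are hypothetical helper names, and the count assumes bias-free convolutions and ignores the BN parameters.

```python
def embedding_plan(filters_list, in_filters):
    """(in_channels, out_channels, kernel_size) of the five CBL layers."""
    narrow, wide = filters_list
    return [
        (in_filters, narrow, 1),
        (narrow, wide, 3),
        (wide, narrow, 1),
        (narrow, wide, 3),
        (wide, narrow, 1),
    ]


def conv_weights(plan):
    """Number of convolution weights (assumes no bias, BN not counted)."""
    return sum(cin * cout * k * k for cin, cout, k in plan)


plan = embedding_plan([256, 512], 512 + 256)
for cin, cout, k in plan:
    print(f"{cin:4d} -> {cout:4d}  ({k}x{k})")
print("conv weights:", conv_weights(plan))
```

The block alternates between squeezing channels with 1x1 kernels and expanding them with 3x3 kernels, and it ends on filters_list[0], so out1 produces 256 channels.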
Neck FPN source code implementation
def forward(self, inputs):
    """
    Args:
        inputs (Tensor): input image.
    Returns:
        Tuple[Tensor]: FPN output features.
    """
    # backbone
    out_features = self.backbone(inputs)
    # in_features is ["dark3", "dark4", "dark5"]; the downsampling factor increases from x2 to x0.
    x2, x1, x0 = [out_features[f] for f in self.in_features]

    # yolo branch 1: middle-scale output
    x1_in = self.out1_cbl(x0)  # x0 is the smallest feature map (downsampled 5 times); apply a 1x1 conv first
    x1_in = self.upsample(x1_in)  # upsample so that it matches the middle scale
    x1_in = torch.cat([x1_in, x1], 1)  # concatenate with the middle-scale feature x1 along the channel dimension
    out_dark4 = self.out1(x1_in)  # five CBLs give the middle-scale output out_dark4

    # yolo branch 2: largest-scale output
    x2_in = self.out2_cbl(out_dark4)  # 1x1 conv
    x2_in = self.upsample(x2_in)  # upsample
    x2_in = torch.cat([x2_in, x2], 1)  # concatenate
    out_dark3 = self.out2(x2_in)  # five CBLs

    outputs = (out_dark3, out_dark4, x0)  # feature-map size decreases from left to right
    return outputs
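The forward pass above can be followed by tracking only the tensor shapes. The sketch below does this in plain Python with (channels, height, width) tuples. Several values are assumptions not shown in the excerpts above: a 416x416 input, 256 channels for dark3, and out2_cbl / out2 being built as _make_cbl(256, 128, 1) and _make_embedding([128, 256], 256 + 128). The function trace_yolofpn is hypothetical.

```python
def trace_yolofpn(input_size=416):
    """Trace (C, H, W) through the FPN neck; input_size should be a multiple of 32."""
    s = input_size
    # Backbone outputs; the channel counts here are assumptions.
    x2 = (256, s // 8, s // 8)     # dark3
    x1 = (512, s // 16, s // 16)   # dark4
    x0 = (512, s // 32, s // 32)   # dark5

    def cbl(feat, out_ch):
        # CBL and CBL*5 only change the channel count.
        return (out_ch, feat[1], feat[2])

    def upsample(feat):
        return (feat[0], feat[1] * 2, feat[2] * 2)

    def concat(a, b):
        assert a[1:] == b[1:], "spatial sizes must match before concat"
        return (a[0] + b[0], a[1], a[2])

    # yolo branch 1
    x1_in = concat(upsample(cbl(x0, 256)), x1)          # 256 + 512 channels
    out_dark4 = cbl(x1_in, 256)                         # CBL*5 ends at 256
    # yolo branch 2 (layer sizes assumed, see the text above)
    x2_in = concat(upsample(cbl(out_dark4, 128)), x2)   # 128 + 256 channels
    out_dark3 = cbl(x2_in, 128)                         # CBL*5 ends at 128
    return out_dark3, out_dark4, x0


for name, feat in zip(("out_dark3", "out_dark4", "x0"), trace_yolofpn()):
    print(name, feat)
```

Under these assumptions the three outputs have strides 8, 16 and 32, and x0 is passed through unchanged as the smallest-scale output.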