YOLOv5 network structure analysis

Second analysis (the code from the first analysis has been recovered; it is in the next section below)

Add the following code to yolo.py and run it. Note: before that, use sys.path.append() to add the paths of the utils and models folders.
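A minimal sketch of those path additions (the repository location below is a placeholder; adjust it to your own checkout):

import sys
sys.path.append('/path/to/yolov5/models')  # placeholder path to the models folder
sys.path.append('/path/to/yolov5/utils')   # placeholder path to the utils folder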

if __name__ == '__main__':
    cfg = 'yolov5s.yaml'
    ch = 3  # input channels
    import yaml
    with open(cfg, encoding='ascii', errors='ignore') as f:
        cfg = yaml.safe_load(f)  # model dict
    model, save = parse_model(cfg, [ch])  # parse_model is defined in yolo.py
    y = []  # saved outputs of layers that are needed again later
    x = torch.rand(4, 3, 640, 640)  # dummy batch: 4 images, 3 channels, 640x640
    for m in model:
        print(f'layer name:{m.type}, from :{m.f} index: {m.i}')
        if m.f != -1:  # input does not come only from the previous layer
            x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]
        if isinstance(x, torch.Tensor):
            print(f'shape of input: {x.shape}')
        else:
            for i in range(len(x)):
                print(f'shape of input: {x[i].shape}')
        x = m(x)  # run the layer
        if isinstance(x, torch.Tensor):
            print(f'shape of output:{x.shape}')
        else:
            for i in range(len(x)):
                print(f'shape of output:{x[i].shape}')
        y.append(x if m.i in save else None)  # only keep outputs whose index is in save

The output is:

layer name:models.common.Conv, from :-1 index: 0
shape of input: torch.Size([4, 3, 640, 640])
shape of output:torch.Size([4, 32, 320, 320])
Downsampling convolution layer; channel count increases
layer name:models.common.Conv, from :-1 index: 1
shape of input: torch.Size([4, 32, 320, 320])
shape of output:torch.Size([4, 64, 160, 160])
Downsampling convolution layer; channel count increases
layer name:models.common.C3, from :-1 index: 2
shape of input: torch.Size([4, 64, 160, 160])
shape of output:torch.Size([4, 64, 160, 160])
layer name:models.common.Conv, from :-1 index: 3
shape of input: torch.Size([4, 64, 160, 160])
shape of output:torch.Size([4, 128, 80, 80])
layer name:models.common.C3, from :-1 index: 4
shape of input: torch.Size([4, 128, 80, 80])
shape of output:torch.Size([4, 128, 80, 80])
layer name:models.common.Conv, from :-1 index: 5
shape of input: torch.Size([4, 128, 80, 80])
shape of output:torch.Size([4, 256, 40, 40])
layer name:models.common.C3, from :-1 index: 6
shape of input: torch.Size([4, 256, 40, 40])
shape of output:torch.Size([4, 256, 40, 40])
layer name:models.common.Conv, from :-1 index: 7
shape of input: torch.Size([4, 256, 40, 40])
shape of output:torch.Size([4, 512, 20, 20])
layer name:models.common.C3, from :-1 index: 8
shape of input: torch.Size([4, 512, 20, 20])
shape of output:torch.Size([4, 512, 20, 20])
layer name:models.common.SPPF, from :-1 index: 9
shape of input: torch.Size([4, 512, 20, 20])
shape of output:torch.Size([4, 512, 20, 20])
layer name:models.common.Conv, from :-1 index: 10
shape of input: torch.Size([4, 512, 20, 20])
shape of output:torch.Size([4, 256, 20, 20])
layer name:torch.nn.modules.upsampling.Upsample, from :-1 index: 11
shape of input: torch.Size([4, 256, 20, 20])
shape of output:torch.Size([4, 256, 40, 40])
layer name:models.common.Concat, from :[-1, 6] index: 12
shape of input: torch.Size([4, 256, 40, 40])
shape of input: torch.Size([4, 256, 40, 40])
shape of output:torch.Size([4, 512, 40, 40])
layer name:models.common.C3, from :-1 index: 13
shape of input: torch.Size([4, 512, 40, 40])
shape of output:torch.Size([4, 256, 40, 40])
layer name:models.common.Conv, from :-1 index: 14
shape of input: torch.Size([4, 256, 40, 40])
shape of output:torch.Size([4, 128, 40, 40])
layer name:torch.nn.modules.upsampling.Upsample, from :-1 index: 15
shape of input: torch.Size([4, 128, 40, 40])
shape of output:torch.Size([4, 128, 80, 80])
layer name:models.common.Concat, from :[-1, 4] index: 16
shape of input: torch.Size([4, 128, 80, 80])
shape of input: torch.Size([4, 128, 80, 80])
shape of output:torch.Size([4, 256, 80, 80])
layer name:models.common.C3, from :-1 index: 17
shape of input: torch.Size([4, 256, 80, 80])
shape of output:torch.Size([4, 128, 80, 80])
layer name:models.common.Conv, from :-1 index: 18
shape of input: torch.Size([4, 128, 80, 80])
shape of output:torch.Size([4, 128, 40, 40])
layer name:models.common.Concat, from :[-1, 14] index: 19
shape of input: torch.Size([4, 128, 40, 40])
shape of input: torch.Size([4, 128, 40, 40])
shape of output:torch.Size([4, 256, 40, 40])
layer name:models.common.C3, from :-1 index: 20
shape of input: torch.Size([4, 256, 40, 40])
shape of output:torch.Size([4, 256, 40, 40])
layer name:models.common.Conv, from :-1 index: 21
shape of input: torch.Size([4, 256, 40, 40])
shape of output:torch.Size([4, 256, 20, 20])
layer name:models.common.Concat, from :[-1, 10] index: 22
shape of input: torch.Size([4, 256, 20, 20])
shape of input: torch.Size([4, 256, 20, 20])
shape of output:torch.Size([4, 512, 20, 20])
layer name:models.common.C3, from :-1 index: 23
shape of input: torch.Size([4, 512, 20, 20])
shape of output:torch.Size([4, 512, 20, 20])
layer name:Detect, from :[17, 20, 23] index: 24
shape of input: torch.Size([4, 128, 80, 80])
shape of input: torch.Size([4, 256, 40, 40])
shape of input: torch.Size([4, 512, 20, 20])
shape of output:torch.Size([4, 3, 80, 80, 85])
shape of output:torch.Size([4, 3, 40, 40, 85])
shape of output:torch.Size([4, 3, 20, 20, 85])
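From the log above, the three feature maps fed into Detect (layers 17, 20, and 23) are 80x80, 40x40, and 20x20 for a 640x640 input, so the three detection scales have strides 8, 16, and 32. A quick check using only the numbers printed above:

# detection strides derived from the printed shapes (640x640 input)
input_size = 640
feature_sizes = [80, 40, 20]                        # the three maps going into Detect
strides = [input_size // s for s in feature_sizes]
print(strides)                                      # [8, 16, 32]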

First analysis

yolov5s

A layer downsamples only when its last argument (the stride) is 2; stride, kernel size, and padding are written out in full in the list below.

The first argument is in_channels, the second is out_channels.
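As a quick illustration of why stride 2 halves the feature map: with "same" padding p = k // 2, a 3x3 stride-2 convolution maps a spatial size H to (H + 2*1 - 3) // 2 + 1 = H // 2 for even H. A minimal check with plain PyTorch (the channel numbers match the second Conv layer in the log above):

import torch
import torch.nn as nn

conv = nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1)  # padding = k // 2 = 1
x = torch.rand(4, 32, 320, 320)
print(conv(x).shape)  # torch.Size([4, 64, 160, 160])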

Focus(3,32,3) # downsample

Conv(32,64,3,2) # downsample

BottleneckCSP(64,64,1) # no downsampling

Conv(64,128,3,2) # downsample

BottleneckCSP(128,128,3) # no downsampling

Conv(128,256,3,2) # downsample

BottleneckCSP(256,256,3) # no downsampling

Conv(256,512,3,2) # downsample

SPP(512,512,[5,9,13]) # no downsampling

BottleneckCSP(512,512,1,False) # no downsampling; this is CSP2_X, where shortcut=False leaves only the two Conv layers (no residual add), see the Bottleneck sketch after this list

Conv(512,256,1,1) # no downsampling

Upsample(None,2,'nearest') # upsamples 2x; no learnable parameters

Concat(1) # spatial size unchanged, channels increase


BottleneckCSP(512,256,1,False)

Conv(256,128,1,1)

Upsample(None,2,'nearest') # upsamples 2x; no learnable parameters

Concat(1) # spatial size unchanged, channels increase

BottleneckCSP(256,128,1,False)


Conv(128,128,3,2) # downsample; the first branch heading back down

Concat(1)

BottleneckCSP(256,256,1,False)


Conv(256,256,3,2) # the second branch heading back down

Concat(1) # channels increase

BottleneckCSP(512,512,1,False)

Detect
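On the shortcut=False note in the list above: the bottleneck inside BottleneckCSP looks roughly like the sketch below (written from memory of models/common.py, so verify against your yolov5 version; it reuses the Conv class listed at the end of this post). With shortcut=False the residual add is skipped, leaving only cv1 and cv2.

import torch.nn as nn

class Bottleneck(nn.Module):
    # Standard bottleneck used inside BottleneckCSP (sketch; check models/common.py in your version)
    def __init__(self, c1, c2, shortcut=True, g=1, e=0.5):  # ch_in, ch_out, shortcut, groups, expansion
        super(Bottleneck, self).__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_, c2, 3, 1, g=g)
        self.add = shortcut and c1 == c2  # residual add only when shortcut=True and shapes match

    def forward(self, x):
        # shortcut=False (as in CSP2_X) -> plain cv2(cv1(x)), no residual connection
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))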

from=[17, 20, 23]: the final input to Detect is a list of the outputs of these three layers.
Detect's output shape is (batch_size, number of anchors, height, width, number of outputs).

For example, (4, 3, 16, 16, 8):

batch size 4, 3 anchors per scale, a 16x16 feature map, and 3 classes, so the number of outputs per anchor is 3 + 5 = 8.
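That shape comes from reshaping each scale's 1x1 detection convolution, whose channel count is anchors * outputs. A minimal sketch of the reshape (the 128 input channels and 16x16 map are illustrative, not from a specific config):

import torch
import torch.nn as nn

bs, na, nc = 4, 3, 3              # batch size, anchors per scale, classes (matching the example above)
no = nc + 5                       # outputs per anchor: 4 box coords + 1 objectness + nc class scores
ny = nx = 16                      # feature-map height and width at this scale (illustrative)

head = nn.Conv2d(128, na * no, 1)             # 1x1 detection conv; 128 input channels is illustrative
feat = torch.rand(bs, 128, ny, nx)
out = head(feat)                              # (bs, na*no, ny, nx)
out = out.view(bs, na, no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
print(out.shape)                              # torch.Size([4, 3, 16, 16, 8])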


import torch
import torch.nn as nn


def autopad(k, p=None):  # kernel, padding
    # Pad to 'same'
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p


class Conv(nn.Module):
    # Standard convolution
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Conv, self).__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.LeakyReLU(0.1, inplace=True) if act else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def fuseforward(self, x):
        return self.act(self.conv(x))


class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Focus, self).__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
f = Focus(3, 32, 3)
x = torch.randn(4, 3, 16, 16)
out = f(x)
print(out.shape)  # torch.Size([4, 32, 8, 8])
# x[..., ::2, ::2].shape
# ::2 takes every other element starting from index 0; the other three slices work the same way.
# For example, 1::2 in the second slice takes every other element along that dimension starting from 1.
# See any slicing diagram for an illustration.
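A tiny illustration of what the four slices do: each picks every other pixel along H and W (at the four possible offsets), so concatenating them turns (b, c, h, w) into (b, 4c, h/2, w/2):

import torch

x = torch.arange(16.).reshape(1, 1, 4, 4)      # toy single-channel 4x4 "image"
patches = [x[..., ::2, ::2],    # even rows, even cols
           x[..., 1::2, ::2],   # odd rows,  even cols
           x[..., ::2, 1::2],   # even rows, odd cols
           x[..., 1::2, 1::2]]  # odd rows,  odd cols
y = torch.cat(patches, 1)
print(y.shape)                  # torch.Size([1, 4, 2, 2])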