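Briefly back from the missing-persons list: I remembered that I still have a WeChat official account, so on a whim I'm posting an update on what I've been doing lately. I (Xiao Chen) want to use YOLOv5, that damned treasure of a project, as the basis for a short paper. Since the goal is just to get a paper out, I can only make simple changes to it. So I decided to start by modifying the backbone.
The problem: YOLOv5 defines its models in yaml configuration files, unlike the same author's YOLOv3 code, which uses cfg configuration files. To change the network's backbone while touching as little code as possible, we have to write our own yaml model definition based on our understanding of the format.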
The yaml content is as follows:
# parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],  # 2
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],  # 4
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],  # 6
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],  # 8
   [-1, 3, BottleneckCSP, [1024, False]],  # 9
  ]

# YOLOv5 head PANet
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, BottleneckCSP, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],  # PAN's Downsample
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
The YOLOv5 network structure is shown below (image taken from the internet; I will remove it on request if it infringes):
The yaml is turned into a network by the parse_model method (shown here with my ResNet additions already in place):

def parse_model(d, ch):  # model_dict, input_channels(3)
    logger.info('\n%3s%18s%3s%10s %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))
    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors
    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
        m = eval(m) if isinstance(m, str) else m  # eval strings
        for j, a in enumerate(args):
            try:
                args[j] = eval(a) if isinstance(a, str) else a  # eval strings
            except:
                pass

        n = max(round(n * gd), 1) if n > 1 else n  # depth gain
        if m in [Conv, ResNet_Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]:
            c1, c2 = ch[f], args[0]

            # Normal
            # if i > 0 and args[0] != no:  # channel expansion factor
            #     ex = 1.75  # exponential (default 2.0)
            #     e = math.log(c2 / ch[1]) / math.log(2)
            #     c2 = int(ch[1] * ex ** e)
            # if m != Focus:

            c2 = make_divisible(c2 * gw, 8) if c2 != no else c2

            # Experimental
            # if i > 0 and args[0] != no:  # channel expansion factor
            #     ex = 1 + gw  # exponential (default 2.0)
            #     ch1 = 32  # ch[1]
            #     e = math.log(c2 / ch1) / math.log(2)  # level 1-n
            #     c2 = int(ch1 * ex ** e)
            # if m != Focus:
            #     c2 = make_divisible(c2, 8) if c2 != no else c2

            args = [c1, c2, *args[1:]]
            if m in [BottleneckCSP, C3]:
                args.insert(2, n)
                n = 1
            if m in [IndentityBlock]:
                print("args")
                print(args)
        elif m in [ConvBlock, IndentityBlock]:
            c1, c2 = ch[f], args[2]
            c2 = make_divisible(c2 * gw, 8) if c2 != no else c2
            args = [c1, c2, *args[1:]]
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum([ch[-1 if x == -1 else x + 1] for x in f])
        elif m is Detect:
            args.append([ch[x + 1] for x in f])
            if isinstance(args[1], int):  # number of anchors
                args[1] = [list(range(args[1] * 2))] * len(f)
        else:
            c2 = ch[f]

        m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module
        t = str(m)[8:-2].replace('__main__.', '')  # module type
        np = sum([x.numel() for x in m_.parameters()])  # number params
        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params
        logger.info('%3s%18s%3s%10.0f %-40s%-30s' % (i, f, n, np, t, args))  # print
        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        ch.append(c2)
    return nn.Sequential(*layers), sorted(save)
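Once you understand the code above, modifying the model shouldn't be much of a problem. The parse_model method reads the contents of the yaml file and parses them. I'll only outline the approach here: read it alongside the definitions of the Focus, Conv, BottleneckCSP, SPP, Concat and Detect layers in common.py in the source code. Different layers need different initialization arguments, and those arguments are all already defined in yolov5s.yaml; they just have to be interpreted and converted into the right form before the YOLOv5 model can be rebuilt from them. It's hard for me to put this into words...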
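Since this step is hard to put into words, here is a minimal sketch of what parse_model does to a single yaml row. It is a simplification, not the real function: modules are plain strings instead of classes, the `c2 != no` guard is left out, `make_divisible` is re-implemented locally on the assumption that it rounds up to a multiple of the divisor, and `rewrite_row`, `STANDARD` and `RESNET` are names made up for this illustration.

```python
import math


def make_divisible(x, divisor):
    # Assumed behaviour of the helper used by parse_model:
    # round x up to the nearest multiple of divisor.
    return math.ceil(x / divisor) * divisor


STANDARD = {"Conv", "Focus", "SPP", "BottleneckCSP"}  # subset of the first branch
RESNET = {"ConvBlock", "IndentityBlock"}              # the branch added in this post


def rewrite_row(module, number, args, c1, gd, gw):
    """Return (repeats, constructor_args, out_channels) for one yaml row.

    c1 is the output channel count of the layer named by 'from';
    gd and gw are depth_multiple and width_multiple.
    """
    n = max(round(number * gd), 1) if number > 1 else number  # depth gain
    if module in STANDARD:
        c2 = make_divisible(args[0] * gw, 8)   # first yaml value is the output width
        new_args = [c1, c2, *args[1:]]
        if module == "BottleneckCSP":
            new_args.insert(2, n)              # repeats are handled inside the block
            n = 1
    elif module in RESNET:
        c2 = make_divisible(args[2] * gw, 8)   # third yaml value is the output width
        new_args = [c1, c2, *args[1:]]
    else:
        raise ValueError(f"module not covered by this sketch: {module}")
    return n, new_args, c2


# Two rows with the yolov5s multipliers, one ResNet row with multipliers of 1.0.
print(rewrite_row("Conv", 1, [128, 3, 2], 32, 0.33, 0.50))
print(rewrite_row("BottleneckCSP", 9, [256], 128, 0.33, 0.50))
print(rewrite_row("ConvBlock", 1, [32, 32, 128, 1], 32, 1.0, 1.0))
```

The second call shows both multipliers at work: nine requested repeats shrink to three, 256 channels shrink to 128, and the repeat count moves into the argument list, so a single BottleneckCSP is built with [128, 128, 3].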
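This time I replaced YOLOv5's backbone, CSPDarknet53, with ResNet50. The neck still uses PANet for multi-scale feature extraction.
The ResNet50 network structure is shown below (image taken from the internet; I will remove it on request if it infringes):
Working from that diagram, I wrote out a yaml version of the ResNet50 structure.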
# parameters
nc: 24  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone_resnet50
backbone:
  # [from, number, module, args]
  [[-1, 1, ResNet_Conv, [32, 7, 2]],  # 0 kernel_size = 7, stride = 2, padding = 3 auto pad
   [-1, 1, ConvBlock, [32, 32, 128, 1]],  # 1
   [-1, 2, IndentityBlock, [128, 32, 128]],  # 2
   [-1, 1, ConvBlock, [64, 64, 256, 2]],  # 3
   [-1, 2, IndentityBlock, [64, 64, 256]],  # 4
   [-1, 1, ConvBlock, [128, 128, 512, 2]],  # 5
   [-1, 5, IndentityBlock, [128, 128, 512]],  # 6
   [-1, 1, ConvBlock, [256, 256, 1024, 2]],  # 7
   [-1, 2, IndentityBlock, [256, 256, 1024]],  # 8
   # change resnet 2048 channel output 1024 channel output
  ]

# YOLOv5 head PANet
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],
   [-1, 3, BottleneckCSP, [512, False]],

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],
   [-1, 3, BottleneckCSP, [256, False]],

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 13], 1, Concat, [1]],
   [-1, 3, BottleneckCSP, [512, False]],

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 9], 1, Concat, [1]],
   [-1, 3, BottleneckCSP, [1024, False]],

   [[16, 19, 22], 1, Detect, [nc, anchors]],
  ]
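This backbone has nine rows instead of ten, so every absolute index in the head is one lower than in the original yaml (13, 9 and [16, 19, 22] instead of 14, 10 and [17, 20, 23]). Mistakes here are easy to make, so a small sanity check can help. The sketch below is only an illustration: the rows are copied from the yaml above and reduced to (from, module), and `resolve_sources` is a hypothetical helper, not part of YOLOv5.

```python
# Rows copied from the ResNet50 yaml above, reduced to (from, module).
backbone = [
    (-1, "ResNet_Conv"), (-1, "ConvBlock"), (-1, "IndentityBlock"),
    (-1, "ConvBlock"), (-1, "IndentityBlock"), (-1, "ConvBlock"),
    (-1, "IndentityBlock"), (-1, "ConvBlock"), (-1, "IndentityBlock"),
]
head = [
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 6], "Concat"), (-1, "BottleneckCSP"),
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 4], "Concat"), (-1, "BottleneckCSP"),
    (-1, "Conv"), ([-1, 13], "Concat"), (-1, "BottleneckCSP"),
    (-1, "Conv"), ([-1, 9], "Concat"), (-1, "BottleneckCSP"),
    ([16, 19, 22], "Detect"),
]


def resolve_sources(layers):
    """Map each multi-input layer index to the (index, module) pairs it reads."""
    sources = {}
    for i, (f, module) in enumerate(layers):
        if isinstance(f, int):
            continue  # single-input layers simply follow their 'from' index
        absolute = [i + x if x < 0 else x for x in f]  # -1 means the previous layer
        for a in absolute:
            if not 0 <= a < i:
                raise ValueError(f"layer {i} ({module}) reads invalid index {a}")
        sources[i] = [(a, layers[a][1]) for a in absolute]
    return sources


layers = backbone + head
for i, srcs in resolve_sources(layers).items():
    print(i, layers[i][1], "<-", srcs)
```

If the listing shows Detect reading anything other than the three BottleneckCSP outputs, or a Concat pulling from an unexpected backbone stage, the head indices need to be shifted again.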
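Next, I'll try swapping in a variety of other backbones, otherwise I'd have nothing to write about, hahaha. ResNet50, ResNeXt50, CSPResNet50, CSPResNeXt50 and so on are all possible, and you could also add an attention module inside the residual blocks. If I get around to any of these, I'll post about it.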