PyTorch Object Detection Study: Day 7

兜兜转转m

Article link: https://blog.csdn.net/abc123mma/article/details/112183798

DetNet: Built for Detection

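The problem:

Image classification and object detection are inherently mismatched tasks. Classification focuses on extracting features from the whole image, so its deep feature maps have very low resolution. Detection must also localize objects, so the feature-map resolution should not be too small. Using a classification backbone for detection therefore leads to two drawbacks:

Large objects are hard to localize: in networks such as FPN, large objects are detected on the deeper feature maps. Because the downsampling rate is large at that depth, object edges are hard to predict precisely, which makes boundary regression harder.
Small objects are hard to detect: in traditional networks, the large downsampling rate makes small objects almost invisible on the deeper feature maps. FPN detects small objects on shallower feature maps, but shallow layers carry weaker semantic information, and the upsampling used when fusing deeper features makes detection harder still.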
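
To make the resolution gap concrete, the short sketch below estimates how many feature-map cells an object spans at a given downsampling stride. The function name `cells_covered` and the object sizes (16 px and 256 px) are illustrative assumptions, not values from the original post.

```python
def cells_covered(object_px: int, stride: int) -> float:
    """Approximate side length, in feature-map cells, of a square object."""
    return object_px / stride


# Compare a small and a large object across typical backbone strides.
for stride in (4, 8, 16, 32):
    small = cells_covered(16, stride)
    large = cells_covered(256, stride)
    print(f"stride {stride:>2}: 16 px object -> {small:.2f} cells, "
          f"256 px object -> {large:.2f} cells")
```

At stride 32 the 16 px object covers less than one cell, which matches the "almost invisible" description above. For the large object, each cell at stride 32 stands for 32 input pixels, so its edges can only be located coarsely before regression refines them.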

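DetNet network structure:

DetNet still builds on the well-performing ResNet-50 and keeps the first 4 stages identical to ResNet-50. The structural details come down to the following 3 points:

A new Stage 6 is introduced for object detection. Stage 5 and Stage 6 use the Bottleneck structure proposed by DetNet, whose main feature is that a 3×3 convolution with dilation 2 replaces the 3×3 convolution with stride 2.
Every Bottleneck in Stage 5 and Stage 6 outputs a feature map at 1/16 of the input image size with 256 channels, whereas a traditional backbone usually shrinks the feature map and increases the channel count from stage to stage.
When building the feature pyramid, the feature maps have exactly the same size, so they can be passed from right to left and added directly, avoiding the upsampling operation described in the previous section. To further fuse features across channels, each stage's output goes through a 1×1 convolution before it is added to the feature passed back from the following stage.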

Figure 1
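
The upsampling-free fusion described in the third point can be sketched as follows. This is a minimal illustration, not the DetNet authors' implementation: the class name `SameResolutionFusion`, the choice to fuse Stage 4, 5 and 6, and the channel counts (1024, 256, 256) are assumptions made for the example.

```python
import torch
from torch import nn


class SameResolutionFusion(nn.Module):
    """Hypothetical top-down fusion for stages that share one spatial size."""

    def __init__(self, in_channels=(1024, 256, 256), out_channels=256):
        super().__init__()
        # One 1x1 lateral convolution per stage output (assumed: Stage 4, 5, 6).
        self.laterals = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )

    def forward(self, features):
        # features are ordered shallow -> deep, e.g. [stage4, stage5, stage6].
        laterals = [conv(f) for conv, f in zip(self.laterals, features)]
        outputs = [laterals[-1]]
        # Walk from the deepest stage back to the shallowest and add directly.
        # No interpolation is needed because every map has the same H x W.
        for lateral in reversed(laterals[:-1]):
            outputs.insert(0, lateral + outputs[0])
        return outputs


# Three dummy stage outputs, all at the same 14 x 14 resolution.
feats = [torch.randn(1, c, 14, 14) for c in (1024, 256, 256)]
fused = SameResolutionFusion()(feats)
for f in fused:
    print(f.shape)
```

In a standard FPN the deeper map would first be upsampled by a factor of 2 before the addition; here the addition works on the tensors as they are.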

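The details of the DetNet Bottleneck are shown in Figure 2. The two blocks on the left, Bottleneck A and Bottleneck B, correspond to A and B in Figure 1; the block on the right is the original ResNet residual structure.

Both DetNet and ResNet add the output of a stack of convolutions to an identity mapping. The difference is that DetNet uses a 3×3 convolution with dilation 2, which keeps the feature-map size unchanged, whereas ResNet uses a 3×3 convolution with stride 2. Compared with A, B adds a 1×1 convolution on the identity branch. This distinguishes the stages from one another, and experiments found it to be very important for feature-pyramid-style detection.

Figure 2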
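
Why the dilated convolution preserves the feature-map size while the strided one halves it follows from the convolution output-size formula. The sketch below applies the formula given in the PyTorch `Conv2d` documentation; the helper names `conv_out_size` and `effective_kernel`, and the 14-pixel input size, are assumptions chosen for illustration.

```python
def conv_out_size(size, kernel=3, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2-D convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def effective_kernel(kernel=3, dilation=1):
    """Span of input positions covered by a dilated kernel."""
    return dilation * (kernel - 1) + 1


# DetNet-style: 3x3, dilation 2, padding 2, stride 1.
dilated = conv_out_size(14, kernel=3, stride=1, padding=2, dilation=2)
# ResNet-style downsampling: 3x3, stride 2, padding 1.
strided = conv_out_size(14, kernel=3, stride=2, padding=1, dilation=1)

print(dilated, strided)        # 14 7
print(effective_kernel(3, 2))  # 5
```

With dilation 2 the 3×3 kernel spans 5 input positions, so padding 2 on each side is exactly what keeps the output at the input size.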

Code:

import torch
from torch import nn


class DetBottleneck(nn.Module):
    # extra=False builds Bottleneck A; extra=True builds Bottleneck B.
    # stride is accepted but not used: every convolution here keeps stride 1.
    def __init__(self, inplanes, planes, stride=1, extra=False):
        super(DetBottleneck, self).__init__()
        # Bottleneck made of three consecutive convolutions
        self.bottleneck = nn.Sequential(
            nn.Conv2d(inplanes, planes, 1, bias=False),
            nn.BatchNorm2d(planes),
            nn.ReLU(inplace=True),
            # Dilated 3x3 convolution: padding=2 with dilation=2 keeps H x W unchanged
            nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=2, dilation=2, bias=False),
            nn.BatchNorm2d(planes),
            nn.ReLU(inplace=True),
            nn.Conv2d(planes, planes, 1, bias=False),
            nn.BatchNorm2d(planes),
        )
        self.relu = nn.ReLU(inplace=True)
        self.extra = extra
        # Bottleneck B: 1x1 convolution on the identity branch
        if self.extra:
            self.extra_conv = nn.Sequential(
                nn.Conv2d(inplanes, planes, 1, bias=False),
                nn.BatchNorm2d(planes),
            )

    def forward(self, x):
        if self.extra:
            identity = self.extra_conv(x)
        else:
            identity = x
        out = self.bottleneck(x)
        out += identity
        out = self.relu(out)
        return out


if __name__ == "__main__":
    # Build one Stage 5 in B-A-A order; the input has 1024 channels
    neck_b = DetBottleneck(1024, 256, 1, True)
    print(neck_b)
    neck_a = DetBottleneck(256, 256)
    print(neck_a)
    neck_a2 = DetBottleneck(256, 256)
    print(neck_a2)
    inputs = torch.randn(1, 1024, 14, 14)
    output1 = neck_b(inputs)
    output2 = neck_a(output1)
    output3 = neck_a2(output2)
    print(output1.size())
    print(output2.size())
    print(output3.size())

Output:

DetBottleneck(
  (bottleneck): Sequential(
    (0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
    (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (7): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (relu): ReLU(inplace=True)
  (extra_conv): Sequential(
    (0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)
DetBottleneck(
  (bottleneck): Sequential(
    (0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
    (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (7): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (relu): ReLU(inplace=True)
)
DetBottleneck(
  (bottleneck): Sequential(
    (0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False)
    (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (7): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (relu): ReLU(inplace=True)
)
torch.Size([1, 256, 14, 14])
torch.Size([1, 256, 14, 14])
torch.Size([1, 256, 14, 14])
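
As a possible next step, the three hand-built blocks can be wrapped into a reusable stage and chained into Stage 5 and Stage 6. The sketch below is one way to do this, not code from the original post: the helper `make_det_stage` is hypothetical, the B-A-A layout for Stage 6 is assumed to mirror Stage 5, and the block is repeated in compact form only so that the snippet runs on its own.

```python
import torch
from torch import nn


class DetBottleneck(nn.Module):
    """Compact copy of the block above so this snippet is self-contained."""

    def __init__(self, inplanes, planes, extra=False):
        super().__init__()
        self.bottleneck = nn.Sequential(
            nn.Conv2d(inplanes, planes, 1, bias=False),
            nn.BatchNorm2d(planes),
            nn.ReLU(inplace=True),
            nn.Conv2d(planes, planes, 3, padding=2, dilation=2, bias=False),
            nn.BatchNorm2d(planes),
            nn.ReLU(inplace=True),
            nn.Conv2d(planes, planes, 1, bias=False),
            nn.BatchNorm2d(planes),
        )
        # Bottleneck B projects the shortcut with a 1x1 conv; A passes it through.
        if extra:
            self.extra_conv = nn.Sequential(
                nn.Conv2d(inplanes, planes, 1, bias=False),
                nn.BatchNorm2d(planes),
            )
        else:
            self.extra_conv = nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bottleneck(x) + self.extra_conv(x))


def make_det_stage(inplanes, planes=256, num_a_blocks=2):
    """Hypothetical helper: one Bottleneck B, then `num_a_blocks` Bottleneck A."""
    blocks = [DetBottleneck(inplanes, planes, extra=True)]
    blocks += [DetBottleneck(planes, planes) for _ in range(num_a_blocks)]
    return nn.Sequential(*blocks)


stage5 = make_det_stage(1024)  # input: 1024-channel Stage 4 output
stage6 = make_det_stage(256)   # input: 256-channel Stage 5 output (assumed layout)

x = torch.randn(1, 1024, 14, 14)
c5 = stage5(x)
c6 = stage6(c5)
print(c5.shape, c6.shape)
```

Both stages keep the 14 × 14 resolution and 256 channels, so their outputs can be added in a feature pyramid without any resizing.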