[Competition Summary] Winning Solution for Hyperspectral Object Detection

Deepsdu

Last modified 2023-02-08 13:15:24

First published 2022-03-19 13:39:58

Article link: https://blog.csdn.net/weixin_42200352/article/details/123590301

Winning Solution for Hyperspectral Object Detection

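During the winter break, a senior labmate and I entered a competition: the Semi-Supervised Hyperspectral Object Detection Challenge (SSHODC). After more than a month of work, we took first place on the 15th. Two other members of our group entered two other tracks and also finished first in each, so the group now holds first place in three tracks. The competition link is below.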

https://pbvs-workshop.github.io/challenge.html

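We came across the competition around February 15; my senior labmate brought me in and we started right away. The organizers provided a baseline built on MMDetection, so it was quick to get going: we soon had the environment configured and submitted a few rounds of results.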

1. Data analysis:

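The data are downsampled multispectral images with 51 channels, which makes visualization difficult: visualization methods usually expect either one channel or three, and this was the first time I had worked with this many. The training set has 989 images, of which 102 are labeled and the rest are unlabeled; the validation set has 605 images and the test set has 1296. The competition runs in two phases, validation and testing. Our plan was to finish training on the labeled data before March and then move on to semi-supervised training with pseudo-labels.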
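One common workaround for viewing a 51-channel image, which is not necessarily what we did during the competition, is to pick three bands and stretch each to 8 bits as a false-color preview. The sketch below assumes the image is already loaded as an (H, W, C) NumPy array; the function name `false_color` and the band indices are illustrative choices only.

```python
import numpy as np


def false_color(cube, bands=(40, 25, 10), low=2.0, high=98.0):
    """Map three bands of an (H, W, C) cube to an 8-bit RGB preview.

    `bands` is an arbitrary illustrative choice, not taken from the dataset.
    """
    rgb = cube[..., list(bands)].astype(np.float64)
    out = np.empty(rgb.shape, dtype=np.uint8)
    for i in range(3):
        channel = rgb[..., i]
        # Percentile stretch so a few outliers do not wash out the preview.
        lo, hi = np.percentile(channel, [low, high])
        scale = max(hi - lo, 1e-12)  # avoid division by zero on flat bands
        stretched = np.clip((channel - lo) / scale, 0.0, 1.0)
        out[..., i] = np.round(stretched * 255).astype(np.uint8)
    return out


# Synthetic stand-in for one 188 x 1600 x 51 training image.
rng = np.random.default_rng(0)
cube = rng.random((188, 1600, 51)).astype(np.float32)
preview = false_color(cube)
```

The resulting `preview` array can be passed to any ordinary image viewer; swapping in other band indices gives a different view of the same scene.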
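We first inspected the dataset and tallied the image sizes (short side : long side). Training images are 188/189 : 1600, validation images are 182/183 : 1600, and test images have short sides of 155/156 and 171/172. The long side is essentially fixed, while the short side varies considerably. The labeled classes are also highly imbalanced, with 411 for class 1, 20 for class 2, and 14 for class 3: a long-tailed distribution.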
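A tally like this takes only a few lines. The sketch below assumes COCO-style records (`height`, `width`, `category_id`), which is an assumption about the annotation format rather than something confirmed here; `summarize` is a hypothetical helper and the toy records simply reuse the class totals quoted above.

```python
from collections import Counter


def summarize(images, annotations):
    """Count image sizes and per-class annotations in COCO-style records."""
    sizes = Counter((img["height"], img["width"]) for img in images)
    classes = Counter(ann["category_id"] for ann in annotations)
    total = sum(classes.values())
    shares = {c: n / total for c, n in classes.items()}
    return sizes, classes, shares


# Toy records: the class counts reuse the totals reported in the text.
images = [{"height": 188, "width": 1600}, {"height": 189, "width": 1600}]
annotations = (
    [{"category_id": 1}] * 411
    + [{"category_id": 2}] * 20
    + [{"category_id": 3}] * 14
)
sizes, classes, shares = summarize(images, annotations)
```

With these totals, class 1 accounts for more than 90% of the annotations, which is why the two tail classes need special care during training.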
Note that the competition also limits the computational cost and parameter count of the model's backbone.
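To get a feel for how 51 input channels affect such a budget, here is a rough cost estimate for the first convolution alone. It assumes the 51-channel image is fed directly into a standard MobileNetV2 stem (3x3 kernel, stride 2, 32 output channels); how the stem was actually adapted and the exact limits set by the organizers are not covered here, and `conv2d_cost` is a hypothetical helper.

```python
def conv2d_cost(in_ch, out_ch, kernel, in_h, in_w, stride=1, padding=0):
    """Return (params, MACs) of a bias-free 2D convolution."""
    out_h = (in_h + 2 * padding - kernel) // stride + 1
    out_w = (in_w + 2 * padding - kernel) // stride + 1
    params = kernel * kernel * in_ch * out_ch
    macs = params * out_h * out_w  # one multiply-accumulate per weight per output pixel
    return params, macs


# Standard MobileNetV2 stem on a 188 x 1600 input: RGB versus 51 channels.
rgb_params, rgb_macs = conv2d_cost(3, 32, 3, 188, 1600, stride=2, padding=1)
hsi_params, hsi_macs = conv2d_cost(51, 32, 3, 188, 1600, stride=2, padding=1)
```

Under these assumptions the stem grows by a factor of 17 in both parameters and multiply-accumulates, but it remains a small part of the whole backbone.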

2. Solution

Cascade-RCNN, RPN, MobileNetV2, Soft-NMS
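For readers unfamiliar with Soft-NMS, the idea is to decay the scores of overlapping boxes instead of discarding them outright. The following is a minimal pure-Python sketch of the Gaussian variant; it is not the MMDetection implementation we ran, and the `sigma` and `min_score` values are illustrative defaults, not our competition settings.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, min_score=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of dropping boxes.

    Returns a list of (box_index, rescored_score) in selection order.
    """
    remaining = [(s, i) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        best = max(remaining)  # highest current score
        remaining.remove(best)
        best_score, best_idx = best
        kept.append((best_idx, best_score))
        decayed = []
        for score, idx in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)  # Gaussian penalty
            if score >= min_score:  # drop boxes whose score has decayed away
                decayed.append((score, idx))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this toy case the second box overlaps the first heavily, so its score is reduced and it falls behind the distant third box, but it is still kept rather than suppressed.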
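Data augmentation: because there is so little data, augmentation is critical. It improved our score by at least 15 points. However, the distribution shifts considerably between the validation and test sets: our validation score was twice our test score.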

albu_train_transforms = [
    # dict(
    #     type='ShiftScaleRotate',
    #     p=0.5),
    dict(
        type='ShiftScaleRotate',
        shift_limit=0.0625,
        scale_limit=0.0,
        rotate_limit=0,
        interpolation=1,
        p=0.5),
    dict(
        type='RandomBrightnessContrast',
        brightness_limit=[0.1, 0.3],
        contrast_limit=[0.1, 0.3],
        p=0.5),
    dict(
        type='RandomResizedCrop',
        height=188,
        width=1600,
        p=0.2),
    dict(
        type='Sharpen',
        p=0.5),
    dict(
        type='Perspective',
        p=0.2),
    dict(
        type='SafeRotate',
        p=0.2),
    dict(type='RandomRotate90', p=0.2),
    dict(type='CenterCrop', height=50, width=50, p=0.2),
    dict(
        type='OneOf',
        transforms=[
            dict(type='Blur', blur_limit=3, p=1),
            dict(type='MedianBlur', blur_limit=3, p=1),
            dict(type='GaussianBlur', blur_limit=3, p=1),
        ]),  # the original listing is cut off here; OneOf's own p is not shown
]

train_pipeline = [
    dict(type='LoadMaskedHSIImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    # dict(type='Resize', img_scale=[(1600, 250), (1600, 350), (1600, 150)], keep_ratio=False, multiscale_mode='value'),  # for multi-scale training, e.g. img_scale=[(1333, 640), (1333, 800), (600, 1080), (1200, 1000), (416, 700)]
    dict(type='Resize', img_scale=[(1600, 188), (1600, 189)], keep_ratio=True),  # for multi-scale training, e.g. img_scale=[(1333, 640), (1333, 800), (600, 1080), (1200, 1000), (416, 700)]
    dict(type='RandomFlip', flip_ratio=0.5),
    # 1 dict(type='RandomCrop', crop_size=(180, 180)),
    # dict(type='Shear', level=1),
    # 3 dict(type='CutOut', n_holes=1, cutout_shape=[(50, 50), (100, 100)], cutout_ratio=(0.5)),
    # dict(type='Rotate', level=2),
    # dict(type='Translate', level=2, direction='horizontal'),
    # dict(type='Translate', level=2, direction='vertical'),
    dict(type='BrightnessTransform', level=3),

    dict(
        type='Albu',
        transforms=albu_train_transforms,
        bbox_params=dict(
            type='BboxParams',
            format='pascal_voc',
            label_fields=['gt_labels'],
            min_visibility=0.0,
            filter_lost_elements=True),
        keymap={
            'img': 'image',
            'gt_masks': 'masks',
            'gt_bboxes': 'bboxes'
        },
        update_pad_shape=False,
        skip_img_without_anno=True),

    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

test_pipeline = [
    dict(type='LoadMaskedHSIImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=[(1600, 188), (1600, 189)],
        flip=False,
        transforms=[
            # dict(type='Translate', level=2, direction='horizontal'),
            # dict(type='Translate', level=2, direction='vertical'),
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
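Both pipelines end with `Pad` using `size_divisor=32`, which rounds each image side up to the next multiple of 32. The small sketch below shows what that means for the short sides reported in the data analysis, assuming the images reach `Pad` at their raw sizes; `padded_shape` is a hypothetical helper, not an MMDetection function.

```python
def padded_shape(height, width, divisor=32):
    """Round each side up to the next multiple of `divisor`."""
    pad_h = -(-height // divisor) * divisor  # ceiling division
    pad_w = -(-width // divisor) * divisor
    return pad_h, pad_w


# Short sides seen in the test, validation, and training sets.
shapes = {h: padded_shape(h, 1600) for h in (155, 171, 182, 188)}
```

Under this assumption the 1600-pixel long side is already a multiple of 32, so only the short side gains padding rows.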