An Analysis of Data Augmentation in the SSD Detection Algorithm

deepindeed

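Categories: [Computer Vision] [Paper Notes]. Tags: deep learning, computer vision

This is an original article by the blogger and may not be reproduced without the blogger's permission.

Article link: https://blog.csdn.net/cwlseu/article/details/103654529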

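This article takes a close look at the data augmentation used in SSD (Single Shot MultiBox Detector). Using results on the VOC2007 dataset, it shows how data augmentation lifts the model to 77.2% mAP. It walks through the key components of the model, including the non-maximum suppression step, covers the Expand and Distort operations applied to images during training, and explains the data augmentation parameter settings and the sampling process.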

SSD(Single Shot MultiBox Detector)

SSD was published by Wei Liu et al. in 2016. On the VOC2007 dataset, running on an Nvidia Titan X, it reports:

mAP: 74.3%
59 FPS
With data augmentation, mAP reaches 77.2%

Model keywords

SSD uses a feed-forward CNN to produce a fixed number of bounding boxes and then scores each of these bounding boxes.

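non-maximum suppression step
Non-maximum suppression (NMS) uses the scores and the box coordinates to pick out the bounding boxes with the highest confidence.

First, sort the boxes by score and take out the bounding box with the highest score.
Compute the IoU between this box and each remaining bounding box, and discard every box whose IoU exceeds the chosen threshold.
Repeat the process on the remaining boxes until no candidate bounding boxes are left.
Put simply, NMS looks for local maxima among a pile of rectangles: any rectangle that has a large IoU with one of these local maxima is removed, which leaves a set of bounding boxes with high scores and small IoU with one another.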
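The three steps above can be sketched in plain Python. This is only a minimal illustration, not the SSD implementation: the `(xmin, ymin, xmax, ymax)` box layout, the function names `iou` and `nms`, and the default threshold of 0.45 are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, iou_threshold=0.45):
    """Return the indices of the boxes kept by greedy non-maximum suppression."""
    # Candidate indices, highest score first.
    candidates = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while candidates:
        best = candidates.pop(0)  # take the highest-scoring remaining box
        keep.append(best)
        # Drop every remaining box that overlaps the chosen box too much.
        candidates = [i for i in candidates
                      if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

With two heavily overlapping boxes and one distant box, only the higher-scoring box of the overlapping pair and the distant box survive.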

Source code analysis

anno_type_

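What does has_anno_type_ = anno_datum.has_type() || anno_data_param.has_anno_type(); finally evaluate to? Here anno_data_param.has_anno_type() is false, so the result depends on whether anno_datum carries a type. Whether it does depends on what was done to the data when you ran create_data.sh. In this case the conversion wrote the type AnnotatedDatum_AnnotationType_BBOX: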

    // ...
    else if (anno_type == "detection") {
      labelname = root_folder + boost::get<std::string>(lines[line_id].second);
      status = ReadRichImageToAnnotatedDatum(filename, labelname, resize_height,
          resize_width, min_dim, max_dim, is_color, enc, type, label_type,
          name_to_label, &anno_datum);
      // The annotation type written during data conversion
      anno_datum.set_type(AnnotatedDatum_AnnotationType_BBOX);
    }
    // ...

Therefore has_anno_type_ is true here.

Process overview

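for i in batch_size:

First apply the Expand and/or Distort operation to the image

- First fetch img from the data queue without deleting it, then apply the expand/distort operation to this image
- Expand begins by randomly choosing the size of the enlarged image, expand_img
- expand_img is filled with the mean value
- The original image is copied into the enlarged image
- When this step finishes, expand_img has been generated, and all later steps operate on expand_img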
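The Expand steps listed above can be sketched as follows. This is a simplified single-channel illustration under stated assumptions, not the Caffe code: the image is a list of rows, and the names `expand_image`, `max_expand_ratio`, and `rng` are hypothetical. The uniform choice of ratio and offset is also an assumption of the sketch.

```python
import random


def expand_image(img, mean_value, max_expand_ratio=4.0, rng=random):
    """Place img at a random position inside a larger, mean-filled canvas.

    img is a list of rows, each row a list of pixel values.
    Returns (expand_img, (h_off, w_off)), where the offsets locate the
    original image inside the enlarged one.
    """
    height, width = len(img), len(img[0])
    # Randomly choose the size of the enlarged image.
    ratio = rng.uniform(1.0, max_expand_ratio)
    new_h, new_w = int(height * ratio), int(width * ratio)
    # Randomly choose where the original image lands.
    h_off = int(rng.uniform(0, new_h - height))
    w_off = int(rng.uniform(0, new_w - width))
    # Fill the enlarged image with the mean value.
    expand_img = [[mean_value] * new_w for _ in range(new_h)]
    # Copy the original image into the enlarged image.
    for y in range(height):
        expand_img[h_off + y][w_off:w_off + width] = img[y]
    return expand_img, (h_off, w_off)
```

Because the objects occupy a smaller share of the enlarged image, the bounding boxes have to be shifted by the same offsets and rescaled to the new size before any later step uses them.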

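Generate samples

- Entry point: `GenerateBatchSamples(*expand_datum, batch_samplers_, &sampled_bboxes);`
- Note that several sampled_bboxes are generated, but only one of them is picked at random for cropping; if none is generated, the expanded data is used as-is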

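Resize sampled_datum

- Call Data_transformer to do the transformation
- This includes transforming the Datum inside the AnnotatedDatum
- Transforming the annotations
  + This includes resizing and remapping
  + The bounding boxes in the annotations have to be remapped to the new size
  + After the annotations of expand_img are transformed, the returned data type is: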

vector<AnnotationGroup>
    |-- AnnotationGroup
           |-- group_label
           |-- Annotation (multiple)
                  |-- bbox
                  |-- instance_id
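To make the remapping concrete, here is a small sketch that mirrors the structure above with Python dataclasses and re-expresses each bbox in the coordinate frame of a crop window. The class and function names (`Annotation`, `AnnotationGroup`, `project_bbox`, `remap_groups`) are chosen for the example, coordinates are assumed to be normalized to [0, 1], and boxes are simply clipped to the crop. The real transformer may apply further rules for deciding which annotations to emit.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in [0, 1]


@dataclass
class Annotation:
    instance_id: int
    bbox: BBox


@dataclass
class AnnotationGroup:
    group_label: int
    annotations: List[Annotation] = field(default_factory=list)


def _clip(v: float) -> float:
    return min(max(v, 0.0), 1.0)


def project_bbox(crop: BBox, bbox: BBox) -> Optional[BBox]:
    """Express bbox relative to crop; return None if nothing is left."""
    crop_w, crop_h = crop[2] - crop[0], crop[3] - crop[1]
    xmin = _clip((bbox[0] - crop[0]) / crop_w)
    ymin = _clip((bbox[1] - crop[1]) / crop_h)
    xmax = _clip((bbox[2] - crop[0]) / crop_w)
    ymax = _clip((bbox[3] - crop[1]) / crop_h)
    if xmax <= xmin or ymax <= ymin:
        return None  # the box lies entirely outside the crop
    return (xmin, ymin, xmax, ymax)


def remap_groups(groups: List[AnnotationGroup], crop: BBox) -> List[AnnotationGroup]:
    """Remap every annotation into the crop, dropping boxes that fall outside."""
    result = []
    for group in groups:
        kept = []
        for anno in group.annotations:
            projected = project_bbox(crop, anno.bbox)
            if projected is not None:
                kept.append(Annotation(anno.instance_id, projected))
        if kept:
            result.append(AnnotationGroup(group.group_label, kept))
    return result
```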

Re-encode the sampled data into the blob
Push the processed data into the batch data stream
reader_.free().push(const_cast<AnnotatedDatum*>(&anno_datum));
endfor
Reprocess the annotation data
The final top_label has shape 1 x 1 x num_boxes x 8
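The sketch below shows how a batch of annotations could be flattened into rows of 8 values, giving a label of shape 1 x 1 x num_boxes x 8. The column order used here (item id, group label, instance id, xmin, ymin, xmax, ymax, difficult flag) is my reading of the layer and should be treated as an assumption. The function name `encode_top_label` and the input layout are made up for the example.

```python
def encode_top_label(batch):
    """Flatten per-image annotations into rows of 8 numbers.

    batch is a list with one entry per image; each entry is a list of
    (group_label, instance_id, (xmin, ymin, xmax, ymax), difficult) tuples.
    """
    rows = []
    for item_id, annotations in enumerate(batch):
        for group_label, instance_id, bbox, difficult in annotations:
            xmin, ymin, xmax, ymax = bbox
            # item_id records which image in the batch the box belongs to.
            rows.append([item_id, group_label, instance_id,
                         xmin, ymin, xmax, ymax, float(difficult)])
    return rows
```

Storing the image index in every row is what allows a single flat list of boxes to serve a whole batch, even though each image contains a different number of objects.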

Entry point for data augmentation

import caffe
from caffe import layers as L
from caffe import params as P
from caffe.proto import caffe_pb2

# Create train net.
# NOTE: this is where the training data comes from.
net = caffe.NetSpec()
net.data, net.label = CreateAnnotatedDataLayer(train_data,
        batch_size=batch_size_per_device, train=True, output_label=True,
        label_map_file=label_map_file, transform_param=train_transform_param,
        batch_sampler=batch_sampler)

# Definition of the helper called above.
def CreateAnnotatedDataLayer(source, batch_size=32, backend=P.Data.LMDB,
        output_label=True, train=True, label_map_file='', anno_type=None,
        transform_param={}, batch_sampler=[{}]):
    # The phase decides whether the layer belongs to the TRAIN or TEST net.
    if train:
        kwargs = {
                'include': dict(phase=caffe_pb2.Phase.Value('TRAIN')),
                'transform_param': transform_param,
                }
    else:
        kwargs = {
                'include': dict(phase=caffe_pb2.Phase.Value('TEST')),
                'transform_param': transform_param,
                }
    # One top for the data, plus one for the label if requested.
    ntop = 1
    if output_label:
        ntop = 2
    # batch_sampler carries the sampling settings used for augmentation.
    annotated_data_param = {
        'label_map_file': label_map_file,
        'batch_sampler': batch_sampler,
        }
    if anno_type is not None:
        annotated_data_param.update({'anno_type': anno_type})
    return L.AnnotatedData(name="data", annotated_data_param=annotated_data_param,
        data_param=dict(batch_size=batch_size, backend=backend, source=source),
        ntop=ntop, **kwargs)

Parameter description

Parameters of a single sampler

// Sample a bbox in the normalized space [0, 1] with provided constraints.
message Sampler {
  // Minimum and maximum scale of the sampled bbox
  optional float min_scale = 1 [default = 1.];
  optional float max_scale = 2 [default = 1.];
  // Minimum and maximum aspect ratio of the sampled bbox; the actual aspect
  // ratio is drawn between these two values
  optional float min_aspect_ratio = 3 [default = 1.];
  optional float max_aspect_ratio = 4 [default = 1.];
}
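As an illustration of how these four fields can define a box in the normalized space [0, 1], here is a minimal sketch. It assumes the scale is the geometric mean of width and height, that the aspect ratio is width divided by height, and that `min_scale` is greater than 0. The function name `sample_bbox` and the `rng` parameter are hypothetical.

```python
import math
import random


def sample_bbox(min_scale=1.0, max_scale=1.0,
                min_aspect_ratio=1.0, max_aspect_ratio=1.0, rng=random):
    """Draw one (xmin, ymin, xmax, ymax) box inside the unit square."""
    scale = rng.uniform(min_scale, max_scale)
    aspect_ratio = rng.uniform(min_aspect_ratio, max_aspect_ratio)
    # Clamp the aspect ratio so that neither side can exceed 1.
    aspect_ratio = max(aspect_ratio, scale ** 2)
    aspect_ratio = min(aspect_ratio, 1.0 / scale ** 2)
    width = min(scale * math.sqrt(aspect_ratio), 1.0)
    height = min(scale / math.sqrt(aspect_ratio), 1.0)
    # Pick the top-left corner so that the box stays inside the image.
    xmin = rng.uniform(0.0, 1.0 - width)
    ymin = rng.uniform(0.0, 1.0 - height)
    return (xmin, ymin, xmin + width, ymin + height)
```

With the default values (all equal to 1) the only possible box is the whole image, which is why a sampler with an empty parameter set leaves the image uncropped.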

Constraints on the selected sample_box

// Constraints for selecting sampled bbox.
message SampleConstraint {
  // Minimum Jaccard overlap between sampled bbox and all bboxes in
  // AnnotationGroup.
  optional float min_jaccard_overlap = 1;
  // Maximum Jaccard overlap between sampled bbox and all bboxes in
  // AnnotationGroup.
  optional float max_jaccard_overlap = 2;
  // Minimum coverage of sampled bbox by all bboxes in AnnotationGroup.
  optional float min_sample_coverage = 3;
  // Maximum coverage of sampled bbox by all bboxes in AnnotationGroup.
  optional float max_sample_coverage = 4;
  // Minimum coverage of all bboxes in AnnotationGroup by sampled bbox.
  optional float min_object_coverage = 5;
  // Maximum coverage of all bboxes in AnnotationGroup by sampled bbox.
  optional float max_object_coverage = 6;
}

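In practice, usually only the Jaccard overlap constraints (min_jaccard_overlap / max_jaccard_overlap) are set; the SSD configuration shown later in this article sets min_jaccard_overlap.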
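A check on the Jaccard overlap constraints might look like the sketch below. It assumes that a sampled box is accepted as soon as one object box satisfies the configured bounds, and that a sampler with no bounds accepts everything. The function names and the `(xmin, ymin, xmax, ymax)` layout are assumptions for the example, and the coverage constraints are left out.

```python
def jaccard_overlap(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def satisfies_constraint(sampled_box, object_boxes,
                         min_jaccard_overlap=None, max_jaccard_overlap=None):
    """Return True if sampled_box meets the overlap bounds for some object."""
    if min_jaccard_overlap is None and max_jaccard_overlap is None:
        return True  # no constraint configured
    for obj in object_boxes:
        overlap = jaccard_overlap(sampled_box, obj)
        if min_jaccard_overlap is not None and overlap < min_jaccard_overlap:
            continue
        if max_jaccard_overlap is not None and overlap > max_jaccard_overlap:
            continue
        return True  # this object satisfies every configured bound
    return False
```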

Parameter settings for sampling a batch

// Sample a batch of bboxes with provided constraints.
message BatchSampler {
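  // Whether to use the original image
  optional bool use_original_image = 1 [default = true];
  // Parameters of the sampler
  optional Sampler sampler = 2;
  // Constraints on the sampled box; they decide whether a sample counts as
  // positive or negative
  optional SampleConstraint sample_constraint = 3;
  // Stop as soon as the total number of samples reaches this value
  optional uint32 max_sample = 4;
  // Maximum number of sampling trials, to avoid an infinite loop
  optional uint32 max_trials = 5 [default = 100];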
}
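The interplay of `max_sample` and `max_trials` can be sketched as a bounded rejection-sampling loop. The function `generate_samples` and its `propose` / `accept` callables are hypothetical stand-ins for "draw a box with the sampler" and "check the sample constraint"; this is an illustration of the idea, not the library code.

```python
def generate_samples(propose, accept, max_sample=None, max_trials=100):
    """Collect accepted proposals, giving up after max_trials attempts.

    propose() returns a candidate; accept(candidate) returns True or False.
    """
    sampled = []
    for _ in range(max_trials):
        # Stop early once enough samples have been collected.
        if max_sample is not None and len(sampled) >= max_sample:
            break
        candidate = propose()
        if accept(candidate):
            sampled.append(candidate)
    return sampled
```

The trial limit guarantees termination even when the constraint can never be met for a given image, in which case the result is simply an empty list.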

Parameters for transforming the data layer's data

message TransformationParameter {
  // For data preprocessing we can do simple scaling and subtract a
  // pre-supplied mean value. Note that the mean is subtracted before scaling.
  optional float scale = 1 [default = 1];
  // Whether to mirror the image randomly
  optional bool mirror = 2 [default = false];
  // Whether to crop the image randomly (0 disables cropping)
  optional uint32 crop_size = 3 [default = 0];
  optional uint32 crop_h = 11 [default = 0];
  optional uint32 crop_w = 12 [default = 0];
  // Path to the mean file; mean_file and mean_value cannot both be specified
  optional string mean_file = 4;
  // if specified can be repeated once (would subtract it from all the
  // channels) or can be repeated the same number of times as channels
  // (would subtract them from the corresponding channel)
  repeated float mean_value = 5;
  // Force the decoded image to have 3 color channels.
  optional bool force_color = 6 [default = false];
  // Force the decoded image to have 1 color channels.
  optional bool force_gray = 7 [default = false];
  // Resize policy
  optional ResizeParameter resize_param = 8;
  // Noise policy
  optional NoiseParameter noise_param = 9;
  // Distortion policy
  optional DistortionParameter distort_param = 13;
  // Expand policy
  optional ExpansionParameter expand_param = 14;
  // Constraint for emitting the annotation after transformation.
  optional EmitConstraint emit_constraint = 10;
}
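Two of these fields are easy to get wrong, so here is a small sketch of what they imply: the mean is subtracted before scaling, and mirroring an image also flips the x coordinates of its boxes. The function names `preprocess_pixel` and `mirror_bbox` are hypothetical, and boxes are assumed to be normalized `(xmin, ymin, xmax, ymax)` tuples.

```python
def preprocess_pixel(pixel, mean_value, scale=1.0):
    """Apply (value - mean) * scale to each channel of one pixel.

    mean_value holds either one value shared by all channels or one value
    per channel, matching the two ways mean_value can be repeated.
    """
    if len(mean_value) == 1:
        mean_value = list(mean_value) * len(pixel)
    # Subtract the mean first, then scale.
    return [(p - m) * scale for p, m in zip(pixel, mean_value)]


def mirror_bbox(bbox):
    """Flip a normalized (xmin, ymin, xmax, ymax) box horizontally."""
    xmin, ymin, xmax, ymax = bbox
    # The left and right edges swap roles after mirroring.
    return (1.0 - xmax, ymin, 1.0 - xmin, ymax)
```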

Data transformation and sampling parameter settings in SSD

# sample data parameter
batch_sampler = [
    {
        # use_original_image : true,
        'sampler': {},
        'max_trials': 1,
        'max_sample': 1,
    },
    {
        'sampler': {
            'min_scale': 0.3,
            'max_scale': 1.0,
            'min_aspect_ratio': 0.5,
            'max_aspect_ratio': 2.0,
        },
        'sample_constraint': {
            'min_jaccard_overlap': 0.1,
        },
        'max_trials': 50,
        'max_sample': 1,
    },
    ...
