Image Preprocessing in SSD

Daniel2333

Categories: SSD, Detection

Article link: https://blog.csdn.net/weixin_35653315/article/details/72353346


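To augment its training data, SSD applies several image preprocessing techniques. This post walks through ssd_vgg_preprocessing.py to see which ones it uses.

Preprocessing during training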

Random Crop

dst_image, labels, bboxes, distort_bbox = \
            distorted_bounding_box_crop(image, labels, bboxes,
                                        min_object_covered=MIN_OBJECT_COVERED,
                                        aspect_ratio_range=CROP_RATIO_RANGE)

The distorted_bounding_box_crop method does the following:

random crop

bbox_begin, bbox_size, distort_bbox = tf.image.sample_distorted_bounding_box(
    tf.shape(image),
    bounding_boxes=tf.expand_dims(bboxes, 0),
    min_object_covered=min_object_covered,
    aspect_ratio_range=aspect_ratio_range,
    area_range=area_range,
    max_attempts=max_attempts,
    use_image_if_no_bounding_boxes=True)

# Crop the image to the specified bounding box.
cropped_image = tf.slice(image, bbox_begin, bbox_size)
# Restore the shape since the dynamic slice loses 3rd dimension.
cropped_image.set_shape([None, None, 3])

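tf.image.sample_distorted_bounding_box returns three tensors that all describe the same bbox. The first two hold the offset of its top-left corner, [offset_height, offset_width, 0], and its extent, [target_height, target_width, -1], so they can be passed directly to tf.slice to crop the original image. The last one expresses the bbox as normalized coordinates [y_min, x_min, y_max, x_max] with shape=[1, 1, 4]; here it serves as the reference for adjusting the bboxes of the original image.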
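To make the relationship between the three return values concrete, here is a minimal pure-Python sketch. The helper name crop_window_to_normalized_bbox and the example numbers are hypothetical and not part of TensorFlow or the SSD code; the sketch only assumes the begin/size layout described above.

```python
def crop_window_to_normalized_bbox(begin, size, image_shape):
    """Convert a crop window given as (begin, size) into a normalized
    [y_min, x_min, y_max, x_max] box relative to the full image."""
    height, width = image_shape[0], image_shape[1]
    y0, x0 = begin[0], begin[1]          # top-left corner in pixels
    crop_h, crop_w = size[0], size[1]    # crop extent in pixels
    return [y0 / height,
            x0 / width,
            (y0 + crop_h) / height,
            (x0 + crop_w) / width]


# Hypothetical example: a 400x600 RGB image, with a crop window that
# starts at row 100, column 150 and spans 200 rows by 300 columns.
begin = [100, 150, 0]
size = [200, 300, -1]   # -1 means "keep all channels" in tf.slice
bbox = crop_window_to_normalized_bbox(begin, size, (400, 600, 3))
print(bbox)
```

With these numbers the crop window corresponds to the normalized box [0.25, 0.25, 0.75, 0.75], which is the kind of value distort_bbox carries.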

resize bbox

distort_bbox = distort_bbox[0, 0]
bboxes = tfe.bboxes_resize(distort_bbox, bboxes)

tf_extended/bbox.py

 with tf.name_scope(name, 'bboxes_resize'):
     # Translate.
     v = tf.stack([bbox_ref[0], bbox_ref[1], bbox_ref[0], bbox_ref[1]])
     bboxes = bboxes - v
     # Scale.
     s = tf.stack([bbox_ref[2] - bbox_ref[0],
                   bbox_ref[3] - bbox_ref[1],
                   bbox_ref[2] - bbox_ref[0],
                   bbox_ref[3] - bbox_ref[1]])
     bboxes = bboxes / s
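The translate-then-scale step can be checked by hand with a small pure-Python sketch. The function name bboxes_resize_sketch and the sample boxes are hypothetical; the arithmetic mirrors the snippet above, assuming all boxes are normalized [y_min, x_min, y_max, x_max].

```python
def bboxes_resize_sketch(bbox_ref, bboxes):
    """Re-express bboxes in the coordinate frame of bbox_ref.

    All boxes are [y_min, x_min, y_max, x_max], normalized to the
    original image. After the call they are normalized to the crop.
    """
    ref_ymin, ref_xmin, ref_ymax, ref_xmax = bbox_ref
    ref_h = ref_ymax - ref_ymin
    ref_w = ref_xmax - ref_xmin
    resized = []
    for ymin, xmin, ymax, xmax in bboxes:
        resized.append([(ymin - ref_ymin) / ref_h,   # translate, then scale
                        (xmin - ref_xmin) / ref_w,
                        (ymax - ref_ymin) / ref_h,
                        (xmax - ref_xmin) / ref_w])
    return resized


crop = [0.25, 0.25, 0.75, 0.75]          # the central half of the image
boxes = [[0.25, 0.25, 0.5, 0.5],         # fully inside the crop
         [0.5, 0.5, 1.0, 1.0]]           # sticks out past the crop
resized = bboxes_resize_sketch(crop, boxes)
print(resized)
```

The first box becomes [0.0, 0.0, 0.5, 0.5]. The second becomes [0.5, 0.5, 1.5, 1.5]: coordinates above 1 mean that part of the box lies outside the cropped image.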

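filter bboxes
After the random crop, some bboxes are partly or even entirely cut off. A bbox is kept only if the fraction of it that remains inside the cropped image exceeds a threshold. This threshold is itself a hyperparameter.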
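A minimal sketch of this filtering rule in pure Python follows. The function names and the threshold of 0.5 are assumptions for illustration, not taken from the SSD source; boxes are assumed to be already normalized to the crop, so the cropped image is the unit box [0, 0, 1, 1].

```python
def fraction_inside_crop(bbox):
    """Fraction of the bbox area that lies inside the unit box [0, 0, 1, 1]."""
    ymin, xmin, ymax, xmax = bbox
    inter_h = max(min(ymax, 1.0) - max(ymin, 0.0), 0.0)
    inter_w = max(min(xmax, 1.0) - max(xmin, 0.0), 0.0)
    area = (ymax - ymin) * (xmax - xmin)
    return inter_h * inter_w / area if area > 0 else 0.0


def filter_bboxes_sketch(labels, bboxes, threshold=0.5):
    """Keep only the (label, bbox) pairs whose remaining fraction
    exceeds the threshold (0.5 here is an assumed value)."""
    kept = [(label, bbox) for label, bbox in zip(labels, bboxes)
            if fraction_inside_crop(bbox) > threshold]
    return [label for label, _ in kept], [bbox for _, bbox in kept]


labels = [1, 2, 3]
bboxes = [[0.0, 0.0, 0.5, 0.5],    # entirely inside: fraction 1.0
          [0.5, 0.5, 1.5, 1.5],    # a quarter inside: fraction 0.25
          [1.0, 1.0, 1.5, 1.5]]    # entirely outside: fraction 0.0
kept_labels, kept_bboxes = filter_bboxes_sketch(labels, bboxes)
print(kept_labels, kept_bboxes)
```

Only the first box survives here. Labels are filtered together with the boxes so the two lists stay aligned.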

Finally, the cropped image is resized to the target size: bilinear interpolation resizes it to 512×512 (for SSD512).

dst_image = tf_image.resize_image(dst_image, out_shape,
                                  method=tf.image.ResizeMethod.BILINEAR,
                                  align_corners=False)

Horizontal flip

dst_image, bboxes = tf_image.random_flip_left_right(dst_image, bboxes)
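When the image is mirrored, the bboxes have to be mirrored with it. The pure-Python sketch below shows the coordinate update, assuming normalized [y_min, x_min, y_max, x_max] boxes and an image stored as a list of rows. The function names and the flip probability of 0.5 are assumptions, not taken from the SSD source.

```python
import random


def flip_left_right(image, bboxes):
    """Mirror the image horizontally and update the bboxes to match."""
    flipped_image = [row[::-1] for row in image]
    # x coordinates are mirrored around 0.5; the old x_max becomes the
    # new x_min and vice versa. y coordinates are unchanged.
    flipped_bboxes = [[ymin, 1.0 - xmax, ymax, 1.0 - xmin]
                      for ymin, xmin, ymax, xmax in bboxes]
    return flipped_image, flipped_bboxes


def random_flip_left_right_sketch(image, bboxes, rng=random):
    """Flip with probability 0.5 (assumed), otherwise return inputs as is."""
    if rng.random() < 0.5:
        return flip_left_right(image, bboxes)
    return image, bboxes


image = [[1, 2, 3],
         [4, 5, 6]]
bboxes = [[0.0, 0.25, 0.5, 0.5]]
flipped_image, flipped_bboxes = flip_left_right(image, bboxes)
print(flipped_image, flipped_bboxes)
```

A box spanning x from 0.25 to 0.5 ends up spanning 0.5 to 0.75 after the flip.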

Random color distortion

        # Randomly distort the colors. There are 4 ways to do it.
        dst_image = apply_with_random_selector(
                dst_image,
                lambda x, ordering: distort_color(x, ordering, fast_mode),
                num_cases=4)
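The idea behind the selector is to draw one of num_cases integers at random and pass it to the function as the ordering. The sketch below imitates that in plain Python. It is not the TensorFlow implementation, and toy_distort is a hypothetical stand-in with only two toy operations, used to show that applying the same operations in a different order gives a different result.

```python
import random


def apply_with_random_selector_sketch(x, func, num_cases, rng=random):
    """Pick a case index uniformly at random and call func(x, index)."""
    selected = rng.randrange(num_cases)
    return func(x, selected)


def toy_distort(value, ordering):
    """Hypothetical stand-in for a color distortion on a single value."""
    def brighten(v):
        return v + 0.25

    def add_contrast(v):
        return v * 2.0

    # Even orderings brighten first, odd orderings apply contrast first.
    if ordering % 2 == 0:
        ops = [brighten, add_contrast]
    else:
        ops = [add_contrast, brighten]
    for op in ops:
        value = op(value)
    return value


result = apply_with_random_selector_sketch(0.5, toy_distort, num_cases=4)
print(result)
```

Because the operations do not commute, the randomly chosen ordering acts as one more source of variation in the augmented data.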

Whitening

image = tf_image_whitened(image, [_R_MEAN, _G_MEAN, _B_MEAN])

Despite the name, this simply subtracts the mean of each channel.

Preprocessing during evaluation and testing

Only whitening and resizing are applied.
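As a quick illustration, per-channel mean subtraction on a tiny image can be written in pure Python as below. The mean values 123, 117 and 104 are an assumption (commonly used VGG-style RGB means), so check the constants in ssd_vgg_preprocessing.py before relying on them; whiten_sketch is a hypothetical name.

```python
# Assumed per-channel means in R, G, B order; verify against the source file.
ASSUMED_MEANS = (123.0, 117.0, 104.0)


def whiten_sketch(image, means=ASSUMED_MEANS):
    """Subtract the per-channel mean from every pixel.

    image is a list of rows, each row a list of [R, G, B] pixels.
    """
    return [[[channel - mean for channel, mean in zip(pixel, means)]
             for pixel in row]
            for row in image]


image = [[[123.0, 117.0, 104.0], [255.0, 0.0, 104.0]]]   # a 1x2 RGB image
whitened = whiten_sketch(image)
print(whitened)
```

A pixel equal to the mean maps to zero in every channel; no scaling by the standard deviation takes place.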
