object detection training parameters: image_resizer, configuring the input image size

云深安小生

Category: Tensorflow object detection api. Tags: deep learning

Article link: https://blog.csdn.net/l13022736018/article/details/108619875

Adapting object detection API training parameters to a custom dataset: adjusting the image_resizer{} input image size, and the problems solved along the way

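Background

In a research project I needed to train SSD_MobileNet_v2 on my own dataset for object detection, so that the real-time detection model could later be ported to an Android phone. The first few training runs went poorly, with bad accuracy and bad loss. I suspected the resolution was the cause: the captured images are fairly high resolution, 3680*2760, but the ssd-mobilenet model resizes every input image to 300*300. By default the configuration file sets the image_resizer{} parameter to fixed_shape_resizer, 300*300, as shown below: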

image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}

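So whatever the size of the input image, it is resized to 300*300, and a high-resolution image loses a great deal of information in the process. To get away from this one-size-fits-all input mode, I wanted to change the image_resizer{} parameters. I am writing down how I worked through this here, partly to deepen my understanding of the object detection model parameters and partly as study notes for later reference.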
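To put a number on that information loss, here is a small sketch of the arithmetic. The function name `downscale_summary` is made up for illustration and is not part of the object detection API; the sizes are the ones quoted above.

```python
def downscale_summary(src_w, src_h, dst_w, dst_h):
    """Summarise how much a fixed-shape resize shrinks and distorts an image."""
    scale_x = src_w / dst_w  # horizontal shrink factor
    scale_y = src_h / dst_h  # vertical shrink factor
    return {
        # roughly how many source pixels are merged into one output pixel
        "pixels_per_output_pixel": scale_x * scale_y,
        "aspect_before": src_w / src_h,
        "aspect_after": dst_w / dst_h,
    }


summary = downscale_summary(3680, 2760, 300, 300)
print(summary)
```

For a 3680*2760 image, more than a hundred source pixels collapse into each output pixel, and the aspect ratio changes from about 1.33 to 1, so objects are also squeezed horizontally.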

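Problem

The original training data has a fairly large aspect ratio, 1200*300, so I wanted to change the model input size and train on the original 1200*300 data instead of having it resized to 300*300 after input. I started by reading the image_resizer source code, hoping to find a solution. Here is a summary of what the source shows:
The protos subfolder under the object detection folder collects the files that declare the model's configurable parameters. Look there for image_resizer_pb2.py and image_resizer.proto; image_resizer_pb2.py is generated by compiling the latter. That compilation is done during initial setup and is not covered here; readers can look it up on Baidu or Google. image_resizer.proto declares four options for image_resizer: KeepAspectRatioResizer keep_aspect_ratio_resizer, FixedShapeResizer fixed_shape_resizer, IdentityResizer identity_resizer, and ConditionalShapeResizer conditional_shape_resizer.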

message ImageResizer {
  oneof image_resizer_oneof {
    KeepAspectRatioResizer keep_aspect_ratio_resizer = 1;
    FixedShapeResizer fixed_shape_resizer = 2;
    IdentityResizer identity_resizer = 3;
    ConditionalShapeResizer conditional_shape_resizer = 4;
  }
}

KeepAspectRatioResizer

keep_aspect_ratio_resizer preserves the aspect ratio of the input. It takes min_dimension and max_dimension, which default to 600 and 1024 respectively in the file.

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}

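Given an arbitrary image, it scales the image so that the shorter side becomes 600, unless that would push the longer side past 1024, in which case it scales the longer side to 1024 instead. A 100*100 input becomes 600*600; a 2000*2000 input also becomes 600*600; an 800*1000 input becomes 600*750; a 1200*900 input becomes 800*600. A very elongated input such as 3000*1000 hits the upper limit and becomes 1024*341. In every case the aspect ratio of the input image is preserved.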
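The sizing rule can be sketched in plain Python. This is a minimal sketch, not the library's code: `keep_aspect_ratio_size` is a made-up name, and it assumes the rule is "scale the shorter side to min_dimension, unless the longer side would then exceed max_dimension, in which case scale the longer side to max_dimension". It only computes the output size and does no actual resizing.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Return the (height, width) produced by an aspect-preserving resize."""
    # First attempt: scale so the shorter side equals min_dimension.
    scale = min_dimension / min(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    # If the longer side now overshoots, scale the longer side to max_dimension.
    if max(new_h, new_w) > max_dimension:
        scale = max_dimension / max(height, width)
        new_h, new_w = round(height * scale), round(width * scale)
    return new_h, new_w


print(keep_aspect_ratio_size(100, 100))    # (600, 600)
print(keep_aspect_ratio_size(800, 1000))   # (600, 750)
print(keep_aspect_ratio_size(1000, 3000))  # (341, 1024)
```

Note that the output size depends on each image's own aspect ratio, so two images with different proportions come out with different shapes.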

FixedShapeResizer

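fixed_shape_resizer fixes the image size. Once height and width are set, every input image, whatever its original size, is resized to (height, width, channels) before it is fed to the network for training. You can sometimes tune these values to your own image sizes. My images are 1200*300 (width*height), so I changed height and width directly in the configuration file to 300 and 1200.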

IdentityResizer

image_resizer.proto gives no concrete usage or parameter description for it, so I am leaving it out of consideration for now.

ConditionalShapeResizer

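conditional_shape_resizer takes two parameters, condition and size_threshold. condition can be GREATER or SMALLER. size_threshold defaults to 300 and can be set to suit the sizes in your own dataset. When condition is GREATER, an image whose size is larger than 300 is resized down to 300 with its aspect ratio preserved; likewise, when condition is SMALLER, an image smaller than 300 is resized up to 300, again with its aspect ratio preserved.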
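A rough sketch of that behaviour follows. It is an illustration under stated assumptions rather than the library's implementation: `conditional_resize_size` is a made-up name, and it assumes that "size" means the longer side for GREATER and the shorter side for SMALLER.

```python
def conditional_resize_size(height, width, condition="GREATER", size_threshold=300):
    """Return the (height, width) after a conditional, aspect-preserving resize."""
    if condition == "GREATER":
        # Assumption: shrink only when the longer side exceeds the threshold.
        longest = max(height, width)
        if longest <= size_threshold:
            return height, width
        scale = size_threshold / longest
    elif condition == "SMALLER":
        # Assumption: enlarge only when the shorter side is below the threshold.
        shortest = min(height, width)
        if shortest >= size_threshold:
            return height, width
        scale = size_threshold / shortest
    else:
        raise ValueError(f"unknown condition: {condition}")
    return round(height * scale), round(width * scale)


print(conditional_resize_size(300, 1200, "GREATER"))  # (75, 300)
print(conditional_resize_size(150, 600, "SMALLER"))   # (300, 1200)
print(conditional_resize_size(200, 250, "GREATER"))   # (200, 250)
```

Under these assumptions a 1200*300 image with GREATER and the default threshold shrinks to 300*75, which would discard even more detail than the fixed 300*300 resize.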

Errors

After reading the source, I tried two ways of changing the image_resizer{} configuration:

keep_aspect_ratio_resizer

Set min_dimension and max_dimension directly to the dimensions of my original images:

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 300
    max_dimension: 1200
  }
}

The error was:

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [338,300,3], [batch]: [300,1200,3]
	 [[node IteratorGetNext (defined at /home/lzy/anaconda3/envs/ssdMobileNet/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
	 [[IteratorGetNext/_5635]]
  (1) Invalid argument: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [338,300,3], [batch]: [300,1200,3]
	 [[node IteratorGetNext (defined at /home/lzy/anaconda3/envs/ssdMobileNet/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]

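At first I could not work out what was going on. Searching Baidu and Google turned up no good solution either, so I tried the other approach below to set the image_resizer parameters.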
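One reading of the message is that the two shapes it prints, [338,300,3] and [300,1200,3], come from images with different aspect ratios, and tensors of different shapes cannot be stacked into one batch. The sketch below reproduces those two shapes. The source image sizes (300*1200 and 676*600, as height*width) are hypothetical, chosen only so the output matches the log; `resized_shape` and `can_batch` are made-up helpers; and the `pad_to_max` flag models my understanding of the resizer's pad_to_max_dimension option (zero-padding to a max_dimension square), which is an assumption here.

```python
def resized_shape(height, width, min_dimension, max_dimension, pad_to_max=False):
    """Shape (H, W, C) of an RGB image after an aspect-preserving resize."""
    scale = min_dimension / min(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    if max(new_h, new_w) > max_dimension:
        scale = max_dimension / max(height, width)
        new_h, new_w = round(height * scale), round(width * scale)
    if pad_to_max:
        # Assumed behaviour: pad with zeros to a max_dimension square.
        return (max_dimension, max_dimension, 3)
    return (new_h, new_w, 3)


def can_batch(shapes):
    """A batch can only be formed when every element has the same shape."""
    return len(set(shapes)) == 1


images = [(300, 1200), (676, 600)]  # hypothetical source sizes, height * width
shapes = [resized_shape(h, w, 300, 1200) for h, w in images]
print(shapes, can_batch(shapes))  # [(300, 1200, 3), (338, 300, 3)] False

padded = [resized_shape(h, w, 300, 1200, pad_to_max=True) for h, w in images]
print(padded, can_batch(padded))  # [(1200, 1200, 3), (1200, 1200, 3)] True
```

With a batch size of 1 the mismatch cannot occur, since each batch then holds a single image.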

fixed_shape_resizer

image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 1200
  }
}

The error was:

tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[64,96,300,450] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node FeatureExtractor/MobilenetV2/expanded_conv_1/expand/BatchNorm/FusedBatchNormV3}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[control_dependency/_9527]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  (1) Resource exhausted: OOM when allocating tensor with shape[64,96,300,450] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node FeatureExtractor/MobilenetV2/expanded_conv_1/expand/BatchNorm/FusedBatchNormV3}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

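Still an error. After thinking it over I still did not know what to do. A Baidu search suggested that batch_size might be set too large for the GPU to cope with (the log reports an out-of-memory condition). The ssd model's default input is 300*300, 90000 pixels in total, with batch_size set to 24. Once the input size is changed to 1200*300, the pixel count of a single image grows fourfold and the GPU workload grows by more than four times; when the GPU can no longer handle it, this error appears. Reducing batch_size resolves the problem. Going back, I also tried lowering batch_size for the error raised with keep_aspect_ratio_resizer, and that resolved it as well.
While working through this, I found that keep_aspect_ratio_resizer is more computationally expensive than fixed_shape_resizer.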
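To see why the batch size matters, the size of the single tensor named in the OOM message can be worked out by hand. This is a back-of-the-envelope sketch: `tensor_bytes` is a made-up helper, the shape is copied from the log, and 4 bytes per element is assumed because the log says the type is float.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for one dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


oom_shape = (64, 96, 300, 450)  # shape reported in the OOM message
gib = tensor_bytes(oom_shape) / 2**30
print(f"{gib:.2f} GiB")  # 3.09 GiB

# Halving the leading (batch) dimension halves the allocation.
half = tensor_bytes((32,) + oom_shape[1:]) / 2**30
print(half == gib / 2)  # True

# Pixel count of a 1200*300 input relative to the default 300*300 input.
print((1200 * 300) / (300 * 300))  # 4.0
```

That is about 3 GiB for one intermediate activation, and training holds many such tensors at once, so memory use scales directly with both batch_size and input resolution.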