Learning TF: DeepLabv3+ Code Reading 5 (model)

Latest recommended article published on 2024-08-15 09:22:10

lscelory

Category: Code Analysis. Tags: TensorFlow

Article link: https://blog.csdn.net/lscelory/article/details/97932699

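This article walks through the TensorFlow implementation of the DeepLabv3+ model, focusing on two key parts, `multi_scale_logits` and `get_extra_layer_scopes`, to understand how the model is built and how multi-scale features are extracted.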

DeepLabv3+ Code Reading: model.py

1. multi_scale_logits

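Takes multi-scale inputs and returns the output logits.
Parameters:
	images: the input images, a tensor of size [batch, height, width, channels].
	model_options: a ModelOptions instance that configures the model.
	image_pyramid: the image pyramid, i.e. the scales at which the input image is fed to the network. If it is not given, it defaults to [1.0], meaning no multi-scale feature extraction is performed.
	weight_decay: the weight decay. Use 0.00004 for MobileNet-V2 and Xception, and 0.0001 for ResNet.
	is_training: whether the model is being trained.
	fine_tune_batch_norm: whether to fine-tune the batch norm parameters.
	nas_training_hyper_parameters: a dictionary storing the hyper-parameters for training NAS models, with the keys:
		- `drop_path_keep_prob`: Probability to keep each path in the cell when training.
		- `total_training_steps`: Total training steps to help drop path probability calculation.
Returns:
	outputs_to_scales_to_logits: the output logits, as a map from each output type to a dictionary of logits. With multi-scale input, the dictionary holds one key per scale, each mapping to the logits for that scale.
	For example, if `scales` = [1.0, 1.5], the keys include 'merged_logits', 'logits_1.00' and 'logits_1.50'.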
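
The key naming in the example above can be sketched in a few lines of plain Python. This is only an illustration of the documented naming scheme for the multi-scale case, not the code from model.py: the helper name `expected_logit_keys` is hypothetical, and the two-decimal format string is an assumption inferred from the 'logits_1.00' and 'logits_1.50' keys quoted in the docstring.

```python
def expected_logit_keys(scales):
    """Hypothetical helper: list the keys described for a multi-scale input.

    `scales` is a list of image scales such as [1.0, 1.5]. The two-decimal
    formatting is an assumption based on the keys quoted in the docstring.
    """
    # One entry per scale, e.g. 1.5 -> 'logits_1.50'.
    keys = ['logits_%.2f' % scale for scale in scales]
    # The docstring also lists a 'merged_logits' entry next to the per-scale ones.
    keys.append('merged_logits')
    return keys


print(expected_logit_keys([1.0, 1.5]))
# prints ['logits_1.00', 'logits_1.50', 'merged_logits']
```

Under these assumptions, the per-scale logits of one output type would be looked up with keys such as 'logits_1.50', and the merged result with 'merged_logits'.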

def multi_scale_logits(images,
                       model_options,
                       image_pyramid,
                       weight_decay=0.0001,
                       is_training=False,
                       fine_tune_batch_norm=False,
                       nas_training_hyper_parameters=None):
  """Gets the logits for multi-scale inputs.

  The returned logits are all downsampled (due to max-pooling layers)
  for both training and evaluation.

  Args:
    images: A tensor of size [batch, height, width, channels].
    model_options: A ModelOptions instance to configure models.
    image_pyramid: Input image scales for multi-scale feature extraction.
    weight_decay: The weight decay for model variables.
    is_training: Is training or not.
    fine_tune_batch_norm: Fine-tune the batch norm parameters or not.
    nas_training_hyper_parameters: A dictionary storing hyper-parameters for
      training nas models. Its keys are:
      - `drop_path_keep_prob`: Probability to keep each path in the cell when
        training.
      - `total_training_steps`: Total training steps to help drop path
        probability calculation.

  Returns:
    outputs_to_scales_to_logits: A map of maps from output_type (e.g.,
      semantic prediction) to a dictionary of multi-scale logits names to
      logits. For each output_type, the dictionary has keys which
      correspond to the scales and values which correspond to the logits.
      For example, if `scales` equals [1.0, 1.5], then the keys would
      include 'merged_logits', 'logits_1.00' and 'logits_1.50'.

  Raises:
    ValueError: If model_options doesn't specify crop_size and its
      add_image_level_feature = True, since add_image_level_feature requires
      crop_size information.
  """
  # Setup default values.
  if not image_pyramid:
    image_pyramid = [1.0]
  crop_height =