DeepLab exercises

Latest recommended article published 2023-02-04 08:00:00

northeastsqure

Article link: https://blog.csdn.net/northeastsqure/article/details/89473603

https://github.com/tensorflow/models/blob/master/research/deeplab

1. What does output_stride mean?

It is the ratio of the input image's height and width to the height and width of the output feature map.

In the code:

  logits_height = scale_dimension(
      crop_height,
      max(1.0, max(image_pyramid)) / logits_output_stride)
  logits_width = scale_dimension(
      crop_width,
      max(1.0, max(image_pyramid)) / logits_output_stride)
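To make the ratio concrete, here is a minimal pure-Python sketch of what a helper like scale_dimension computes. It assumes the "(dim - 1) * scale + 1" convention for static integer sizes; the tensor branch of the real helper is left out, so treat this as an illustration rather than the repository's exact code.

```python
def scale_dimension(dim, scale):
    """Scales a static spatial dimension, assuming (dim - 1) * scale + 1."""
    return int((float(dim) - 1.0) * scale + 1.0)


# With a 513 x 513 crop and a single-scale image pyramid of [1.0]:
image_pyramid = [1.0]
for output_stride in (16, 8):
    scale = max(1.0, max(image_pyramid)) / output_stride
    print(output_stride, scale_dimension(513, scale))
# prints "16 33" and "8 65"
```

Under this convention a crop size of the form output_stride * k + 1 maps exactly to k + 1 feature-map positions.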

1.5 What is decoder_output_stride for?

Once output_stride is specified, the model yields a dict of decoder end points. For mobilenetv2 with output_stride=16, it looks like this:

feature_extractor.networks_to_feature_maps[model_variant][
                  feature_extractor.DECODER_END_POINTS]

{4: ['layer_4/depthwise_output'], 8: ['layer_7/depthwise_output'], 16: ['layer_14/depthwise_output']}

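Here the key 4 means that the layer's feature output is 1/4 the size of the input image. So with --decoder_output_stride=4, the DeepLabv3 output is 1/4 the size of the input image, and the features fuse the highest-level features with 4: ['layer_4/depthwise_output']. Several levels can also be merged by passing --decoder_output_stride=16,8,4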
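The lookup described here can be sketched in a few lines. The dict is copied from the post; select_decoder_features is a hypothetical helper written for illustration, not a function from the deeplab repository.

```python
# Mapping copied from the post (mobilenetv2, output_stride=16).
DECODER_END_POINTS = {
    4: ['layer_4/depthwise_output'],
    8: ['layer_7/depthwise_output'],
    16: ['layer_14/depthwise_output'],
}


def select_decoder_features(decoder_output_stride, end_points=DECODER_END_POINTS):
    """Returns (stride, feature_names) pairs in the order the strides are given.

    Hypothetical helper: it only illustrates how each requested stride
    picks the low-level feature names that would be fused in the decoder.
    """
    selected = []
    for stride in decoder_output_stride:
        if stride not in end_points:
            raise ValueError('No decoder end point for stride %d' % stride)
        selected.append((stride, end_points[stride]))
    return selected


# Equivalent of --decoder_output_stride=16,8,4: refine from coarse to fine.
for stride, names in select_decoder_features([16, 8, 4]):
    print(stride, names)
```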

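1.6 What is the atrous rate command-line parameter for?

It is used in two places. The first is at the last layers of the backbone network, where atrous convolution is applied three times. The second is on the feature layer obtained after the decoder refines the result with low-level features, where atrous convolution is used to produce the final mask.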
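As a reminder of what an atrous rate does, the sketch below computes the effective extent of a dilated kernel using the standard formula k + (k - 1) * (rate - 1). The function name is made up for this example.

```python
def effective_kernel_size(kernel_size, rate):
    """Spatial extent covered by a kernel with (rate - 1) gaps between taps."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


# A 3x3 kernel keeps 9 weights but covers a wider window as the rate grows.
for rate in (1, 6, 12, 18):
    print(rate, effective_kernel_size(3, rate))
# prints "1 3", "6 13", "12 25", "18 37"
```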

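2. In the DeepLabv3 paper, images containing hard classes are duplicated for training. Which classes are these?

Chair, table, bicycle, sofa, potted plant

3. Suppose a segmentation task has 21 classes, but some classes cover very small regions. Is class balancing needed? How would it be done?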

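4. What is the depth_multiplier parameter of slim.separable_conv2d for?

A separable convolution first convolves each channel separately (the depthwise step), with a different kernel for each channel. It then uses a 1x1 convolution to merge the channels into num_outputs channels, so the number of 1x1 kernels is num_outputs. If num_outputs is None, the merging step is skipped. If depth_multiplier=2, the depthwise step uses 2 * (number of input channels) kernels.

mobilenet_v2 also has a depth_multiplier, but the two do not mean the same thing. In mobilenet_v2, depth_multiplier is a float that scales the number of channels in every layer. It is usually a fraction in (0, 1) used to reduce the channel count, although values above 1 such as 1.4 are also used to widen the network.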
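The channel bookkeeping can be checked with a small shape calculator. It assumes the TensorFlow filter layouts, [kh, kw, in_channels, depth_multiplier] for the depthwise step and [1, 1, in_channels * depth_multiplier, num_outputs] for the pointwise step; separable_conv_shapes is a hypothetical helper that does no convolution at all.

```python
def separable_conv_shapes(in_channels, kernel_size, depth_multiplier=1,
                          num_outputs=None):
    """Returns filter shapes and output channels of a separable convolution."""
    kernel_height, kernel_width = kernel_size
    depthwise_filter = (kernel_height, kernel_width, in_channels, depth_multiplier)
    depthwise_channels = in_channels * depth_multiplier
    if num_outputs is None:
        # No pointwise step: the depthwise result is the output.
        return {'depthwise_filter': depthwise_filter,
                'pointwise_filter': None,
                'output_channels': depthwise_channels}
    return {'depthwise_filter': depthwise_filter,
            'pointwise_filter': (1, 1, depthwise_channels, num_outputs),
            'output_channels': num_outputs}


print(separable_conv_shapes(32, (3, 3), depth_multiplier=2, num_outputs=64))
print(separable_conv_shapes(16, (3, 3), depth_multiplier=2))
```

With depth_multiplier=2 and 32 input channels, the depthwise step produces 64 channels before the 1x1 merge.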

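5. What are the main steps of PSPNet (Pyramid Scene Parsing Network)?

After a convolutional network extracts features, pooling at several different sizes is applied in parallel to obtain feature maps of different sizes. Each one is convolved and then upsampled, the results are concatenated with the extracted features, and a final convolution produces the segmentation map.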
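The pooling, convolution, upsampling and concatenation steps can be sketched with NumPy. This is a rough illustration under several assumptions: the bin sizes (1, 2, 3, 6) follow the PSPNet paper rather than this post, random matrices stand in for the learned 1x1 convolutions, nearest-neighbour upsampling replaces bilinear interpolation, and the final convolution is omitted.

```python
import numpy as np


def adaptive_avg_pool(features, bins):
    """Average-pools an (H, W, C) array down to (bins, bins, C)."""
    height, width, channels = features.shape
    pooled = np.zeros((bins, bins, channels), dtype=float)
    row_edges = np.linspace(0, height, bins + 1).astype(int)
    col_edges = np.linspace(0, width, bins + 1).astype(int)
    for i in range(bins):
        for j in range(bins):
            region = features[row_edges[i]:row_edges[i + 1],
                              col_edges[j]:col_edges[j + 1]]
            pooled[i, j] = region.mean(axis=(0, 1))
    return pooled


def upsample_nearest(features, height, width):
    """Nearest-neighbour resize of an (h, w, C) array to (height, width, C)."""
    rows = np.arange(height) * features.shape[0] // height
    cols = np.arange(width) * features.shape[1] // width
    return features[rows][:, cols]


def pyramid_pooling(features, bin_sizes=(1, 2, 3, 6), reduced_channels=2, seed=0):
    """Concatenates the input features with one upsampled branch per bin size."""
    rng = np.random.default_rng(seed)
    height, width, channels = features.shape
    branches = [features]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(features, bins)
        # Random stand-in for the learned 1x1 convolution of this branch.
        weights = rng.standard_normal((channels, reduced_channels))
        branches.append(upsample_nearest(pooled @ weights, height, width))
    return np.concatenate(branches, axis=-1)


features = np.arange(12 * 12 * 8, dtype=float).reshape(12, 12, 8)
print(pyramid_pooling(features).shape)
```

The output keeps the spatial size of the input and gains reduced_channels extra channels per pooling branch.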

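6. Why is the DeepLab training input size 513 rather than 512?

The aim is to keep images from being cropped. On PASCAL VOC the images are at most 512x512, so setting the crop size 1 larger means they are not cropped.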

https://github.com/tensorflow/models/issues/3939#issuecomment-380171119

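7. Introduction to the ASPP module

The ASPP module applies several different atrous rates. With output_stride=8 the rates are atrous_rate=[12, 24, 36]; with output_stride=16 they are atrous_rate=[6, 12, 18].

If squeeze_and_excitation is specified, a global pool of the feature map is added as well, resized to the feature map's size. This is also called image_feature.

A 1x1 convolution of the original features is included too.

Finally the 5 feature maps are concatenated and convolved.

I am not sure why the result is then also multiplied by image_feature.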
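The five branches can be sketched in NumPy to show how they line up. This is a simplified illustration, not the repository's implementation: weights are random stand-ins, batch norm and activations are left out, the final multiplication by image_feature is not modelled, and the function names are made up for this example.

```python
import numpy as np


def atrous_conv3x3(features, weights, rate):
    """3x3 atrous convolution with zero 'SAME' padding.

    features: (H, W, C) array, weights: (3, 3, C, D) array.
    """
    height, width, _ = features.shape
    padded = np.pad(features, ((rate, rate), (rate, rate), (0, 0)))
    output = np.zeros((height, width, weights.shape[-1]))
    for i in range(3):
        for j in range(3):
            # Tap (i, j) reads the input shifted by (i - 1, j - 1) * rate.
            window = padded[i * rate:i * rate + height,
                            j * rate:j * rate + width]
            output += window @ weights[i, j]
    return output


def aspp(features, rates, depth=4, seed=0):
    """Returns the concatenated branches and their 1x1 projection."""
    rng = np.random.default_rng(seed)
    height, width, channels = features.shape
    # Branch 1: 1x1 convolution of the original features.
    branches = [features @ rng.standard_normal((channels, depth))]
    # Branches 2-4: one 3x3 atrous convolution per rate.
    for rate in rates:
        weights = rng.standard_normal((3, 3, channels, depth))
        branches.append(atrous_conv3x3(features, weights, rate))
    # Branch 5: global average pool, 1x1 conv, broadcast back to H x W.
    image_feature = features.mean(axis=(0, 1)) @ rng.standard_normal((channels, depth))
    branches.append(np.broadcast_to(image_feature, (height, width, depth)))
    concat = np.concatenate(branches, axis=-1)
    projection = rng.standard_normal((concat.shape[-1], depth))
    return concat, concat @ projection


features = np.random.default_rng(1).standard_normal((9, 9, 3))
concat, output = aspp(features, rates=(6, 12, 18))
print(concat.shape, output.shape)
```

With three rates and depth 4, the concatenation has 5 * 4 = 20 channels before the final 1x1 projection.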

8. The preprocessing removes the color palette. Why is it done this way?

import numpy as np
from PIL import Image


def _remove_colormap(filename):
  """Removes the color map from the annotation.

  Args:
    filename: Ground truth annotation filename.

  Returns:
    Annotation without color map.
  """
  return np.array(Image.open(filename))

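PIL opens these annotation images in mode 'P', that is, as palette images. The array obtained from such an image holds the palette indices, numbers in 0~255, rather than RGB values. Index k maps to the RGB color (palette[3*k], palette[3*k+1], palette[3*k+2]) in the flat list returned by getpalette(). Converting to an array therefore drops the palette. https://stackoverflow.com/questions/51702670/tensorflow-deeplab-image-colormap-removal-confusion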
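The relationship between indices and colors can be illustrated with NumPy alone. The sketch assumes a flat palette laid out as [r0, g0, b0, r1, g1, b1, ...], which is how Pillow's Image.getpalette() returns it; the three-entry palette and the tiny label array are made up for the example.

```python
import numpy as np

# Illustrative flat palette: index 0 -> black, 1 -> dark red, 2 -> dark green.
flat_palette = [0, 0, 0, 128, 0, 0, 0, 128, 0]
palette = np.array(flat_palette, dtype=np.uint8).reshape(-1, 3)

# What np.array(...) of a mode 'P' annotation holds: class indices, not colors.
label_indices = np.array([[0, 1],
                          [2, 1]], dtype=np.uint8)

# Applying the palette turns each index back into an RGB triple.
rgb = palette[label_indices]
print(label_indices.shape, rgb.shape)
# prints "(2, 2) (2, 2, 3)"
```

Training needs the index array, since each value is already the class id of its pixel.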