TensorFlow ops illustrated: tf.nn.space_to_depth

thl789

Article link: https://blog.csdn.net/thl789/article/details/109189889

田海立@CSDN 2020-10-20

The TensorFlow op space_to_depth is the inverse of depth_to_space. This article uses text and diagrams to explain how the op works.

1. The space_to_depth prototype

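space_to_depth moves spatial data (the width and height dimensions) into the depth (channel) dimension, which is exactly the reverse of depth_to_space. In ML terms, the op takes non-overlapping blocks of block_size x block_size elements along the height and width dimensions and moves each block into depth. It therefore takes a block_size parameter, and both the height and the width of the input tensor must be integer multiples of block_size.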

The new tensor therefore has:

Width: input_width / block_size;
Height: input_height / block_size;
Depth: input_depth * block_size * block_size
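
The shape rule above can be sketched as a small helper. This is only an illustrative sketch in plain Python: the function name space_to_depth_shape is hypothetical (it is not part of TensorFlow), and it assumes the input shape is given in NHWC order.

```python
def space_to_depth_shape(input_shape, block_size):
    """Return the NHWC output shape of space_to_depth for an NHWC input shape.

    Hypothetical helper for illustration; not a TensorFlow API.
    """
    batch, height, width, depth = input_shape
    # Height and width must both be integer multiples of block_size.
    if height % block_size != 0 or width % block_size != 0:
        raise ValueError("height and width must be divisible by block_size")
    return [batch, height // block_size, width // block_size,
            depth * block_size * block_size]


print(space_to_depth_shape([1, 4, 6, 3], 2))  # [1, 2, 3, 12]
```

The total number of elements is unchanged: 1 * 4 * 6 * 3 = 1 * 2 * 3 * 12 = 72.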

The prototype is:

tf.nn.space_to_depth(
    input, block_size, data_format='NHWC', name=None
)

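Because the op strictly distinguishes the C dimension from the H and W dimensions, data_format must be specified whenever the data is not in NHWC layout.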

2. space_to_depth in code

Take space_to_depth(block_size = 2) on a tensor of shape [1, 4, 6, 3] (NHWC: height 4, width 6) as an example:

>>> import tensorflow as tf
>>> t = tf.range(4*6*3)
>>> t = tf.reshape(t, [1, 4, 6, 3])
>>> t
<tf.Tensor: shape=(1, 4, 6, 3), dtype=int32, numpy=
array([[[[ 0,  1,  2],
         [ 3,  4,  5],
         [ 6,  7,  8],
         [ 9, 10, 11],
         [12, 13, 14],
         [15, 16, 17]],

        [[18, 19, 20],
         [21, 22, 23],
         [24, 25, 26],
         [27, 28, 29],
         [30, 31, 32],
         [33, 34, 35]],

        [[36, 37, 38],
         [39, 40, 41],
         [42, 43, 44],
         [45, 46, 47],
         [48, 49, 50],
         [51, 52, 53]],

        [[54, 55, 56],
         [57, 58, 59],
         [60, 61, 62],
         [63, 64, 65],
         [66, 67, 68],
         [69, 70, 71]]]], dtype=int32)>
>>>

After running space_to_depth(block_size = 2):

>>> 
>>> t = tf.nn.space_to_depth(t, 2)
>>> t
<tf.Tensor: shape=(1, 2, 3, 12), dtype=int32, numpy=
array([[[[ 0,  1,  2,  3,  4,  5, 18, 19, 20, 21, 22, 23],
         [ 6,  7,  8,  9, 10, 11, 24, 25, 26, 27, 28, 29],
         [12, 13, 14, 15, 16, 17, 30, 31, 32, 33, 34, 35]],

        [[36, 37, 38, 39, 40, 41, 54, 55, 56, 57, 58, 59],
         [42, 43, 44, 45, 46, 47, 60, 61, 62, 63, 64, 65],
         [48, 49, 50, 51, 52, 53, 66, 67, 68, 69, 70, 71]]]], dtype=int32)>
>>>
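
To read the output above, it can help to map an output position back to the input position it came from. The sketch below is plain Python and does not call TensorFlow; the helper name source_index is hypothetical, and it assumes the NHWC layout, where the new depth axis is ordered by block row, then block column, then original channel.

```python
def source_index(out_index, block_size, in_depth):
    """Map an NHWC output index of space_to_depth back to its input index.

    Hypothetical helper for illustration; not a TensorFlow API.
    """
    b, oh, ow, od = out_index
    offset, c = divmod(od, in_depth)     # which cell of the block, which channel
    dh, dw = divmod(offset, block_size)  # row and column inside the block
    return (b, oh * block_size + dh, ow * block_size + dw, c)


# In the original [1, 4, 6, 3] input, element [b, h, w, c] holds the value
# (h * 6 + w) * 3 + c, because the tensor was built with tf.range.
b, h, w, c = source_index((0, 1, 2, 7), block_size=2, in_depth=3)
print((b, h, w, c), (h * 6 + w) * 3 + c)  # (0, 3, 4, 1) 67
```

This agrees with the printed result: the last row of the output is [48, ..., 53, 66, 67, ...], and its element at depth index 7 is 67.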

3. How space_to_depth processes the data

The space_to_depth operation processes the data as follows:

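1. Select a block of shape [in_batch, block_size, block_size, in_depth] as a tensor;
2. Reshape the tensor from Step #1 to [in_batch, 1, 1, in_depth * block_size * block_size];
3. Moving first along the width direction and then along the height direction, select the remaining blocks in the same way as in Step #1;
4. Reshape each tensor from Step #3 in the same way as in Step #2;
5. Concatenate the new tensors obtained in Step #2 and Step #4.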

The shape of the final tensor is therefore [in_batch, in_height / block_size, in_width / block_size, in_depth * (block_size * block_size)].
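
The steps above can be sketched in plain Python over nested lists. This is a minimal illustration under stated assumptions, not how TensorFlow implements the op: the function name space_to_depth_nhwc is hypothetical, the input is assumed to be in NHWC layout, and each batch entry is processed separately rather than as one [in_batch, ...] block.

```python
def space_to_depth_nhwc(x, block_size):
    """Pure-Python space_to_depth for a nested list x in NHWC layout.

    Hypothetical helper for illustration; not a TensorFlow API.
    """
    height, width = len(x[0]), len(x[0][0])
    out = []
    for image in x:                                 # one batch entry at a time
        out_rows = []
        for h0 in range(0, height, block_size):     # ... then along the height
            out_row = []
            for w0 in range(0, width, block_size):  # first along the width
                # Select one block_size x block_size block and flatten it
                # (block row, block column, channel) into the depth axis.
                pixel = []
                for dh in range(block_size):
                    for dw in range(block_size):
                        pixel.extend(image[h0 + dh][w0 + dw])
                out_row.append(pixel)
            out_rows.append(out_row)                # stitch the results together
        out.append(out_rows)
    return out


# Rebuild the [1, 4, 6, 3] example tensor holding the values 0..71.
t = [[[[(h * 6 + w) * 3 + c for c in range(3)] for w in range(6)]
      for h in range(4)]]
result = space_to_depth_nhwc(t, 2)
print(result[0][0][0])  # [0, 1, 2, 3, 4, 5, 18, 19, 20, 21, 22, 23]
```

The printed row matches the first row of the tf.nn.space_to_depth output shown in section 2.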

Shown in a single figure, the process described above looks like this: