transpose, pad, space_to_depth (function summary)

Latest recommended article published on 2023-12-30 09:13:50

LuckyFucky

Tags: python tensorflow

Article link: https://blog.csdn.net/qq_43521665/article/details/115258598

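I have recently been reading code from the YOLO family of models, so this post collects a few functions that come up again and again.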
①

tran_x = tf.transpose(x, [0, 3, 1, 2])  # channel-first mode (NHWC -> NCHW)

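It took me a long time to understand this function. Reading the source code did not help much; the key is to understand what the perm parameter means.
perm is a permutation of the tensor's axes, so its length equals the rank. For a 4-D tensor the axes are numbered (0, 1, 2, 3), and so on. If perm is omitted, the default is a full transpose that reverses the axes, which is (3, 2, 1, 0) for a 4-D tensor; the components x, y, z, w of every element's index are reordered in the same way. For example, the x below has shape (2, 2, 3) with axes (x, y, z), and the element 5 sits at index (0, 1, 1). With perm=[1, 2, 0], i.e. (y, z, x), the first output axis takes the old y value, the second takes the old z value, and the third takes the old x value, so the index of 5 becomes (1, 1, 0).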

  >>> x = tf.constant([[[ 1,  2,  3],
  ...                   [ 4,  5,  6]],
  ...                  [[ 7,  8,  9],
  ...                   [10, 11, 12]]])

  As above, simply calling `tf.transpose` will default to `perm=[2,1,0]`.

  To take the transpose of the matrices in dimension-0 (such as when you are
  transposing matrices where 0 is the batch dimension), you would set
  `perm=[0,2,1]`.

  >>> tf.transpose(x, perm=[0, 2, 1])
  <tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
  array([[[ 1,  4],
          [ 2,  5],
          [ 3,  6]],
         [[ 7, 10],
          [ 8, 11],
          [ 9, 12]]], dtype=int32)>
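
The perm rule described above can be checked without TensorFlow. The following is a minimal plain-Python sketch, not how tf.transpose is implemented: `permute` is a hypothetical helper introduced here for illustration, and the NHWC sizes (8, 416, 416, 3) are assumed values, not taken from any particular YOLO code.

```python
# Plain-Python sketch of how perm reorders shapes and element indices.
# permute() is a hypothetical helper, not part of TensorFlow.

def permute(values, perm):
    """Output axis i takes its value from input axis perm[i]."""
    return tuple(values[p] for p in perm)

shape = (2, 2, 3)        # shape of x in the example above
index_of_5 = (0, 1, 1)   # x[0][1][1] == 5

perm = [1, 2, 0]
new_shape = permute(shape, perm)        # (2, 3, 2)
new_index = permute(index_of_5, perm)   # (1, 1, 0)
print(new_shape, new_index)

# The channel-first call tf.transpose(x, [0, 3, 1, 2]) on an NHWC tensor.
# The sizes below are assumptions chosen only for illustration.
nchw_shape = permute((8, 416, 416, 3), [0, 3, 1, 2])  # (8, 3, 416, 416)
print(nchw_shape)
```

The same one-line rule applies to both the shape and every element's index, which is why the element 5 moves from (0, 1, 1) to (1, 1, 0).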

②

inputs = tf.pad(inputs, paddings=[[0, 0], [1, 0], [1, 0], [0, 0]], mode='CONSTANT')

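pad is a very common padding function. The call above leaves the first and fourth dimensions of the 4-D inputs tensor unchanged and pads the second and third (here, one row on top and one column on the left). In image processing the first dimension is batch_size and the fourth is the depth (channels), so neither needs to change; the padding is applied to the top/bottom and left/right of the feature map. See the example below.
[2,0] pads top/bottom first, adding 2 rows on top and none at the bottom; [1,1] then pads left/right, adding one column on each side.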

import tensorflow as tf

t = tf.constant([[1, 2], [4, 5], [8, 9]])
paddings = tf.constant([[2, 0], [1, 1]])  # [[top, bottom], [left, right]]
t = tf.pad(t, paddings, "CONSTANT")
print(t)
# Result
tf.Tensor(
[[0 0 0 0]
 [0 0 0 0]
 [0 1 2 0]
 [0 4 5 0]
 [0 8 9 0]], shape=(5, 4), dtype=int32)
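
The effect of paddings on the output shape follows one rule per axis: new size = before + old size + after. Here is a minimal plain-Python sketch of that rule; `padded_shape` is a hypothetical helper, not a TensorFlow function, and the 4-D input sizes are assumed values.

```python
# Plain-Python sketch of the shape rule behind tf.pad.
# padded_shape() is a hypothetical helper, not part of TensorFlow.

def padded_shape(shape, paddings):
    """Each axis grows by its [before, after] padding amounts."""
    return tuple(before + dim + after
                 for dim, (before, after) in zip(shape, paddings))

# 2-D example from the text: a 3x2 tensor, 2 rows on top, 1 column per side.
small = padded_shape((3, 2), [[2, 0], [1, 1]])  # (5, 4)
print(small)

# The 4-D call from the text on an NHWC feature map (sizes are assumptions):
# batch and channels are untouched, height and width each grow by one.
feature_map = padded_shape((8, 416, 416, 3),
                           [[0, 0], [1, 0], [1, 0], [0, 0]])  # (8, 417, 417, 3)
print(feature_map)
```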

③

return tf.nn.space_to_depth(x, block_size=stride)

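Put simply, the tensor gets smaller spatially and longer in depth.
space_to_depth moves blocks of the height and width dimensions into the depth dimension. It resembles pooling, but where pooling keeps a single value from each block, this op keeps one spatial position per block and stacks all the remaining values along the depth axis, so nothing is discarded and low-level features are preserved. block_size plays the role of the pooling block size. The figure below illustrates this with block_size=2.
[Figure: space_to_depth with block_size=2]
Image source: the article "图解TensorFlow op：tf.nn.space_to_depth", which is very detailed and worth reading if you want to learn more.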
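
To make the index rule concrete, here is a minimal plain-Python sketch for a single image stored as nested lists in height, width, channel order. It is an illustration only, not TensorFlow's implementation: `space_to_depth_hwc` is a hypothetical helper, and the channel ordering (row within the block, then column within the block, then original channel) is meant to mirror the op's behaviour for the default NHWC layout.

```python
# Plain-Python sketch of space_to_depth for one HWC image (nested lists).
# space_to_depth_hwc() is a hypothetical helper, not part of TensorFlow.

def space_to_depth_hwc(img, block_size):
    height, width, channels = len(img), len(img[0]), len(img[0][0])
    if height % block_size or width % block_size:
        raise ValueError("height and width must be divisible by block_size")
    return [
        [
            [
                # Gather every value of one block into the depth axis.
                img[h * block_size + bh][w * block_size + bw][c]
                for bh in range(block_size)
                for bw in range(block_size)
                for c in range(channels)
            ]
            for w in range(width // block_size)
        ]
        for h in range(height // block_size)
    ]

# A 4x4 image with one channel holding the values 1..16.
img = [[[r * 4 + c + 1] for c in range(4)] for r in range(4)]
out = space_to_depth_hwc(img, 2)
print(out)
# [[[1, 2, 5, 6], [3, 4, 7, 8]], [[9, 10, 13, 14], [11, 12, 15, 16]]]
```

The 4x4x1 input becomes 2x2x4: height and width are divided by block_size and depth is multiplied by block_size squared, with every input value still present in the output.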