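I came across the article "【TensorFlow】tf.nn.max_pool实现池化操作" (implementing pooling with tf.nn.max_pool) online and could not follow it at first. It finally made sense after I discussed it with classmates, so I am writing up an explanation here, partly to deepen my own understanding of pooling.
The original matrix looks like this, with shape [2, 4, 4]: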
import tensorflow as tf

# Two 4x4 matrices stacked together: shape [2, 4, 4]
a = tf.constant([
    [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [8.0, 7.0, 6.0, 5.0],
     [4.0, 3.0, 2.0, 1.0]],
    [[4.0, 3.0, 2.0, 1.0],
     [8.0, 7.0, 6.0, 5.0],
     [1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
])
$$
\left[ \begin{matrix}
\left[ \begin{matrix} 1 & 2 & 3 & 4\\ 5 & 6 & 7 & 8\\ 8 & 7 & 6 & 5\\ 4 & 3 & 2 & 1 \end{matrix} \right] \\ \\
\left[ \begin{matrix} 4 & 3 & 2 & 1\\ 8 & 7 & 6 & 5\\ 1 & 2 & 3 & 4\\ 5 & 6 & 7 & 8 \end{matrix} \right]
\end{matrix} \right]
$$
a_new = tf.reshape(a, [1, 4, 4, 2])
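After the reshape it becomes the tensor below, which is used as the input. Its dimensions mean (batch_size, height, width, channels). The number of channels is 2, so there are two channels: red marks channel 1 and blue marks channel 2. Each bracketed block is one row of the image, and each line inside a block is one pixel with its two channel values.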
$$
\left[ \begin{matrix}
\left[ \begin{matrix} \textcolor{red}{1} & \textcolor{blue}{2} \\ \textcolor{red}{3} & \textcolor{blue}{4} \\ \textcolor{red}{5} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{6} & \textcolor{blue}{5} \\ \textcolor{red}{4} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{4} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \\ \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{6} & \textcolor{blue}{5} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{1} & \textcolor{blue}{2} \\ \textcolor{red}{3} & \textcolor{blue}{4} \\ \textcolor{red}{5} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\end{matrix} \right]
$$
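It is easy to assume that the two original 4x4 matrices become the two channels, but the reshape only regroups the 32 values in row-major order. The sketch below uses NumPy as a stand-in for TensorFlow (an assumption made so the result can be checked without a TensorFlow install; `np.reshape` and `tf.reshape` both read values in row-major order) and prints each channel as an ordinary 4x4 image. The names `channel_1` and `channel_2` are only illustrative.

```python
import numpy as np

# Same values as `a` in the article, shape (2, 4, 4).
a = np.array([
    [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [8.0, 7.0, 6.0, 5.0],
     [4.0, 3.0, 2.0, 1.0]],
    [[4.0, 3.0, 2.0, 1.0],
     [8.0, 7.0, 6.0, 5.0],
     [1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]],
])

# Row-major reshape: the values are read in order and regrouped
# as (batch, height, width, channels). Nothing is transposed.
a_new = a.reshape(1, 4, 4, 2)

channel_1 = a_new[0, :, :, 0]  # the "red" entries
channel_2 = a_new[0, :, :, 1]  # the "blue" entries
print(channel_1)
print(channel_2)
```

Viewed this way, channel 1 holds the odd-position values of the flattened data and channel 2 the even-position ones, which is why neither channel matches either of the original 4x4 matrices.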
pooling = tf.nn.max_pool(a_new, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='VALID')
The general form of the pooling call:
pooling = tf.nn.max_pool(
h,
ksize=[1, height, width, 1],
strides=[1, 1, 1, 1],
padding='VALID',
name="pool")
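h: the input to be pooled. A pooling layer usually follows a convolutional layer, so the input is typically a feature map, still with shape [batch_size, height, width, channels].
ksize: the size of the pooling window, given as a four-element vector, usually [1, height, width, 1]. The first and last entries are 1 because we do not want to pool across the batch or channels dimensions.
strides: the step of the window along each dimension, usually [1, stride, stride, 1].
padding: the padding mode, 'SAME' or 'VALID'. 'SAME' pads the borders of the input so that the output size depends only on the stride (with stride 1 it equals the input size); 'VALID' adds no padding, so only windows that fit entirely inside the input are used.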
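To make the effect of `strides` and `padding` concrete, here is a minimal sketch of the output length along one spatial dimension. The helper name `pool_output_size` is hypothetical (it is not a TensorFlow function); the two formulas are the usual rules for 'VALID' and 'SAME' padding, and the sketch assumes the window is no larger than the input.

```python
import math

def pool_output_size(in_size, window, stride, padding):
    """Output length along one spatial dimension (hypothetical helper)."""
    if padding == "VALID":
        # Only windows that fit entirely inside the input are counted.
        return math.ceil((in_size - window + 1) / stride)
    if padding == "SAME":
        # The input is padded so that only the stride matters.
        return math.ceil(in_size / stride)
    raise ValueError(f"unknown padding: {padding!r}")

print(pool_output_size(4, 2, 1, "VALID"))  # 3
print(pool_output_size(4, 2, 1, "SAME"))   # 4
print(pool_output_size(4, 2, 2, "VALID"))  # 2
```

The first call corresponds to the example in this article: a 4x4 input with a 2x2 window, stride 1 and 'VALID' padding gives a 3x3 output per channel.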
Here ksize is [1, 2, 2, 1], so the window has the following shape:
$$
\left[ \begin{matrix} \left[ \begin{matrix} \Box \\ \Box \end{matrix} \right] \left[ \begin{matrix} \Box \\ \Box \end{matrix} \right] \end{matrix} \right]
$$
This ksize window is slid over each of the two channels of a_new separately (the green entries mark the current window):
$$
\left[ \begin{matrix}
\left[ \begin{matrix} \textcolor{green}{\mathbf{1}} & \textcolor{blue}{2} \\ \textcolor{green}{\mathbf{3}} & \textcolor{blue}{4} \\ \textcolor{red}{5} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{green}{\mathbf{8}} & \textcolor{blue}{7} \\ \textcolor{green}{\mathbf{6}} & \textcolor{blue}{5} \\ \textcolor{red}{4} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{4} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \\ \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{6} & \textcolor{blue}{5} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{1} & \textcolor{blue}{2} \\ \textcolor{red}{3} & \textcolor{blue}{4} \\ \textcolor{red}{5} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\end{matrix} \right]
$$
$$
\left[ \begin{matrix}
\left[ \begin{matrix} \textcolor{red}{1} & \textcolor{green}{\mathbf{2}} \\ \textcolor{red}{3} & \textcolor{green}{\mathbf{4}} \\ \textcolor{red}{5} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{8} & \textcolor{green}{\mathbf{7}} \\ \textcolor{red}{6} & \textcolor{green}{\mathbf{5}} \\ \textcolor{red}{4} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{4} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \\ \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{6} & \textcolor{blue}{5} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{1} & \textcolor{blue}{2} \\ \textcolor{red}{3} & \textcolor{blue}{4} \\ \textcolor{red}{5} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\end{matrix} \right]
$$
…
$$
\left[ \begin{matrix}
\left[ \begin{matrix} \textcolor{red}{1} & \textcolor{blue}{2} \\ \textcolor{green}{\mathbf{3}} & \textcolor{blue}{4} \\ \textcolor{green}{\mathbf{5}} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{green}{\mathbf{6}} & \textcolor{blue}{5} \\ \textcolor{green}{\mathbf{4}} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{4} & \textcolor{blue}{3} \\ \textcolor{red}{2} & \textcolor{blue}{1} \\ \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{6} & \textcolor{blue}{5} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{1} & \textcolor{blue}{2} \\ \textcolor{red}{3} & \textcolor{blue}{4} \\ \textcolor{red}{5} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\end{matrix} \right]
$$
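In the description below, "column" means one of the four bracketed blocks in the diagrams and "row" means a row inside a block. First, max_pool is applied to the first two rows of the first and second columns of channel 1 (the values 1, 3, 8, 6), giving a maximum of 8. Next, the same positions of channel 2 (2, 4, 7, 5) give a maximum of 7. Once the first two rows are done, the window moves on to the middle two rows of the first and second columns of channel 1 (3, 5, 6, 4), giving a maximum of 6. Continuing in this way, the final result is: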
$$
\left[ \begin{matrix}
\left[ \begin{matrix} \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{6} & \textcolor{blue}{6} \\ \textcolor{red}{7} & \textcolor{blue}{8} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{8} & \textcolor{blue}{7} \end{matrix} \right]
\left[ \begin{matrix} \textcolor{red}{4} & \textcolor{blue}{4} \\ \textcolor{red}{8} & \textcolor{blue}{7} \\ \textcolor{red}{8} & \textcolor{blue}{8} \end{matrix} \right]
\end{matrix} \right]
$$
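The walkthrough above can be reproduced with a naive sliding-window loop. This is a sketch in NumPy under stated assumptions, not TensorFlow's actual implementation: the function name `max_pool_valid` is hypothetical, and it handles only the NHWC layout with 'VALID' padding.

```python
import numpy as np

def max_pool_valid(x, window_h, window_w, stride=1):
    """Naive max pooling over an NHWC array with VALID padding."""
    batch, height, width, channels = x.shape
    out_h = (height - window_h) // stride + 1
    out_w = (width - window_w) // stride + 1
    out = np.empty((batch, out_h, out_w, channels), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = x[:, top:top + window_h, left:left + window_w, :]
            # Reduce over the two spatial axes only, so every batch
            # element and every channel is pooled independently.
            out[:, i, j, :] = patch.max(axis=(1, 2))
    return out

# The 32 values of `a` in flattened order, regrouped as (1, 4, 4, 2).
values = [1, 2, 3, 4, 5, 6, 7, 8,
          8, 7, 6, 5, 4, 3, 2, 1,
          4, 3, 2, 1, 8, 7, 6, 5,
          1, 2, 3, 4, 5, 6, 7, 8]
a_new = np.array(values, dtype=np.float32).reshape(1, 4, 4, 2)

pooled = max_pool_valid(a_new, 2, 2)
print(pooled.shape)        # (1, 3, 3, 2)
print(pooled[0, :, :, 0])  # channel 1
print(pooled[0, :, :, 1])  # channel 2
```

Channel 1 comes out as [[8, 6, 7], [8, 8, 8], [4, 8, 8]] and channel 2 as [[7, 6, 8], [7, 7, 7], [4, 7, 8]], which matches the red and blue entries of the result above.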
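Shape changes in the max_pool operation
h = [batch_size, height_1, width_1, channels]
ksize = [1, height_2, width_2, 1]
With strides = [1, 1, 1, 1] and padding='VALID', as in this example, the resulting pool has shape
[batch_size, height_1 - height_2 + 1, width_1 - width_2 + 1, channels]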
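As a quick check, plugging this article's numbers into the formula (input height and width 4, window height and width 2) gives the 3x3 output seen earlier. The second case, a 3x3 window on the same input, is a hypothetical variation added only for comparison.

```latex
\begin{aligned}
\text{ksize} = [1, 2, 2, 1]:\quad & 4 - 2 + 1 = 3 \;\Rightarrow\; \text{shape } [1, 3, 3, 2] \\
\text{ksize} = [1, 3, 3, 1]:\quad & 4 - 3 + 1 = 2 \;\Rightarrow\; \text{shape } [1, 2, 2, 2]
\end{aligned}
```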