For the convolution part, see: Implementing the convolution operation in Python
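1. Pooling
An important purpose of pooling is to further process the features produced by convolution, condensing the data and thereby easing memory pressure during computation. Pooling takes a region of a given size and represents all the values in that region with a single representative element. If the average is used, the operation is called mean (average) pooling; if the maximum is used, it is called max pooling.
Max pooling is illustrated in the figure below.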
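After pooling, the spatial dimensions of the feature map shrink, so later layers have less data to process and, for fully connected layers that follow, fewer trainable parameters. This tends to improve generalization and helps reduce overfitting. The pooling layer itself has no trainable parameters. Although the dimensionality is reduced, each window keeps its representative response (the maximum or the average), so the dominant features are largely preserved even though fine spatial detail is discarded.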
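To make the difference between the two pooling types concrete, here is a small hand-made example. The 4 × 4 values are arbitrary and chosen only for illustration; the reshape trick used here only works for non-overlapping windows that tile the input exactly.

```python
import numpy as np

# A single-channel 4 x 4 feature map (values are made up for illustration).
feature = np.array([[1, 3, 2, 4],
                    [5, 7, 6, 8],
                    [9, 2, 1, 0],
                    [3, 4, 5, 6]], dtype=float)

# Split into non-overlapping 2 x 2 blocks.
# Axes after reshape: (block_row, row_in_block, block_col, col_in_block)
blocks = feature.reshape(2, 2, 2, 2)

# Reduce over the two "inside the block" axes.
max_pooled = blocks.max(axis=(1, 3))
mean_pooled = blocks.mean(axis=(1, 3))

print(max_pooled)   # [[7. 8.] [9. 6.]]
print(mean_pooled)  # [[4.  5. ] [4.5 3. ]]
```

Each output value summarizes one 2 × 2 block: max pooling keeps the strongest response, while mean pooling smooths over the block.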
Parameters:
- Input size: $B \times H \times W \times C$
- Pooling window: $N \times N$
- Stride: $S$ (the stride is usually set equal to the pooling window size, so that the sliding regions do not overlap)
Pooling does not change the number of channels, so the output size after pooling is

$$B \times \left\lfloor \frac{H - N}{S} + 1 \right\rfloor \times \left\lfloor \frac{W - N}{S} + 1 \right\rfloor \times C$$
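The output-size formula can be checked with a few lines of plain Python. This is a minimal sketch; the helper name `pooled_shape` is hypothetical and not part of the original code, and it assumes an input laid out as (B, H, W, C) with no padding.

```python
def pooled_shape(input_shape, n, s):
    """Return the (B, H_out, W_out, C) shape after n x n pooling with stride s."""
    b, h, w, c = input_shape
    if n > h or n > w:
        raise ValueError("pooling window is larger than the input")
    # For non-negative integers, floor((h - n) / s + 1) == (h - n) // s + 1.
    out_h = (h - n) // s + 1
    out_w = (w - n) // s + 1
    return (b, out_h, out_w, c)


print(pooled_shape((1, 4, 4, 3), n=2, s=2))     # (1, 2, 2, 3)
print(pooled_shape((8, 28, 28, 16), n=3, s=2))  # (8, 13, 13, 16)
print(pooled_shape((1, 5, 5, 3), n=2, s=2))     # (1, 2, 2, 3)
```

The batch size and channel count pass through unchanged; only the two spatial dimensions shrink.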
2. Code implementation

    import math

    import numpy as np


    class Pool2d:
        """Max / mean pooling over (B, H, W, C) input, without zero padding."""

        def __init__(self, inputShape, poolingSize, stride=2, type="max"):
            if type not in ("max", "mean"):
                raise ValueError('type must be "max" or "mean"')
            self.batchSize = inputShape[0]
            self.height = inputShape[1]
            self.width = inputShape[2]
            self.stride = stride
            self.type = type
            self.size = poolingSize
            # Output shape: each spatial size is floor((dim - size) / stride + 1)
            self.output = np.zeros([self.batchSize,
                                    math.floor((self.height - self.size) / self.stride + 1),
                                    math.floor((self.width - self.size) / self.stride + 1),
                                    inputShape[-1]])

        def forward(self, x):
            polOut = np.zeros(self.output.shape)
            for i in range(self.batchSize):
                # Pool each image in the batch independently
                polOut[i] = self.im2pol(x[i], self.size, self.stride)
            return polOut

        def im2pol(self, image, size, stride):
            imagePol = []
            for i in range(image.shape[-1]):  # loop over channels
                tempList = []
                # Only visit positions where a full window fits (no padding)
                for j in range(0, image.shape[0] - size + 1, stride):
                    for k in range(0, image.shape[1] - size + 1, stride):
                        window = image[j:j + size, k:k + size, i]
                        # Two kinds of pooling
                        if self.type == "max":
                            pol = window.max()
                        else:
                            pol = window.mean()
                        tempList.append(pol)
                tempArray = np.array(tempList).reshape([self.output.shape[1], self.output.shape[2]])
                imagePol.append(tempArray)
            imagePol = np.array(imagePol)
            # Move channels last: (c, h, w) -> (h, w, c).
            # np.swapaxes(imagePol, 0, 2) would give (w, h, c) and transpose the image.
            imagePol = np.transpose(imagePol, (1, 2, 0))
            return imagePol


    inputData = np.random.random((1, 4, 4, 3))
    print("inputShape: ", inputData.shape)
    size = 2
    print("pooling size: ", size)
    pool2d = Pool2d(inputShape=inputData.shape, poolingSize=size, type="max")
    outputData = pool2d.forward(inputData)
    print("outputShape: ", outputData.shape)  # (1, 2, 2, 3)
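The nested Python loops above are easy to follow but slow on large inputs. As a possible alternative, the same no-padding pooling can be sketched with NumPy's `sliding_window_view` (available in NumPy 1.20 and later). The function name `pool2d_vectorized` and its `mode` argument are hypothetical and are not part of the class above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def pool2d_vectorized(x, n, s, mode="max"):
    """Pool a (B, H, W, C) array with an n x n window and stride s, no padding."""
    # All n x n windows over the two spatial axes.
    # Resulting shape: (B, H - n + 1, W - n + 1, C, n, n)
    windows = sliding_window_view(x, (n, n), axis=(1, 2))
    # Keep only the windows whose top-left corner lies on the stride grid.
    windows = windows[:, ::s, ::s]
    if mode == "max":
        return windows.max(axis=(-2, -1))
    if mode == "mean":
        return windows.mean(axis=(-2, -1))
    raise ValueError('mode must be "max" or "mean"')


# Non-square input, so an accidental height/width swap would change the shape.
x = np.arange(2 * 5 * 6 * 3, dtype=float).reshape(2, 5, 6, 3)
out = pool2d_vectorized(x, n=2, s=2, mode="max")
print(out.shape)  # (2, 2, 3, 3)
```

Testing with a non-square input is a cheap way to catch axis-order mistakes, which stay hidden when the height and width are equal.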
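As with convolution, pooling works by sliding a window over the input.
Note, however, that no zero padding is used here, so any trailing rows or columns that do not fill a complete window are dropped.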
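The effect of omitting zero padding shows up when the input size is not an exact multiple of the stride. The following sketch uses an arbitrary 5 × 5 single-channel input with a 2 × 2 window and stride 2, and records which input positions are ever covered by a window.

```python
import numpy as np

image = np.arange(25, dtype=float).reshape(5, 5)  # values 0..24, row by row
n, s = 2, 2

out_h = (image.shape[0] - n) // s + 1  # 2
out_w = (image.shape[1] - n) // s + 1  # 2

pooled = np.empty((out_h, out_w))
covered = np.zeros(image.shape, dtype=bool)  # marks pixels seen by some window

for r in range(out_h):
    for c in range(out_w):
        top, left = r * s, c * s
        pooled[r, c] = image[top:top + n, left:left + n].max()
        covered[top:top + n, left:left + n] = True

print(pooled)                 # [[ 6.  8.] [16. 18.]]
print(covered[-1].any())      # False: the last row is never pooled
print(covered[:, -1].any())   # False: the last column is never pooled
```

The largest input value (24, in the bottom-right corner) never reaches the output, because no complete window covers it.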
References:
https://blog.csdn.net/qq_41661809/article/details/96500250
https://blog.csdn.net/qq_28266311/article/details/94555082