Parsing the Cifar10 dataset with convolutional neural networks in Paddle
A walkthrough of convolutional neural networks
This project walks through several landmark convolutional neural networks, using the Cifar10 dataset.
The project is based on teacher Lu Ping's original. The annotations proceed top-down: the earlier parts are commented in detail, while the later parts may be left unannotated.
If you have any questions, feel free to leave a comment or message me directly.
Case 1: the AlexNet network
The AlexNet model was developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, and won the 2012 ImageNet challenge. Compared with LeNet, AlexNet has more layers, uses ReLU activations, and introduces Dropout in the fully connected layers to prevent overfitting. The model is analyzed in detail below.
Link to the original project
import paddle
import paddle.nn.functional as F
import numpy as np
from paddle.vision.transforms import Compose, Resize, Transpose, Normalize
# Prepare the data
t = Compose([Resize(size=227), Normalize(mean=[127.5, 127.5, 127.5], std=[127.5, 127.5, 127.5], data_format='HWC'), Transpose()]) # resize to 227x227, scale pixels to [-1, 1], then transpose HWC -> CHW
cifar10_train = paddle.vision.datasets.cifar.Cifar10(mode='train', transform=t, backend='cv2')
cifar10_test = paddle.vision.datasets.cifar.Cifar10(mode="test", transform=t, backend='cv2')
Cache file /home/aistudio/.cache/paddle/dataset/cifar/cifar-10-python.tar.gz not found, downloading https://dataset.bj.bcebos.com/cifar/cifar-10-python.tar.gz
Begin to download
Download finished
# Inspect one sample: each item is an (image, label) tuple
for i in cifar10_train:
    print(type(i))
    print(i[0].shape)
    print(i)
    break
<class 'tuple'>
(3, 227, 227)
(array([[[ 0.39607844, 0.39607844, 0.39607844, ..., 0.29411766,
0.29411766, 0.29411766],
[ 0.39607844, 0.39607844, 0.39607844, ..., 0.29411766,
0.29411766, 0.29411766],
[ 0.39607844, 0.39607844, 0.39607844, ..., 0.29411766,
0.29411766, 0.29411766],
...,
[-0.19215687, -0.19215687, -0.19215687, ..., -0.28627452,
-0.28627452, -0.28627452],
[-0.19215687, -0.19215687, -0.19215687, ..., -0.28627452,
-0.28627452, -0.28627452],
[-0.19215687, -0.19215687, -0.19215687, ..., -0.28627452,
-0.28627452, -0.28627452]],
[[ 0.38039216, 0.38039216, 0.38039216, ..., 0.2784314 ,
0.2784314 , 0.2784314 ],
[ 0.38039216, 0.38039216, 0.38039216, ..., 0.2784314 ,
0.2784314 , 0.2784314 ],
[ 0.38039216, 0.38039216, 0.38039216, ..., 0.2784314 ,
0.2784314 , 0.2784314 ],
...,
[-0.24705882, -0.24705882, -0.24705882, ..., -0.34117648,
-0.34117648, -0.34117648],
[-0.24705882, -0.24705882, -0.24705882, ..., -0.34117648,
-0.34117648, -0.34117648],
[-0.24705882, -0.24705882, -0.24705882, ..., -0.34117648,
-0.34117648, -0.34117648]],
[[ 0.48235294, 0.48235294, 0.48235294, ..., 0.3647059 ,
0.3647059 , 0.3647059 ],
[ 0.48235294, 0.48235294, 0.48235294, ..., 0.3647059 ,
0.3647059 , 0.3647059 ],
[ 0.48235294, 0.48235294, 0.48235294, ..., 0.3647059 ,
0.3647059 , 0.3647059 ],
...,
[-0.2784314 , -0.2784314 , -0.2784314 , ..., -0.39607844,
-0.39607844, -0.39607844],
[-0.2784314 , -0.2784314 , -0.2784314 , ..., -0.39607844,
-0.39607844, -0.39607844],
[-0.2784314 , -0.2784314 , -0.2784314 , ..., -0.39607844,
-0.39607844, -0.39607844]]], dtype=float32), array(0))
Parsing the AlexNet convolutional network
The convolution operation: Conv2D
paddle.nn.Conv2D
Parameters involved:
in_channels (int) - number of channels in the input image.
out_channels (int) - number of channels produced by the convolution.
kernel_size (int|list|tuple) - size of the convolution kernel. Either a single integer or a tuple/list of two integers giving the kernel height and width; a single integer means the height and width both equal that value.
stride (int|list|tuple, optional) - stride. Either a single integer or a tuple/list of two integers giving the stride along the height and width; a single integer means both strides equal that value. Default: 1.
padding (int|list|tuple|str, optional) - amount of padding.
Take paddle.nn.Conv2D(3, 96, 11, 4, 0) as an example:
3: three input channels
96: number of output channels
11: kernel size (fw = fh = 11)
4: stride (s = 4)
0: padding (p = 0)
Input size: 3 * 227 * 227 (xw = xh = 227)
By the output-size formula new_w = (xw + 2p - fw)/s + 1:
new_w = (227 + 2*0 - 11)/4 + 1 = 55
new_h is likewise 55
Output size = output channels * new_h * new_w = 96 * 55 * 55
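As a quick sanity check, the same numbers can be reproduced by pushing a dummy tensor through the layer (a minimal sketch; only the shapes matter here):
conv = paddle.nn.Conv2D(3, 96, 11, 4, 0)             # in=3, out=96, k=11, s=4, p=0
x = paddle.randn([1, 3, 227, 227], dtype='float32')  # one dummy 227x227 RGB image
print(conv(x).shape)                                 # [1, 96, 55, 55], matching (227 + 2*0 - 11)/4 + 1 = 55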
The pooling operation
paddle.nn.MaxPool2D
(max pooling)
Key parameters:
kernel_size (int|list|tuple): size of the pooling window. If a tuple or list, it must contain two integers, (pool_size_Height, pool_size_Width). If a single integer, the window is square; e.g. pool_size=2 gives a 2x2 window.
stride (int|list|tuple): stride of the pooling layer. If a tuple or list, it contains two integers, (pool_stride_Height, pool_stride_Width). If a single integer, the stride equals that value along both H and W. Defaults to kernel_size.
padding (string|int|list|tuple): pooling padding.
Output size: w = h = (55 + 0 - 3)/2 + 1 = 27
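The pooling shape can be verified the same way (again just a sketch with a dummy input):
pool = paddle.nn.MaxPool2D(kernel_size=3, stride=2)  # k=3, s=2, padding defaults to 0
x = paddle.randn([1, 96, 55, 55], dtype='float32')   # shape after the first conv + ReLU
print(pool(x).shape)                                 # [1, 96, 27, 27], matching (55 + 0 - 3)/2 + 1 = 27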
# Build the model
class AlexNetModel(paddle.nn.Layer):
    def __init__(self):
        super(AlexNetModel, self).__init__()
        self.conv_pool1 = paddle.nn.Sequential(            # input size m*3*227*227
            paddle.nn.Conv2D(3, 96, 11, 4, 0),             # L1, output size m*96*55*55
            paddle.nn.ReLU(),                              # L2, output size m*96*55*55
            paddle.nn.MaxPool2D(kernel_size=3, stride=2))  # L3, output size m*96*27*27
        self.conv_pool2 = paddle.nn.Sequential(
            paddle.nn.Conv2D(96, 256, 5, 1, 2),            # L4, output size m*256*27*27
            paddle.nn.ReLU(),                              # L5, output size m*256*27*27
            paddle.nn.MaxPool2D(3, 2))                     # L6, output size m*256*13*13
        self.conv_pool3 = paddle.nn.Sequential(
            paddle.nn.Conv2D(256, 384, 3, 1, 1),           # L7, output size m*384*13*13
            paddle.nn.ReLU())                              # L8, output size m*384*13*13
        self.conv_pool4 = paddle.nn.Sequential(
            paddle.nn.Conv2D(384, 384, 3, 1, 1),           # L9, output size m*384*13*13
            paddle.nn.ReLU())                              # L10, output size m*384*13*13
        self.conv_pool5 = paddle.nn.Sequential(
            paddle.nn.Conv2D(384, 256, 3, 1, 1),           # L11, output size m*256*13*13
            paddle.nn.ReLU(),                              # L12, output size m*256*13*13
            paddle.nn.MaxPool2D(3, 2))                     # L13, output size m*256*6*6
        self.full_conn = paddle.nn.Sequential(
            paddle.nn.Linear(256*6*6, 4096),               # L14, output size m*4096
            paddle.nn.ReLU(),                              # L15, output size m*4096
            paddle.nn.Dropout(0.5),                        # L16, output size m*4096
            paddle.nn.Linear(4096, 4096),                  # L17, output size m*4096
            paddle.nn.ReLU(),                              # L18, output size m*4096
            paddle.nn.Dropout(0.5),                        # L19, output size m*4096
            paddle.nn.Linear(4096, 10))                    # L20, output size m*10
        self.flatten = paddle.nn.Flatten()

    def forward(self, x): # forward pass
        x = self.conv_pool1(x)
        x = self.conv_pool2(x)
        x = self.conv_pool3(x)
        x = self.conv_pool4(x)
        x = self.conv_pool5(x)
        x = self.flatten(x)
        x = self.full_conn(x)
        return x
epoch_num = 2           # number of training epochs
batch_size = 256
learning_rate = 0.0001
val_acc_history = []
val_loss_history = []
model = AlexNetModel()
paddle.summary(model, (1, 3, 227, 227))  # per-layer summary for one 3x227x227 input
---------------------------------------------------------------------------
Layer (type) Input Shape Output Shape Param #
===========================================================================
Conv2D-1 [[1, 3, 227, 227]] [1, 96, 55, 55] 34,944
ReLU-1 [[1, 96, 55, 55]] [1, 96, 55, 55] 0
MaxPool2D-1 [[1, 96, 55, 55]] [1, 96, 27, 27] 0
Conv2D-2 [[1, 96, 27, 27]] [1, 256, 27, 27] 614,656
ReLU-2 [[1, 256, 27, 27]] [1, 256, 27, 27] 0
MaxPool2D-2 [[1, 256, 27, 27]] [1, 256, 13, 13] 0
Conv2D-3 [[1, 256, 13, 13]] [1, 384, 13, 13] 885,120
ReLU-3 [[1, 384, 13, 13]] [1, 384, 13, 13] 0
Conv2D-4 [[1, 384, 13, 13]] [1, 384, 13, 13] 1,327,488
ReLU-4 [[1, 384, 13, 13]] [1, 384, 13, 13] 0
Conv2D-5 [[1, 384, 13, 13]] [1, 256, 13, 13] 884,992
ReLU-5 [[1, 256, 13, 13]] [1, 256, 13, 13] 0
MaxPool2D-3 [[1, 256, 13, 13]] [1, 256, 6, 6] 0
Flatten-1 [[1, 256, 6, 6]] [1, 9216] 0
Linear-1 [[1, 9216]] [1, 4096] 37,752,832
ReLU-6 [[1, 4096]] [1, 4096] 0
Dropout-1 [[1, 4096]] [1, 4096] 0
Linear-2 [[1, 4096]] [1, 4096] 16,781,312
ReLU-7 [[1, 4096]] [1, 4096] 0
Dropout-2 [[1, 4096]] [1, 4096] 0
Linear-3 [[1, 4096]] [1, 10] 40,970
===========================================================================
Total params: 58,322,314
Trainable params: 58,322,314
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.59
Forward/backward pass size (MB): 11.11
Params size (MB): 222.48
Estimated Total Size (MB): 234.18
---------------------------------------------------------------------------
{'total_params': 58322314, 'trainable_params': 58322314}
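The parameter counts in the summary can be checked by hand: a Conv2D layer holds out_channels * (in_channels * kh * kw) weights plus out_channels biases, and a Linear layer holds in_features * out_features weights plus out_features biases. For example:
print(96 * (3 * 11 * 11) + 96)   # 34944, the Conv2D-1 row
print(9216 * 4096 + 4096)        # 37752832, the Linear-1 row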
def train(model):
    # switch the model to training mode
    model.train()
    opt = paddle.optimizer.Adam(learning_rate=learning_rate, parameters=model.parameters()) # Adam optimizer
    train_loader = paddle.io.DataLoader(cifar10_train, shuffle=True, batch_size=batch_size) # shuffle the training set
    valid_loader = paddle.io.DataLoader(cifar10_test, batch_size=batch_size)
    for epoch in range(epoch_num): # loop over epochs
        for batch_id, data in enumerate(train_loader()): # unpack (image, label) batches
            x_data = paddle.cast(data[0], 'float32') # cast to the expected dtypes
            y_data = paddle.cast(data[1], 'int64')
            y_data = paddle.reshape(y_data, (-1, 1)) # labels as a column vector for the loss
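            # --- the source is truncated here; below is a minimal sketch of how
            # the training step usually continues in Paddle 2.x (forward, loss,
            # backward, update), assuming F.cross_entropy as the loss ---
            logits = model(x_data)                  # forward pass
            loss = F.cross_entropy(logits, y_data)  # cross-entropy over the batch
            loss.backward()                         # backpropagate
            opt.step()                              # apply the Adam update
            opt.clear_grad()                        # reset gradients for the next batch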