>- **🍨 This post is a study-log entry for the [🔗365天深度学习训练营] (365-day deep learning training camp)**
>- **🍖 Original author: [K同学啊]**
- Difficulty: consolidating the basics ⭐⭐
- Language: Python 3, TensorFlow 2
- Timeframe: September 5 to September 9

🍺 Requirements:
- Build the VGG-16 network yourself
- Call the official VGG-16 implementation

🍻 Stretch goals (optional):
- Reach 100% accuracy on the validation set
- Draw the VGG-16 architecture diagram in PowerPoint (a skill you will need when publishing papers)

🔎 Exploration (fairly difficult):
- Make the model lighter without hurting accuracy
  - The current VGG-16 Total params is 134,276,932

🚀 My environment:
- Language: Python 3.11.7
- Editor: Jupyter Notebook
- Deep learning framework: TensorFlow 2.13.0
I. Preliminary Work
1. Setting up the GPU
If you are running on a CPU, you can skip this step.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]], "GPU")
gpus
2. Importing the Data
import pathlib

data_dir = r"D:\THE MNIST DATABASE\T7"  # raw string, so the backslashes are not treated as escapes
data_dir = pathlib.Path(data_dir)
Check the number of images:
image_count = len(list(data_dir.glob('*/*.png')))
print("Total number of images:", image_count)
Output:
Total number of images: 1200
II. Data Preprocessing
1. Loading the Data
Use the image_dataset_from_directory method to load the data from disk into a tf.data.Dataset.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(224, 224),
    batch_size=32
)
Output:
Found 1200 files belonging to 4 classes.
Using 960 files for training.
Load the validation set:
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(224, 224),
    batch_size=32
)
Output:
Found 1200 files belonging to 4 classes.
Using 240 files for validation.
Print the dataset's labels via class_names; the labels correspond to the directory names in alphabetical order.
class_names=train_ds.class_names
print(class_names)
Output:
['Dark', 'Green', 'Light', 'Medium']
2. Visualizing the Data
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 4))
for images, labels in train_ds.take(1):
    for i in range(10):
        ax = plt.subplot(2, 5, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
Output:
Check the image tensor shapes:
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
Output:
(32, 224, 224, 3)
(32,)
3. Configuring the Dataset
- shuffle(): shuffles the data; a detailed introduction is at https://zhuanlan.zhihu.com/p/42417456
- prefetch(): prefetches data to speed up execution; see my previous two posts for details.
- cache(): caches the dataset in memory to speed up execution.
Preprocess both datasets. The validation set is only cached and prefetched; it is not shuffled.
AUTOTUNE=tf.data.AUTOTUNE
train_ds=train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds=val_ds.cache().prefetch(buffer_size=AUTOTUNE)
Normalize the image data: dividing the inputs by 255 scales the pixel values into the range 0 to 1. Then use the map function to apply the normalization layer to every sample in the training set train_ds and the validation set val_ds, so all images are normalized before they reach the network.
from tensorflow.keras import layers

# layers.Rescaling replaces the older layers.experimental.preprocessing.Rescaling path
normalization_layer = layers.Rescaling(1. / 255)
train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))
Take one batch of images and labels from the validation set and store the first image in first_image. Then use numpy's min and max functions to print the minimum and maximum of first_image, which confirms the normalization, e.g. that all pixel values lie between 0 and 1.
import numpy as np

image_batch, labels_batch = next(iter(val_ds))
first_image = image_batch[0]
# inspect the normalized data
print(np.min(first_image), np.max(first_image))
Output:
0.0 1.0
III. Building the VGG-16 Network
Pros and cons of VGG:
- Pros
The structure of VGG is very clean: the entire network uses the same convolution kernel size (3x3) and the same max-pooling size (2x2).
- Cons
1) Training takes a long time, and tuning is difficult.
2) It requires a lot of storage, which hinders deployment. For example, the VGG-16 weight file is over 500 MB, making it impractical to install on embedded systems.
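The "clean structure" point can be made concrete: stacking two 3x3 convolutions covers the same receptive field as a single 5x5 convolution, but costs fewer parameters. A small illustrative calculation (the helper name conv_params is hypothetical, not from this post):

```python
# Parameter count of a single k x k convolution layer with biases:
# k * k * c_in weights per output channel, plus one bias per output channel.
def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out + c_out

c = 256  # channel width of VGG-16's third block
two_3x3 = 2 * conv_params(3, c, c)   # two stacked 3x3 conv layers
one_5x5 = conv_params(5, c, c)       # one 5x5 conv layer, same receptive field
print(two_3x3, one_5x5)              # 1180160 1638656
```

Note that conv_params(3, 256, 256) = 590,080, which matches the per-layer counts in the model summary below.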
1. Building the Model by Hand
from tensorflow.keras import layers, models, Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout

def vgg16(nb_classes, input_shape):
    input_tensor = Input(shape=input_shape)
    # 1st block
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_tensor)
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 2nd block
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 3rd block
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 4th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 5th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    # fully connected layers
    x = Flatten()(x)
    x = Dense(4096, activation='relu')(x)
    x = Dense(4096, activation='relu')(x)
    output_tensor = Dense(nb_classes, activation='softmax', name='predictions')(x)
    model = Model(input_tensor, output_tensor)
    return model

model = vgg16(len(class_names), (224, 224, 3))
model.summary()
model=vgg16(len(class_names),(224,224,3))
model.summary()
Output:
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0
conv2d (Conv2D) (None, 224, 224, 64) 1792
conv2d_1 (Conv2D) (None, 224, 224, 64) 36928
max_pooling2d (MaxPooling2 (None, 112, 112, 64) 0
D)
conv2d_2 (Conv2D) (None, 112, 112, 128) 73856
conv2d_3 (Conv2D) (None, 112, 112, 128) 147584
max_pooling2d_1 (MaxPoolin (None, 56, 56, 128) 0
g2D)
conv2d_4 (Conv2D) (None, 56, 56, 256) 295168
conv2d_5 (Conv2D) (None, 56, 56, 256) 590080
conv2d_6 (Conv2D) (None, 56, 56, 256) 590080
max_pooling2d_2 (MaxPoolin (None, 28, 28, 256) 0
g2D)
conv2d_7 (Conv2D) (None, 28, 28, 512) 1180160
conv2d_8 (Conv2D) (None, 28, 28, 512) 2359808
conv2d_9 (Conv2D) (None, 28, 28, 512) 2359808
max_pooling2d_3 (MaxPoolin (None, 14, 14, 512) 0
g2D)
conv2d_10 (Conv2D) (None, 14, 14, 512) 2359808
conv2d_11 (Conv2D) (None, 14, 14, 512) 2359808
conv2d_12 (Conv2D) (None, 14, 14, 512) 2359808
max_pooling2d_4 (MaxPoolin (None, 7, 7, 512) 0
g2D)
flatten (Flatten) (None, 25088) 0
dense (Dense) (None, 4096) 102764544
dense_1 (Dense) (None, 4096) 16781312
predictions (Dense) (None, 4) 16388
=================================================================
Total params: 134276932 (512.23 MB)
Trainable params: 134276932 (512.23 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
2. Network Architecture Diagram
For background on how convolutions are computed, see: 卷积的计算_卷积核2x2-CSDN博客
Structure:
- 13 convolutional layers, named blockX_convX
- 3 fully connected layers, named fcX and predictions
- 5 pooling layers, named blockX_pool
VGG-16 contains 16 layers with weights (13 convolutional plus 3 fully connected), hence the name VGG-16.
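The requirements at the top also ask us to call the official VGG-16. A minimal sketch using tf.keras.applications (with weights=None the architecture is built with random initialization, which is what this 4-class task needs):

```python
import tensorflow as tf

# The official VGG-16 from tf.keras.applications; with weights=None and
# classes=4 it matches the hand-built model above layer for layer.
official_model = tf.keras.applications.VGG16(
    weights=None,               # no pretrained weights; use 'imagenet' for transfer learning
    input_shape=(224, 224, 3),
    classes=4
)
official_model.summary()        # Total params: 134,276,932, same as the custom model
```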
IV. Compiling the Model
Before the model is ready for training, a few more settings are needed. They are added in the model's compile step:
- Loss function (loss): measures how far the model's predictions are from the labels during training.
- Optimizer (optimizer): determines how the model is updated based on the data it sees and its loss function.
- Metrics (metrics): used to monitor the training and testing steps. The example below uses accuracy, the fraction of images that are classified correctly.
# set the initial learning rate
initial_learning_rate = 1e-4

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate,
    decay_steps=30,
    decay_rate=0.92,
    staircase=True
)

# set the optimizer; pass the schedule so the decay actually takes effect
opt = tf.keras.optimizers.Adam(learning_rate=lr_schedule)

model.compile(optimizer=opt,
              # the model's last layer is a softmax, so the loss receives
              # probabilities rather than logits
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
V. Training the Model
epochs = 20

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
Output:
Epoch 1/20
30/30 [==============================] - 410s 13s/step - loss: 1.2716 - accuracy: 0.3740 - val_loss: 0.8226 - val_accuracy: 0.6208
Epoch 2/20
30/30 [==============================] - 381s 13s/step - loss: 0.6675 - accuracy: 0.7177 - val_loss: 0.5283 - val_accuracy: 0.8250
Epoch 3/20
30/30 [==============================] - 381s 13s/step - loss: 0.4723 - accuracy: 0.8167 - val_loss: 0.3166 - val_accuracy: 0.8792
Epoch 4/20
30/30 [==============================] - 382s 13s/step - loss: 0.4238 - accuracy: 0.8479 - val_loss: 0.5151 - val_accuracy: 0.8583
Epoch 5/20
30/30 [==============================] - 382s 13s/step - loss: 0.3667 - accuracy: 0.8687 - val_loss: 0.1793 - val_accuracy: 0.9500
Epoch 6/20
30/30 [==============================] - 380s 13s/step - loss: 0.2302 - accuracy: 0.9198 - val_loss: 0.1420 - val_accuracy: 0.9625
Epoch 7/20
30/30 [==============================] - 382s 13s/step - loss: 0.1465 - accuracy: 0.9510 - val_loss: 0.1185 - val_accuracy: 0.9750
Epoch 8/20
30/30 [==============================] - 383s 13s/step - loss: 0.1396 - accuracy: 0.9615 - val_loss: 0.2551 - val_accuracy: 0.9250
Epoch 9/20
30/30 [==============================] - 380s 13s/step - loss: 0.1657 - accuracy: 0.9375 - val_loss: 0.1414 - val_accuracy: 0.9667
Epoch 10/20
30/30 [==============================] - 391s 13s/step - loss: 0.0741 - accuracy: 0.9729 - val_loss: 0.1156 - val_accuracy: 0.9542
Epoch 11/20
30/30 [==============================] - 393s 13s/step - loss: 0.1185 - accuracy: 0.9573 - val_loss: 0.1389 - val_accuracy: 0.9500
Epoch 12/20
30/30 [==============================] - 394s 13s/step - loss: 0.0769 - accuracy: 0.9729 - val_loss: 0.0815 - val_accuracy: 0.9667
Epoch 13/20
30/30 [==============================] - 396s 13s/step - loss: 0.1232 - accuracy: 0.9719 - val_loss: 0.1685 - val_accuracy: 0.9625
Epoch 14/20
30/30 [==============================] - 393s 13s/step - loss: 0.0919 - accuracy: 0.9740 - val_loss: 0.2258 - val_accuracy: 0.9125
Epoch 15/20
30/30 [==============================] - 395s 13s/step - loss: 0.0513 - accuracy: 0.9792 - val_loss: 0.0464 - val_accuracy: 0.9833
Epoch 16/20
30/30 [==============================] - 395s 13s/step - loss: 0.0208 - accuracy: 0.9917 - val_loss: 0.1353 - val_accuracy: 0.9583
Epoch 17/20
30/30 [==============================] - 396s 13s/step - loss: 0.0246 - accuracy: 0.9906 - val_loss: 0.1329 - val_accuracy: 0.9667
Epoch 18/20
30/30 [==============================] - 395s 13s/step - loss: 0.0280 - accuracy: 0.9896 - val_loss: 0.0882 - val_accuracy: 0.9708
Epoch 19/20
30/30 [==============================] - 395s 13s/step - loss: 0.0489 - accuracy: 0.9865 - val_loss: 0.1122 - val_accuracy: 0.9667
Epoch 20/20
30/30 [==============================] - 414s 14s/step - loss: 0.1912 - accuracy: 0.9469 - val_loss: 0.1024 - val_accuracy: 0.9708
VI. Visualizing the Results
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
Output:
VII. Modifying the Model
Modify the model from the task above, with two goals:
1. make the model lighter;
2. improve its accuracy on the validation set.
In VGG-16, most of the parameters are stacked in the fully connected layers, so we reduce the parameters there by slimming down the three fully connected layers as follows:
# fully connected layers
x = Flatten()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(108, activation='relu')(x)
output_tensor = Dense(nb_classes, activation='softmax', name='predictions')(x)
The modified model summary:
Model: "model_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) [(None, 224, 224, 3)] 0
conv2d_26 (Conv2D) (None, 224, 224, 64) 1792
conv2d_27 (Conv2D) (None, 224, 224, 64) 36928
max_pooling2d_10 (MaxPooli (None, 112, 112, 64) 0
ng2D)
conv2d_28 (Conv2D) (None, 112, 112, 128) 73856
conv2d_29 (Conv2D) (None, 112, 112, 128) 147584
max_pooling2d_11 (MaxPooli (None, 56, 56, 128) 0
ng2D)
conv2d_30 (Conv2D) (None, 56, 56, 256) 295168
conv2d_31 (Conv2D) (None, 56, 56, 256) 590080
conv2d_32 (Conv2D) (None, 56, 56, 256) 590080
max_pooling2d_12 (MaxPooli (None, 28, 28, 256) 0
ng2D)
conv2d_33 (Conv2D) (None, 28, 28, 512) 1180160
conv2d_34 (Conv2D) (None, 28, 28, 512) 2359808
conv2d_35 (Conv2D) (None, 28, 28, 512) 2359808
max_pooling2d_13 (MaxPooli (None, 14, 14, 512) 0
ng2D)
conv2d_36 (Conv2D) (None, 14, 14, 512) 2359808
conv2d_37 (Conv2D) (None, 14, 14, 512) 2359808
conv2d_38 (Conv2D) (None, 14, 14, 512) 2359808
max_pooling2d_14 (MaxPooli (None, 7, 7, 512) 0
ng2D)
flatten_2 (Flatten) (None, 25088) 0
dense_4 (Dense) (None, 1024) 25691136
dense_5 (Dense) (None, 108) 110700
predictions (Dense) (None, 4) 436
=================================================================
Total params: 40516960 (154.56 MB)
Trainable params: 40516960 (154.56 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
This cuts almost two-thirds of the parameters, shrinking the model from over 500 MB to roughly 155 MB.
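The "almost two-thirds" figure is easy to verify with a quick back-of-the-envelope calculation over the fully connected layers (the helper name dense_params is illustrative, not from this post):

```python
flat = 7 * 7 * 512  # = 25088 features coming out of Flatten

# A Dense layer has n_in * n_out weights plus n_out biases.
def dense_params(n_in, n_out):
    return n_in * n_out + n_out

original_fc = dense_params(flat, 4096) + dense_params(4096, 4096) + dense_params(4096, 4)
slim_fc = dense_params(flat, 1024) + dense_params(1024, 108) + dense_params(108, 4)

print(original_fc - slim_fc)                 # 93759972 parameters removed
print((original_fc - slim_fc) / 134276932)   # ≈ 0.70 of the whole model
```

134,276,932 − 93,759,972 = 40,516,960, exactly the Total params reported in the modified summary.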
Training the model again gives the following results:
Epoch 1/20
30/30 [==============================] - 357s 12s/step - loss: 1.3838 - accuracy: 0.2552 - val_loss: 1.3614 - val_accuracy: 0.2125
Epoch 2/20
30/30 [==============================] - 352s 12s/step - loss: 1.0836 - accuracy: 0.4198 - val_loss: 0.6425 - val_accuracy: 0.5750
Epoch 3/20
30/30 [==============================] - 353s 12s/step - loss: 0.7003 - accuracy: 0.6073 - val_loss: 0.7149 - val_accuracy: 0.5708
Epoch 4/20
30/30 [==============================] - 354s 12s/step - loss: 0.6017 - accuracy: 0.6927 - val_loss: 0.6296 - val_accuracy: 0.6833
Epoch 5/20
30/30 [==============================] - 354s 12s/step - loss: 0.4709 - accuracy: 0.7667 - val_loss: 0.3687 - val_accuracy: 0.8458
Epoch 6/20
30/30 [==============================] - 353s 12s/step - loss: 0.3756 - accuracy: 0.8240 - val_loss: 0.3689 - val_accuracy: 0.8750
Epoch 7/20
30/30 [==============================] - 353s 12s/step - loss: 0.1816 - accuracy: 0.9365 - val_loss: 0.2323 - val_accuracy: 0.9167
Epoch 8/20
30/30 [==============================] - 352s 12s/step - loss: 0.1583 - accuracy: 0.9417 - val_loss: 0.1248 - val_accuracy: 0.9625
Epoch 9/20
30/30 [==============================] - 353s 12s/step - loss: 0.0802 - accuracy: 0.9719 - val_loss: 0.1001 - val_accuracy: 0.9625
Epoch 10/20
30/30 [==============================] - 353s 12s/step - loss: 0.0710 - accuracy: 0.9698 - val_loss: 0.0279 - val_accuracy: 0.9917
Epoch 11/20
30/30 [==============================] - 352s 12s/step - loss: 0.1886 - accuracy: 0.9250 - val_loss: 0.3112 - val_accuracy: 0.8625
Epoch 12/20
30/30 [==============================] - 352s 12s/step - loss: 0.0653 - accuracy: 0.9719 - val_loss: 0.1767 - val_accuracy: 0.9542
Epoch 13/20
30/30 [==============================] - 352s 12s/step - loss: 0.1293 - accuracy: 0.9583 - val_loss: 0.0622 - val_accuracy: 0.9792
Epoch 14/20
30/30 [==============================] - 353s 12s/step - loss: 0.0319 - accuracy: 0.9927 - val_loss: 0.0325 - val_accuracy: 0.9917
Epoch 15/20
30/30 [==============================] - 353s 12s/step - loss: 0.0158 - accuracy: 0.9958 - val_loss: 0.0357 - val_accuracy: 0.9917
Epoch 16/20
30/30 [==============================] - 354s 12s/step - loss: 0.0109 - accuracy: 0.9958 - val_loss: 0.0232 - val_accuracy: 0.9958
Epoch 17/20
30/30 [==============================] - 353s 12s/step - loss: 0.0160 - accuracy: 0.9948 - val_loss: 0.0508 - val_accuracy: 0.9917
Epoch 18/20
30/30 [==============================] - 354s 12s/step - loss: 0.0614 - accuracy: 0.9802 - val_loss: 0.1200 - val_accuracy: 0.9583
Epoch 19/20
30/30 [==============================] - 354s 12s/step - loss: 0.0594 - accuracy: 0.9833 - val_loss: 0.0754 - val_accuracy: 0.9792
Epoch 20/20
30/30 [==============================] - 353s 12s/step - loss: 0.0848 - accuracy: 0.9708 - val_loss: 0.1904 - val_accuracy: 0.9417
The model peaks at 99.58% validation accuracy, close to 100%: it was made lighter while its validation accuracy improved.
VIII. Takeaways
In this project I went through building the VGG-16 model by hand and gained a thorough understanding of it as a whole. I also completed the lightweighting task, cutting the parameter count by about two-thirds while raising the validation accuracy to 99.58%. Because I ran everything on a CPU, training took too long and I did not complete the 100% validation-accuracy task; I hope to redo the model and hyperparameter tuning on a GPU in the future and take another shot at 100% validation accuracy.