K同学 [365-Day Deep Learning Training Camp] Week 1 Notes: T1 MNIST Handwritten Digit Recognition and T2 Color Image Classification

Article link: https://blog.csdn.net/afive54/article/details/134705887

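I joined K同学's training camp, and this post records how it actually went.

Because I already have some background, I am covering a little more each week than the basic camp's schedule.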

>- **🍨 This post is a study-log blog for the [🔗365天深度学习训练营](https://mp.weixin.qq.com/s/rbOOmire8OocQ90QM78DRA) (365-Day Deep Learning Training Camp)**
>- **🍖 Original author: [K同学啊 | tutoring and custom projects](https://mtyjkh.blog.csdn.net/)**

My environment:

- OS: WSL2 + Ubuntu 22.04

- Language: Python 3.8.18

- Editor: VS Code + Jupyter Notebook

- Deep learning framework: TensorFlow 1.15.5 → 2.10.0

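T1: MNIST handwritten digit recognition

🍺 This week's tasks:

Get the program running (done)
Understand what deep learning is (done)

🍻 Stretch goal (optional)

Study the functions and methods mentioned in the article (done)

For the GPU setup, the training camp's sample code is: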

import tensorflow as tf
gpus = tf.config.list_physical_devices("GPU")

if gpus:
    gpu0 = gpus[0]  # If there are multiple GPUs, use only GPU 0
    tf.config.experimental.set_memory_growth(gpu0, True)  # Allocate GPU memory on demand
    tf.config.set_visible_devices([gpu0],"GPU")

Because of my TensorFlow version, this raised an error:

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_186980/3925929416.py in <module>
      1 import tensorflow as tf
----> 2 gpus = tf.config.list_physical_devices("GPU")
      3 
      4 if gpus:
      5     gpu0 = gpus[0] # If there are multiple GPUs, use only GPU 0

~/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow_core/python/util/module_wrapper.py in __getattr__(self, name)
    191   def __getattr__(self, name):
    192     try:
--> 193       attr = getattr(self._tfmw_wrapped_module, name)
    194     except AttributeError:
    195       if not self._tfmw_public_apis:

AttributeError: module 'tensorflow._api.v1.config' has no attribute 'list_physical_devices'

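The error occurs because tf.config.list_physical_devices is not available in TensorFlow 1.x.

There are two options: use TensorFlow 1.x's own GPU setup code,

or upgrade to TensorFlow 2.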
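For the first option, here is a minimal sketch of a GPU setup that tolerates both versions. It assumes that TensorFlow 1.15 exposes the device functions under tf.config.experimental (list_physical_devices, set_memory_growth, set_visible_devices). That is my understanding of the 1.15 API, not something tested in this post. The helper names list_gpus and configure_first_gpu are made up for illustration.

```python
def list_gpus(tf_config):
    """Return physical GPUs using whichever listing function this TensorFlow has.

    tf_config is the tf.config module. The stable name is tried first; the
    experimental name is the fallback (assumed to be what TensorFlow 1.15 offers).
    """
    lister = getattr(tf_config, "list_physical_devices", None)
    if lister is None:
        lister = tf_config.experimental.list_physical_devices
    return lister("GPU")


def configure_first_gpu(tf_config):
    """Restrict TensorFlow to GPU 0 and let its memory grow on demand."""
    gpus = list_gpus(tf_config)
    if gpus:
        gpu0 = gpus[0]
        # Both setters are reached through the experimental namespace here,
        # on the assumption that it exists in 1.15 as well as in 2.x.
        tf_config.experimental.set_memory_growth(gpu0, True)
        tf_config.experimental.set_visible_devices([gpu0], "GPU")
    return gpus
```

Usage would be `import tensorflow as tf` followed by `configure_first_gpu(tf.config)`, called before any model or session is created.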

After reinstalling TensorFlow 2.10.0, the output is:

2023-12-02 17:37:55.799209: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-02 17:37:59.008280: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdirectml.d6f03b303ac3c4f2eeb8ca631688c9757b361310.so
2023-12-02 17:37:59.008423: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libdxcore.so
2023-12-02 17:37:59.012521: I tensorflow/c/logging.cc:34] Successfully opened dynamic library libd3d12.so
2023-12-02 17:37:59.247338: I tensorflow/c/logging.cc:34] DirectML device enumeration: found 1 compatible adapters.

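Solved.

However, training on 2.10 runs much slower than it did on 1.15.5. I don't know the cause; the upgrade turned out to be a step backward in speed.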

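T2: Color image classification

👉 Requirements:

Learn how to write a complete deep learning program (done)
Understand how classifying color images differs from classifying grayscale images (done)
Reach 72% accuracy on the test set (reached 77.26%, done)

Looking at the training log:
Training accuracy was still rising steadily at epoch 10.

My first idea was simply to raise the number of training epochs to 50: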

history = model.fit(train_images, train_labels, epochs=50, 
                    validation_data=(test_images, test_labels))

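The results are as follows:

As the number of epochs increased, accuracy on the training set kept rising, while accuracy on the validation set barely changed.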
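This pattern can also be read off the numbers instead of the plot. The sketch below assumes a dict shaped like the `history.history` returned by `model.fit`, with the keys `accuracy` and `val_accuracy` (the names TensorFlow 2.x uses when the metric is `'accuracy'`). The function name summarize_history is hypothetical, and the values are made up for illustration. They are not the results from this run.

```python
def summarize_history(history):
    """Summarize a Keras-style history dict of per-epoch accuracies."""
    train_acc = history["accuracy"]
    val_acc = history["val_accuracy"]
    # Epoch (1-based) at which validation accuracy peaked.
    best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1
    # Gap between training and validation accuracy at the last epoch.
    final_gap = train_acc[-1] - val_acc[-1]
    return best_epoch, final_gap


# Made-up numbers for illustration only, not results from this post.
example = {
    "accuracy": [0.45, 0.60, 0.70, 0.80, 0.90],
    "val_accuracy": [0.50, 0.62, 0.68, 0.69, 0.68],
}
best_epoch, final_gap = summarize_history(example)
print(best_epoch, round(final_gap, 2))
```

A validation peak many epochs before the end, combined with a large final gap, is the signature described above.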

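This is clear overfitting, so I made the following changes to the model:

1. Add dropout when building the model, randomly dropping some neurons to avoid overfitting to noise.

2. Since the input images have few pixels, discarding the outermost ring would lose much of the images' edge features, so I added same padding.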
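To see how much same padding preserves, the arithmetic can be sketched for one 32-pixel axis. The axis passes through three 3x3 convolutions, with 2x2 max pooling after the first two, which is the layer order of the model in this post. The helper names conv_out and trace_sizes are made up for illustration. The "valid" case is the same stack with Keras's default padding, shown only for comparison.

```python
import math


def conv_out(size, kernel=3, stride=1, padding="valid"):
    """Spatial output size of a convolution or pooling layer along one axis."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1


def trace_sizes(size, padding):
    """Follow one axis through conv -> pool -> conv -> pool -> conv."""
    sizes = [size]
    for layer in ("conv", "pool", "conv", "pool", "conv"):
        if layer == "conv":
            size = conv_out(size, kernel=3, stride=1, padding=padding)
        else:
            # 2x2 max pooling with stride 2 and no padding (the Keras default).
            size = conv_out(size, kernel=2, stride=2, padding="valid")
        sizes.append(size)
    return sizes


print(trace_sizes(32, "valid"))  # [32, 30, 15, 13, 6, 4]
print(trace_sizes(32, "same"))   # [32, 32, 16, 16, 8, 8]
```

With 64 channels in the last convolution, Flatten therefore receives 4*4*64 = 1024 values with valid padding and 8*8*64 = 4096 with same padding.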

Modified model code:

import tensorflow as tf
from tensorflow.keras import models, layers

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    
    layers.Flatten(),
    layers.Dropout(0.4),  # Add Dropout with a rate of 0.4
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.4),  # Add Dropout with a rate of 0.4
    layers.Dense(10)
])

model.summary()
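What the two Dropout(0.4) layers do can be sketched in plain Python. This is a simplified illustration of inverted dropout on a list of numbers, not Keras's implementation. The function name dropout and its rng parameter are made up for this example.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Inverted dropout on a list of activations."""
    if not training or rate == 0:
        # At inference time the activations pass through unchanged.
        return list(values)
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)
    # Each unit is zeroed with probability `rate`; survivors are scaled up
    # so the expected sum of the activations is unchanged.
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


activations = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dropout(activations, rate=0.4, rng=random.Random(0)))
print(dropout(activations, rate=0.4, training=False))
```

Because a different random subset is dropped on each training step, the network cannot rely on any single unit. That is the regularizing effect aimed for here.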

Final result:

313/313 - 1s - loss: 0.7508 - accuracy: 0.7726 - 878ms/epoch - 3ms/step

Test accuracy reached 77.26%.

Goal met.