Deep Learning Day 5 - Convolutional Neural Network (Xception): Animal Recognition

Latest recommended article published on 2023-10-30 23:19:36

acetyl_coa1741

Category: Deep Learning. Tags: deep learning, cnn, neural networks

Article link: https://blog.csdn.net/acetyl_coa1741/article/details/126391064

>- **🍨 This post is a study log from the [🔗365-Day Deep Learning Training Camp](https://mp.weixin.qq.com/s/k-vYaC8l7uxX51WoypLkTw)**
>- **🍦 Reference article: [🔗100 Deep Learning Examples | Day 24 - Convolutional Neural Network (Xception): Animal Recognition](https://mtyjkh.blog.csdn.net/article/details/120073717)**
>- **🍖 Author: [K同学啊](https://mp.weixin.qq.com/s/k-vYaC8l7uxX51WoypLkTw)**

This post focuses on:

Building the Xception model
Depthwise separable convolution
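
As a rough illustration of why depthwise separable convolution matters, the sketch below compares the number of weights in a standard convolution with its depthwise separable counterpart. The function names and the example layer (3x3 kernel, 128 input channels, 256 output channels) are made up for illustration; biases are ignored and a depth multiplier of 1 is assumed.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard 2D convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 128 input channels, 256 output channels
standard = standard_conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable, round(standard / separable, 1))
```

For this hypothetical layer the separable version needs several times fewer weights, because the spatial filtering and the channel mixing are split into two cheap steps instead of one expensive one.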

I. Preliminary Setup

1. Set up the GPU

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]],"GPU")

# Print the GPU info to confirm a GPU is available
print(gpus)

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

2. Import the data

import matplotlib.pyplot as plt
# Chinese font support
plt.rcParams['font.sans-serif'] = ['SimHei']  # render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly

import os
import PIL

# Set the random seeds so results are as reproducible as possible
import numpy as np
np.random.seed(1)

import tensorflow as tf
tf.random.set_seed(1)

import pathlib

data_dir = "./data"

data_dir = pathlib.Path(data_dir)

3. Inspect the data

image_count = len(list(data_dir.glob('*/*')))

print("Total number of images:", image_count)

Output:

Total number of images: 4000
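
The total alone does not show whether the classes are balanced. A minimal sketch of a per-class count follows, assuming the same layout as above (one subdirectory per class under ./data). The helper name count_images_per_class is hypothetical, and unlike the glob above it skips anything that is not a regular file.

```python
import pathlib
from collections import Counter


def count_images_per_class(data_dir: str) -> dict:
    """Count the files in each class subdirectory of data_dir."""
    root = pathlib.Path(data_dir)
    # The parent directory name of every file matched by '*/*' is its class
    counts = Counter(p.parent.name for p in root.glob("*/*") if p.is_file())
    return dict(sorted(counts.items()))


# Only runs if the dataset directory used in this post exists
if pathlib.Path("./data").is_dir():
    for name, n in count_images_per_class("./data").items():
        print(f"{name}: {n}")
```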

II. Data Preprocessing

1. Load the data

Use the image_dataset_from_directory method to load the data on disk into a tf.data.Dataset.

batch_size = 2
img_height = 299
img_width  = 299

If you are on TensorFlow 2.2.0, you may hit the error module 'tensorflow.keras.preprocessing' has no attribute 'image_dataset_from_directory'. Upgrading TensorFlow fixes it.
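
To catch this before the call fails, the version string can be checked up front. The sketch below assumes the function first shipped in TensorFlow 2.3 (worth confirming against the release notes for your installation). The helper name is hypothetical; in practice you would pass it tf.__version__.

```python
import re


def has_image_dataset_from_directory(version: str) -> bool:
    """Return True if the version is at least 2.3 (the assumed minimum)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Compare as integer tuples so that "2.10" sorts after "2.3"
    return (major, minor) >= (2, 3)


print(has_image_dataset_from_directory("2.2.0"))
print(has_image_dataset_from_directory("2.4.1"))
```

Comparing integer tuples rather than raw strings avoids the trap where "2.10.0" would sort before "2.3.0".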

"""
For a detailed introduction to image_dataset_from_directory(), see: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Output:

Found 4000 files belonging to 4 classes.
Using 3200 files for training.

Run:

"""
For a detailed introduction to image_dataset_from_directory(), see: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Output:

Found 4000 files belonging to 4 classes.
Using 800 files for validation.
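
The two counts follow directly from validation_split=0.2 applied to 4000 files. The arithmetic is sketched below, together with the number of batches per epoch at batch_size = 2. The helper names are hypothetical, and truncating the validation count to an integer is an assumption that happens to make no difference for these numbers.

```python
def split_sizes(total: int, validation_split: float) -> tuple:
    """Return (training, validation) sample counts for a fractional split."""
    n_val = int(total * validation_split)
    return total - n_val, n_val


def batches_per_epoch(n_samples: int, batch_size: int) -> int:
    """Batches needed to cover n_samples (the last batch may be partial)."""
    return -(-n_samples // batch_size)  # ceiling division


n_train, n_val = split_sizes(4000, 0.2)
print(n_train, n_val)
print(batches_per_epoch(n_train, 2), batches_per_epoch(n_val, 2))
```

Both calls use the same seed (12) and the same validation_split, which is what keeps the training and validation subsets from overlapping.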

class_names gives the dataset's labels. The labels correspond to the directory names in alphabetical order.

class_names = train_ds.class_names
print(class_names)

Output:

['cat', 'chook', 'dog', 'horse']
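
Because the integer labels in each batch are indices into this list, it helps to see the mapping spelled out. The sketch below rebuilds it from the directory names alone; the helper name class_index_map is hypothetical.

```python
def class_index_map(class_names):
    """Map each class name to an integer label in sorted (alphabetical) order."""
    return {name: index for index, name in enumerate(sorted(class_names))}


# Directory names from this post, deliberately given out of order
labels = class_index_map(["horse", "cat", "dog", "chook"])
print(labels)
```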

3. Check the data again

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

Output:

(2, 299, 299, 3)
(2,)

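image_batch is a tensor of shape (2, 299, 299, 3): a batch of 2 images, each 299x299x3 (the last dimension is the RGB color channels).
labels_batch is a tensor of shape (2,); these are the labels for those 2 images.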
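
These shapes also give a rough idea of memory use, which is worth knowing before the cache() step later in this post. The sketch below assumes the images are stored as float32 (4 bytes per element); the helper name tensor_bytes is hypothetical.

```python
import math


def tensor_bytes(shape, bytes_per_element: int = 4) -> int:
    """Size in bytes of a dense tensor; 4 bytes per element assumes float32."""
    return math.prod(shape) * bytes_per_element


batch_bytes = tensor_bytes((2, 299, 299, 3))     # one batch of 2 images
train_bytes = tensor_bytes((3200, 299, 299, 3))  # all 3200 training images
print(f"one batch: {batch_bytes / 1024**2:.1f} MiB")
print(f"training set: {train_bytes / 1024**3:.2f} GiB")
```

Under that assumption the full training set takes a few GiB once decoded, so caching it in memory only pays off on a machine with enough RAM.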

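4. Configure the dataset

shuffle(): shuffles the data. For details on this function, see https://zhuanlan.zhihu.com/p/42417456
prefetch(): prefetches data to speed up execution. My previous two posts explain it in detail.
cache(): caches the dataset in memory to speed up execution.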

AUTOTUNE = tf.data.AUTOTUNE

train_ds = (
    train_ds.cache()
    .shuffle(1000)
#     .map(train_preprocessing)    # a preprocessing function can be applied here
#     .batch(batch_size)           # batch_size was already set in image_dataset_from_directory
    .prefetch(buffer_size=AUTOTUNE)
)

val_ds = (
    val_ds.cache()
    .shuffle(1000)
#     .map(val_preprocessing)    # a preprocessing function can be applied here
#     .batch(batch_size)         # batch_size was already set in image_dataset_from_directory
    .prefetch(buffer_size=AUTOTUNE)
)
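
To make the effect of shuffle(1000) more concrete, here is a pure-Python sketch of a fixed-size shuffle buffer, which is the general idea behind buffered shuffling. It is a simplified model written for illustration, not TensorFlow's implementation, and the name buffered_shuffle is hypothetical.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in an order produced by a fixed-size shuffle buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)      # fill the buffer first
            continue
        index = rng.randrange(len(buffer))
        yield buffer[index]          # emit a random buffered element
        buffer[index] = item         # replace it with the incoming one
    rng.shuffle(buffer)              # drain what is left in random order
    yield from buffer


order = list(buffered_shuffle(range(10), buffer_size=3, seed=1))
print(order)
```

In this model an element can only be emitted once it has entered the buffer, so a buffer smaller than the dataset gives only an approximate shuffle. In the pipeline above, shuffle runs on already-batched data, so the 1000-element buffer holds batches: the 1600 training batches (3200 images at batch size 2) exceed it, while the validation set fits entirely.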