[365-Day Deep Learning Training Camp] Week 7: Coffee Bean Recognition

Latest recommended article published on 2022-12-05 20:36:19

Jessica2017lj

Tags: deep learning, tensorflow, python

Article link: https://blog.csdn.net/Jessica2017lj/article/details/126789100

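This post records the coffee bean recognition task from the 365-Day Deep Learning Training Camp: setting up the GPU, loading and preprocessing the data with TensorFlow, building a VGG-16 model, and training and tuning it, ultimately raising validation accuracy to 97%.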

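Preliminaries

This is a members-only, limited-time-free article from the 🔗365-Day Deep Learning Training Camp.
If you write a study-log post based on this article, please keep the following at the top of your post:

🍨 This post is a study-log blog from the 🔗365-Day Deep Learning Training Camp

🍦 Reference article: 365-Day Deep Learning Training Camp, Week 7: Coffee Bean Recognition (readable by camp members only)

🍖 Original author: K同学啊 | tutoring and custom projects available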

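🏡 My environment:

Language: Python 3.6.5
IDE: Jupyter Lab
Deep learning framework: TensorFlow 2.4.1

⏲ Previous posts:

5-Day Learning Plan, Week 6: Hollywood Celebrity Recognition
5-Day Learning Plan, Week 5: Sneaker Brand Recognition
Difficulty: consolidating the basics
Language: Python 3, TensorFlow 2
Dates: September 5 to September 9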

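🍺 Requirements:

Build the VGG-16 network architecture yourself
Call the official VGG-16 network architecture

🍻 Stretch goals (optional):

Reach 100% accuracy on the validation set
Draw the VGG-16 architecture diagram in PowerPoint (a skill you need for publishing papers)

🔎 Exploration (fairly difficult)

Make the model more lightweight without hurting accuracy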

Setting up the GPU

If you are using a CPU, you can skip this step.

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]],"GPU")

Importing the data

from tensorflow       import keras
from tensorflow.keras import layers,models
import numpy             as np
import matplotlib.pyplot as plt
import os,PIL,pathlib

# Replace this with the path to your own copy of the dataset
data_dir = "./49-data/"
data_dir = pathlib.Path(data_dir)

image_count = len(list(data_dir.glob('*/*.png')))

print("Total number of images:", image_count)

Total number of images: 1200
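The `*/*.png` pattern above matches every PNG file one folder level below `data_dir`, so the same glob can also report how many images each class folder holds. A minimal sketch, assuming one sub-folder per class; the helper `count_images_per_class` and the folder names `class_a` and `class_b` are hypothetical and exist only so the example runs without the real dataset:

```python
import pathlib
import tempfile
from collections import Counter


def count_images_per_class(data_dir, pattern="*/*.png"):
    """Count files matching `pattern`, grouped by their parent folder name."""
    data_dir = pathlib.Path(data_dir)
    return Counter(path.parent.name for path in data_dir.glob(pattern))


# Build a tiny throwaway directory tree so the sketch runs anywhere.
# Folder names and file counts are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for class_name, n_files in [("class_a", 3), ("class_b", 2)]:
        class_dir = root / class_name
        class_dir.mkdir()
        for i in range(n_files):
            (class_dir / f"{class_name}_{i}.png").touch()

    demo_counts = count_images_per_class(root)

for class_name in sorted(demo_counts):
    print(class_name, demo_counts[class_name])
print("Total:", sum(demo_counts.values()))
```

Calling `count_images_per_class(data_dir)` on the real dataset would show whether the 1200 images are spread evenly over the class folders, which is worth knowing before reading an accuracy figure.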

import torch
import torch.nn as nn
import os,PIL,pathlib
from PIL import Image
from torchvision import transforms, datasets

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Data preprocessing

Loading the data

Use the image_dataset_from_directory method to load the data on disk into a tf.data.Dataset.

batch_size = 32
img_height = 224
img_width = 224

"""
For a detailed introduction to image_dataset_from_directory(), see: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Found 1200 files belonging to 4 classes.
Using 960 files for training.

"""
For a detailed introduction to image_dataset_from_directory(), see: https://mtyjkh.blog.csdn.net/article/details/117018789
"""
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Found 1200 files belonging to 4 classes.
Using 240 files for validation.
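Both calls use the same `validation_split=0.2` and `seed=123`, and that is what keeps the 960 training files and the 240 validation files from overlapping. The arithmetic and the role of the shared seed can be sketched in plain Python. This illustrates the idea only and is not Keras's internal implementation; the helper `split_indices` is hypothetical:

```python
import random


def split_indices(n_samples, validation_split, seed):
    """Shuffle indices with a fixed seed, then hold out the tail for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # same seed -> same order every call
    n_val = round(n_samples * validation_split)
    n_train = n_samples - n_val
    return indices[:n_train], indices[n_train:]


# Two independent calls with the same seed, mirroring the two dataset calls.
train_idx, _ = split_indices(1200, validation_split=0.2, seed=123)
_, val_idx = split_indices(1200, validation_split=0.2, seed=123)

print(len(train_idx), len(val_idx))            # 960 240
print(set(train_idx).isdisjoint(val_idx))      # True: no file lands in both subsets
```

If the two calls used different seeds, the two shuffles would differ and some files could end up in both subsets, which would inflate the validation accuracy.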

We can print the dataset's labels via class_names. The labels correspond to the directory names in alphabetical order.

class_names = train_ds.class_names
print(class_names)

['Dark', 'Green', 'Light', 'Medium']
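Because the labels follow the sorted directory names, the integer labels in each batch can be turned back into class names by a simple lookup. A small sketch using the class list printed above; the label values in `example_ids` are made up for illustration and not taken from the dataset:

```python
class_names = ["Dark", "Green", "Light", "Medium"]  # as printed above

# Sorting the folder names alphabetically reproduces the same ordering,
# whatever order the file system happens to list them in.
folder_names = ["Medium", "Dark", "Light", "Green"]
assert sorted(folder_names) == class_names

# Integer label -> class name, and the reverse lookup.
index_to_name = dict(enumerate(class_names))
name_to_index = {name: index for index, name in index_to_name.items()}


def decode_labels(label_ids):
    """Map a sequence of integer labels to their class names."""
    return [index_to_name[int(label_id)] for label_id in label_ids]


example_ids = [2, 0, 3, 1]  # hypothetical labels
print(decode_labels(example_ids))  # ['Light', 'Dark', 'Medium', 'Green']
print(name_to_index["Medium"])     # 3
```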

Visualizing the data

plt.figure(figsize=(10, 4))  # the figure is 10 wide and 4 tall

for images, labels in train_ds.take(1):
    for i in range(10):
        
        ax = plt.subplot(2, 5, i + 1)  

        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        
        plt.axis("off")

[Figure: output_21_0.png (image not available)]

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

(32, 224, 224, 3)
(32,)
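The printed shapes also allow a quick back-of-the-envelope check of how many batches make up one epoch and how much memory one image batch occupies. A small sketch, assuming the images are stored as 32-bit floats (an assumption, the post does not print the dtype):

```python
import math

n_train = 960                     # "Using 960 files for training."
batch_shape = (32, 224, 224, 3)   # (batch, height, width, channels), as printed above
bytes_per_value = 4               # assumption: float32 pixels

batch_size = batch_shape[0]
steps_per_epoch = math.ceil(n_train / batch_size)

values_per_batch = 1
for dim in batch_shape:
    values_per_batch *= dim

batch_mib = values_per_batch * bytes_per_value / 2**20

print(steps_per_epoch)    # 30
print(values_per_batch)   # 4816896
print(batch_mib)          # 18.375
```

Under the same assumption, the 30 training batches together take about 551 MiB, which is useful to know before caching the dataset in memory.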

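Configuring the dataset

shuffle(): shuffles the data; for a detailed introduction to this function, see https://zhuanlan.zhihu.com/p/42417456
prefetch(): prefetches data to speed up execution; for details, see my previous two posts, which both cover it.
cache(): caches the dataset in memory to speed up execution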

AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds   = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
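To make the `shuffle(1000)` argument less abstract, here is a pure-Python sketch of buffer-based shuffling. It illustrates the general idea and is not TensorFlow's implementation; the function `buffered_shuffle` is hypothetical. Since the dataset is already batched, the elements being shuffled are whole batches, and with 960 training images at batch size 32 there are only 30 of them, so a buffer of 1000 holds them all.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in random order using a buffer of at most `buffer_size` items."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)            # still filling the buffer
            continue
        index = rng.randrange(buffer_size)
        yield buffer[index]                # emit a random buffered item
        buffer[index] = item               # and put the new item in its slot
    rng.shuffle(buffer)                    # input exhausted: drain what is left
    yield from buffer


batches = list(range(30))                  # stand-ins for the 30 training batches

full = list(buffered_shuffle(batches, buffer_size=1000, seed=0))
tiny = list(buffered_shuffle(batches, buffer_size=1, seed=0))

print(sorted(full) == batches)   # True: every batch appears exactly once
print(tiny == batches)           # True: a buffer of 1 cannot reorder anything
```

A buffer at least as large as the dataset gives a uniform shuffle, while a small buffer only mixes elements locally, which is why the buffer size is usually chosen to be no smaller than the number of elements.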

normalization_layer = layers.experimental.preprocessing.Rescaling(1./<
