Image Classification on the Caltech 101 Dataset with CNNs

带霸气的骑士

Categories: python, machine learning. Tags: python, tensorflow, image recognition, neural networks, deep learning

Article link: https://blog.csdn.net/cough777/article/details/112534121

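This post shows how to use Python, TensorFlow, and the Caltech 101 dataset to perform image recognition with GoogLeNet. It first downloads and extracts the dataset, then converts the data to npy format. It then walks through the network training process and lists the runtime environment, such as the Python and Keras versions. Finally, early stopping is applied during training, and the resulting model reaches an accuracy of 0.9931 on the validation set.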
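The early stopping mentioned above can be illustrated with a small, framework-free sketch. This is not the post's training code: the class name `EarlyStopper`, the `patience` and `min_delta` values, and the loss values are all made up for illustration. In Keras the same idea is available as the `keras.callbacks.EarlyStopping` callback (with `monitor` and `patience` arguments); the settings the author actually used are not shown in this excerpt.

```python
class EarlyStopper:
    """Signal a stop when the monitored loss has not improved for `patience` epochs in a row."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement (assumed value)
        self.min_delta = min_delta    # smallest decrease that counts as an improvement
        self.best = float("inf")      # best validation loss seen so far
        self.wait = 0                 # epochs since the last improvement

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


# Made-up validation losses, for illustration only.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.should_stop(loss):
        stopped_at = epoch  # training would end here; later epochs never run
        break
print(stopped_at, stopper.best)
```

With these values the loop stops after three consecutive epochs without improvement, so the final (lower) loss is never reached; a larger `patience` trades longer training for a lower chance of stopping too early.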

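First, download the Caltech 101 dataset:

Caltech 101

Extract the archive; the resulting path layout is shown in the figure.

This assumes PyCharm, which treats the directory of the current project folder as the root directory.

Figure: path layout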
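Before converting anything, it can help to confirm that the archive was extracted as expected. The sketch below is an assumption-laden helper, not part of the original post: `count_images_per_category` is a hypothetical name, and the demo builds a tiny stand-in folder tree rather than touching the real dataset. For the real data, the function would be pointed at the category folder used later in the script, `dataset/Caltech 101/101_ObjectCategories`.

```python
import os
import tempfile


def count_images_per_category(path):
    """Return {category_name: number_of_files} for each sub-folder of `path`."""
    counts = {}
    for category in sorted(os.listdir(path)):
        category_dir = os.path.join(path, category)
        if os.path.isdir(category_dir):  # ignore stray files next to the folders
            counts[category] = len(os.listdir(category_dir))
    return counts


# Tiny stand-in for the extracted dataset (hypothetical folders and empty files).
demo_root = tempfile.mkdtemp()
for name, n_files in [("airplanes", 3), ("accordion", 2)]:
    os.makedirs(os.path.join(demo_root, name))
    for k in range(n_files):
        open(os.path.join(demo_root, name, "image_%04d.jpg" % k), "w").close()

demo_counts = count_images_per_category(demo_root)
print(demo_counts)
```

Because the folder names are sorted, the position of each folder in the result matches the integer label that an `enumerate(sorted(os.listdir(path)))` loop would assign to it.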

Next, convert the dataset into npy files, which are easy to read back later.

def data_process(img_size):
    """Resize every image to img_size x img_size and save images and labels as .npy files."""
    imgs = []
    labels = []
    size = (img_size, img_size)

    for i, category in enumerate(tqdm(categories)):
        category_dir = os.path.join(path, category)
        for f in os.listdir(category_dir):
            fullpath = os.path.join(category_dir, f)
            img = Image.open(fullpath)
            # Image.LANCZOS is the same filter as the deprecated Image.ANTIALIAS alias
            img = np.asarray(img.resize(size, Image.LANCZOS))
            # Keep only 3-channel (RGB) images; images with any other shape are skipped
            if img.shape == (img_size, img_size, 3):
                imgs.append(img)
                labels.append(i)  # the label is the index of the category folder
    np.save(root_path + '/' + 'x' + str(img_size), imgs)
    np.save(root_path + '/' + 'y' + str(img_size), labels)

img_size = 200
# np.save appends the ".npy" extension, so include it when checking for the file
full_path = root_path + '/' + 'x' + str(img_size) + '.npy'
if os.path.exists(full_path):
    print("{} file already exists.".format(full_path))
else:
    data_process(img_size)

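Import the required modules first; the full code is shown later.
Running this produces two files, x200.npy and y200.npy, in the dataset directory. Later steps read these two npy files directly as the input dataset.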
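Reading the two npy files back might look like the sketch below. It is a minimal illustration, not the author's loading code: the function name `load_dataset`, the pixel scaling to [0, 1], the one-hot encoding, and the 80/20 shuffled split are all assumptions. The demo writes small random arrays to a temporary directory so that it runs without the real dataset.

```python
import os
import tempfile

import numpy as np


def load_dataset(root_path, img_size, num_classes, val_fraction=0.2, seed=0):
    """Load x{img_size}.npy / y{img_size}.npy and return a shuffled train/validation split."""
    x = np.load(os.path.join(root_path, "x%d.npy" % img_size))
    y = np.load(os.path.join(root_path, "y%d.npy" % img_size))

    x = x.astype("float32") / 255.0                      # scale pixels to [0, 1]
    y_onehot = np.eye(num_classes, dtype="float32")[y]   # one-hot encode integer labels

    rng = np.random.RandomState(seed)                    # fixed seed for a repeatable split
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y_onehot[train_idx]), (x[val_idx], y_onehot[val_idx])


# Demo with small random arrays standing in for x200.npy / y200.npy.
demo_dir = tempfile.mkdtemp()
fake_x = np.random.randint(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
fake_y = np.arange(10) % 3
np.save(os.path.join(demo_dir, "x8"), fake_x)  # np.save adds the ".npy" extension
np.save(os.path.join(demo_dir, "y8"), fake_y)

(x_train, y_train), (x_val, y_val) = load_dataset(demo_dir, img_size=8, num_classes=3)
print(x_train.shape, y_train.shape, x_val.shape, y_val.shape)
```

For the files produced above, the call would presumably be `load_dataset('dataset', 200, 101)`, given the `root_path`, image size, and class count used in the post.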

The following covers the imports and the network training process.

cal_101_googlenet.py

# Import the required modules and configure the GPU
from keras import backend as K
from keras.utils import np_utils
import os
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from tqdm import tqdm
from modles.googlenet import GoogLeNetBN

# set GPU usage (tf.ConfigProto and tf.Session are TensorFlow 1.x APIs)
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
# config.gpu_options.per_process_gpu_memory_fraction = 0.4
set_session(tf.Session(config=config))

# Hyperparameters
image_size = 200
classes = 101

# Dataset paths and the number of categories
root_path = 'dataset'
path = 'dataset/Caltech 101/101_ObjectCategories'
categories = sorted(os.listdir(path))
ncategories = len(categories)
print(ncategories)

def data_process(img_size):
    """Resize every image to img_size x img_size and save images and labels as .npy files."""
    imgs = []
    labels = []
    size = (img_size, img_size)

    for i, category in enumerate(tqdm(categories)):
        category_dir = os.path.join(path, category)
        for f in os.listdir(category_dir):
            fullpath = os.path.join(category_dir, f)
            img = Image.open(fullpath)
            # Image.LANCZOS is the same filter as the deprecated Image.ANTIALIAS alias
            img = np.asarray(img.resize(size, Image.LANCZOS))
            # Keep only 3-channel (RGB) images; images with any other shape are skipped
            if img.shape == (img_size, img_size, 3):
                imgs.append(img)
                labels.append(i)  # the label is the index of the category folder
    np.save(root_path