Implementing Image Classification and Recognition Models in Python

Echo_Wish

Published 2024-04-22 08:47:51

Article link: https://blog.csdn.net/weixin_46178278/article/details/138058273

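Image classification and recognition is a core task in computer vision: it lets us automatically identify the objects, scenes, or features in an image. This article introduces the basic principles and common implementation approaches, then implements the models in Python.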

What Is Image Classification and Recognition?

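Image classification automatically assigns an image to one of a set of predefined categories; image recognition identifies the objects, scenes, or features that an image contains. For example, we can sort photos of cats and dogs into separate categories, or recognize faces or vehicles in an image.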
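To make "assigning an image to a predefined category" concrete, here is a minimal sketch of the final step of a classifier: turning one raw score per class into probabilities and picking the most likely class. The class names and scores are hypothetical values chosen for illustration; in a real model the scores would come from the network's last layer.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, class_names):
    """Return the predefined class with the highest probability, and that probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical classes and scores, for illustration only.
CLASS_NAMES = ["cat", "dog"]
label, confidence = classify([2.0, 0.5], CLASS_NAMES)
print(label, round(confidence, 3))
```

The set of categories is fixed in advance, so the classifier can only ever answer with one of the names in `CLASS_NAMES`.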

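Image Classification and Recognition Models

1. Convolutional Neural Networks (CNN)

A convolutional neural network is a deep learning model that performs very well on image classification and recognition tasks. It extracts image features by alternating convolutional and pooling layers, then classifies those features with fully connected layers. In Python, we can implement a CNN with the Keras library: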

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
# Note: ImageDataGenerator is deprecated in recent TensorFlow/Keras releases.
# If this import fails, try tensorflow.keras.preprocessing.image instead.
from keras.preprocessing.image import ImageDataGenerator

# Build the CNN: two convolution/pooling stages followed by a dense classifier
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
# A single sigmoid unit outputs the probability of the second class (binary task)
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Prepare an example dataset: augment the training images, only rescale the test images
train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

# 'train' and 'test' must each contain one subdirectory per class
train_generator = train_datagen.flow_from_directory('train', target_size=(64, 64), batch_size=32, class_mode='binary')
test_generator = test_datagen.flow_from_directory('test', target_size=(64, 64), batch_size=32, class_mode='binary')

# Train the model; the number of steps per epoch defaults to the generator length
model.fit(train_generator, epochs=10, validation_data=test_generator)
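To see how the convolution and pooling layers shrink the image before the fully connected layers take over, the following sketch traces the feature-map size through the network above. It assumes the Keras defaults used in that code (stride-1 convolutions with `'valid'` padding, and pooling whose stride equals the pool size); the helper names `conv_out`, `pool_out`, and `trace_shapes` are made up for this illustration.

```python
def conv_out(size, kernel):
    """Output side length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output side length of non-overlapping max pooling (stride equals pool size)."""
    return size // pool

def trace_shapes(size=64, channels=3, filter_counts=(32, 64)):
    """Follow the feature-map shape through each conv (3x3) and pool (2x2) stage."""
    shapes = [("input", size, channels)]
    depth = channels
    for depth in filter_counts:
        size = conv_out(size, 3)
        shapes.append(("conv", size, depth))
        size = pool_out(size, 2)
        shapes.append(("pool", size, depth))
    flat = size * size * depth  # number of values handed to the first Dense layer
    return shapes, flat

shapes, flat = trace_shapes()
for name, side, depth in shapes:
    print(f"{name:5s} -> {side} x {side} x {depth}")
print("flattened features:", flat)
```

Under these assumptions the 64 x 64 input ends up as a 14 x 14 x 64 feature map, so `Flatten` passes 12544 values to `Dense(128)`.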

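2. Pretrained Models

Besides building a convolutional neural network ourselves, we can use a pretrained model for image classification and recognition. A pretrained model has already been trained on a large-scale image dataset, so it can be applied directly when our target classes are among those it was trained on. Common pretrained models include VGG, ResNet, and Inception. In Python, we can load and use these models with the Keras library: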

from keras.applications import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np

# Load VGG16 with its ImageNet classification head.
# decode_predictions needs the 1000-class output, so the top layers must be included.
model = VGG16(weights='imagenet', include_top=True)

# Prepare an example image: VGG16 expects 224 x 224 RGB input
img_path = 'example.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # add the batch dimension
x = preprocess_input(x)

# Run the image through VGG16
preds = model.predict(x)

# Print the three most likely ImageNet classes
print('Predictions:', decode_predictions(preds, top=3)[0])
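Keras's `decode_predictions` expects one probability per ImageNet class (the output of the full classifier, not intermediate features) and returns the highest-scoring classes for each image. The sketch below mimics that top-k selection in plain Python so the step is easy to inspect. The function `top_k`, the label list, and the probabilities are hypothetical stand-ins, not the library's implementation or real model output.

```python
def top_k(probabilities, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical labels and probabilities, for illustration only.
LABELS = ["tabby", "golden_retriever", "sports_car", "pizza"]
PROBS = [0.62, 0.25, 0.03, 0.10]

for label, prob in top_k(PROBS, LABELS, k=3):
    print(f"{label}: {prob:.2f}")
```

Reporting several candidates instead of a single winner makes it easier to spot near-misses, such as an image whose second-ranked class is the correct one.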