Deep Learning for Dog Breed Classification


According to dogtime.com, there are 266 different breeds of dogs, and just thinking about that number makes telling them apart feel frightening. And most people, if they're normal, know only about 5–10 breeds, because you don't see a chapter called "266 Different Dog Breeds" in a bachelor's curriculum.

Overview

The main aim of this project is to build an algorithm to classify the different dog breeds in the dataset.

This seems like a simple task, but when we approach it as a machine learning problem, it is not! The images come in no particular order, the dogs appear at random positions within them, the shots were taken under different lighting conditions, and no preprocessing has been done on the data; it's just a dataset of plain dog pictures.

So, the first step is to take a look at the dataset.

Environment and tools

As the imports below show, the stack is Python with Keras (on a TensorFlow backend), plus OpenCV, scikit-learn, NumPy, and Matplotlib.

Data

The dataset used for this project is the Stanford Dogs Dataset. It contains a total of 20,580 images of 120 different dog breeds.

The Stanford Dogs dataset contains images of 120 breeds of dogs from around the world. This dataset has been built using images and annotation from ImageNet for the task of fine-grained image categorization.

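The images ship as a single tarball, which is why tarfile appears in the imports below. As a minimal sketch, assuming the standard download location on the dataset page (verify the URL before relying on it), fetching and unpacking looks like this:

import os
import tarfile
import urllib.request

# Assumed download location for the Stanford Dogs images tarball;
# check the dataset page for the current URL
DATA_URL = "http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar"

if not os.path.exists("images.tar"):
    urllib.request.urlretrieve(DATA_URL, "images.tar")

# Extraction yields an ./Images directory with one folder per breed
with tarfile.open("images.tar") as tar:
    tar.extractall(".")

print(len(os.listdir("./Images")))  # expect 120 breed folders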

Importing Libraries

import os
import sys
import keras
import tarfile
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from keras.models import Sequential, Model
from sklearn.preprocessing import LabelBinarizer
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Add, Dropout, Flatten, Dense, Activation

Data Preprocessing

I found 5 directories to be unusable, so I didn't use them. In total, I imported 115 breeds.

import cv2

BASEPATH = './Images'
LABELS = set()
paths = []

for d in os.listdir(BASEPATH):
    LABELS.add(d)
    paths.append((BASEPATH + '/' + d, d))

# resizing and converting to RGB
def load_and_preprocess_image(path):
    image = cv2.imread(path)
    image = cv2.resize(image, (224, 224))
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image

X, y = [], []
i = 0

for path, label in paths:
    i += 1
    # Faulty directories
    if i == 18 or i == 23 or i == 41 or i == 49 or i == 90: continue
    if path == "./Images/.DS_Store": continue
    for image_path in os.listdir(path):
        image = load_and_preprocess_image(path + "/" + image_path)
        X.append(image)
        y.append(label)

Now, the folder names follow the pattern 'n8725563753-Husky', so we need to clean them up to keep only the 'Husky' part of the name.

# Cleaning the names of the directories/targets
Y = []
for i in y:
    Y.append(i.split('-')[1])

Label Binarizer

This class comes from sklearn.preprocessing and is used to get a binary (one-hot) representation of string labels. Why are we using it here? We can't use 'Husky' as the target in a model; we need to convert it into a usable numeric data type. Hence, we use this.

encoder = LabelBinarizer()
y = encoder.fit_transform(np.array(Y))  # encode the cleaned labels in Y
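To see what LabelBinarizer actually produces, here is a tiny standalone illustration with made-up labels (my addition, not part of the pipeline):

# Toy labels, purely to show the one-hot encoding and its inverse
from sklearn.preprocessing import LabelBinarizer

enc = LabelBinarizer()
onehot = enc.fit_transform(['Husky', 'Beagle', 'Husky', 'Pug'])
print(enc.classes_)                   # ['Beagle' 'Husky' 'Pug']
print(onehot)                         # [[0 1 0] [1 0 0] [0 1 0] [0 0 1]]
print(enc.inverse_transform(onehot))  # ['Husky' 'Beagle' 'Husky' 'Pug']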

Splitting Data

We are using the train_test_split function from sklearn.model_selection.

train_test_split is a function in scikit-learn's model selection module for splitting data arrays into two subsets: one for training and one for testing. With this function, you don't need to divide the dataset manually.

By default, train_test_split makes random partitions for the two subsets. However, you can also specify a random state to make the split reproducible.
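As a quick aside (my addition), the reproducibility is easy to verify on a toy array:

import numpy as np
from sklearn.model_selection import train_test_split

# The same random_state yields identical partitions on repeated calls
a = np.arange(10)
split1 = train_test_split(a, test_size=0.25, random_state=87)
split2 = train_test_split(a, test_size=0.25, random_state=87)
print(all(np.array_equal(p, q) for p, q in zip(split1, split2)))  # True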

from sklearn.model_selection import train_test_split

X = np.array(X)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=87)

Now, after this, we convert x_train and x_test to float32 and normalize them to the [0, 1] range.

x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

Viewing Data Initially

These are the pictures with which we’ll be making our model learn.


[Figure: a grid of sample dog images from the training set]
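The original plotting code for this preview didn't survive, but a grid like that takes only a few lines of Matplotlib; one possible sketch:

# Preview a 3x3 grid of training images (sketch, not the author's
# original plotting code)
plt.figure(figsize=(8, 8))
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(x_train[i])
    plt.axis('off')
plt.show()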

Transfer Learning

Now, transfer learning could be a full topic on its own, but I'll just scratch the surface here.

Transfer learning is a machine learning technique where a model trained on one task is re-purposed on a second related task.


Why do we use transfer learning? You don't want to train a model with millions of parameters from scratch, again and again, for every project, and that is exactly what this concept avoids. The idea of transfer learning is that you take a pre-trained model and retrain just some of its layers to adapt it to your requirements.
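For contrast, the other common flavor of transfer learning is fine-tuning: freeze the pre-trained layers and attach a new trainable head. A minimal sketch of that style (not what this article does; we extract bottleneck features instead, as shown next):

from keras.applications import inception_v3
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

# Load the pre-trained base and freeze its weights
base = inception_v3.InceptionV3(weights='imagenet', include_top=False)
for layer in base.layers:
    layer.trainable = False

# Attach a small trainable head for our 115 classes
pooled = GlobalAveragePooling2D()(base.output)
hidden = Dense(512, activation='relu')(pooled)
outputs = Dense(115, activation='softmax')(hidden)
finetune_model = Model(inputs=base.input, outputs=outputs)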

from keras.applications import inception_v3

input_size = 224
num_classes = 115

inception_bottleneck = inception_v3.InceptionV3(weights='imagenet', include_top=False, pooling='avg')

temp_train = inception_bottleneck.predict(x_train, batch_size=32, verbose=1)
temp_test = inception_bottleneck.predict(x_test, batch_size=32, verbose=1)

print('InceptionV3 train bottleneck features shape: {} size: {:,}'.format(temp_train.shape, temp_train.size))
print('InceptionV3 test bottleneck features shape: {} size: {:,}'.format(temp_test.shape, temp_test.size))

We set the include_top parameter to False, which means we don't import the final Dense (classification) layer; instead, we use our own layers to adapt the model to our dataset.
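A quick sanity check (my addition): with include_top=False and pooling='avg', the network outputs a 2048-dimensional pooled feature vector per image instead of the 1000 ImageNet class scores:

# The bottleneck extractor ends at a pooled feature vector
print(inception_bottleneck.output_shape)  # (None, 2048)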

Dense Layers

After this, we add 3 Dense layers on top, with widths of 1024, 512, and 115 (the number of classes).

model = Sequential()
model.add(Flatten())
model.add(Dense(1024, activation='elu'))
model.add(Dropout(0.45))
model.add(Dense(512, activation='elu'))
model.add(Dropout(0.35))
model.add(Dense(num_classes, activation='softmax'))

Then, we compile the model.

model.compile(optimizer='adam',
              loss='categorical_crossentropy', metrics=['accuracy'])

Then, finally, we train.

history = model.fit(temp_train, y_train,
                    epochs=15,
                    batch_size=32,
                    validation_data=(temp_test, y_test))

Do you see? We used temp_train and temp_test here instead of x_train and x_test. This is because we want to extend our Inception model, not use this Sequential model to start training from scratch: the head trains on the precomputed bottleneck features.
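One consequence worth spelling out (this snippet is my addition, with a hypothetical image path): any new image must also pass through InceptionV3 before the trained head can classify it:

# Hypothetical inference path: preprocess, extract bottleneck features,
# then classify with the trained head
img = load_and_preprocess_image("./Images/n8725563753-Husky/example.jpg")  # hypothetical path
img = img.astype("float32") / 255.0
features = inception_bottleneck.predict(img[np.newaxis, ...])
probs = model.predict(features)
print(encoder.inverse_transform(probs)[0])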

[Figure: per-epoch training output]

Loss Plots

score = model.evaluate(temp_test, y_test, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], score[1]*100))

# summarize history for accuracy
plt.subplot(211)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')

# summarize history for loss
plt.subplot(212)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')

plt.subplots_adjust(right=3, top=3)
plt.show()
[Figure: accuracy and loss curves for the train and test sets]

Results & Conclusion

So, after all this, we reached 77.31% accuracy, and I'll be honest: considering the fact that there were 115 different classes, the model did a pretty good job.

Visualizing Results

# Decode the one-hot vectors back to breed names for display
predictions = model.predict(temp_test)
pred_labels = encoder.inverse_transform(predictions)
true_labels = encoder.inverse_transform(y_test)

for i in range(9):
    plt.subplot(330 + 1 + i)
    plt.xlabel("Actual: " + true_labels[i] + ", Predicted: " + pred_labels[i])
    plt.imshow(x_test[i])

plt.subplots_adjust(right=3, top=3)
plt.show()
[Figure: a 3×3 grid of test images labeled with actual and predicted breeds]

So, luckily, there are no misclassifications in this subset of the data. hehe

Improvements That Can Be Made

I still think that adding one more Dense layer could make a difference, and preprocessing the data would surely help, but we'll give that a shot later. :D
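On the preprocessing point: the ImageDataGenerator imported at the start was never actually used. A hedged sketch of how augmentation could be wired in (illustrative parameters, my addition):

# Illustrative augmentation setup; training this way would require running
# the full network end-to-end rather than on precomputed bottleneck features
augmenter = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)
batches = augmenter.flow(x_train, y_train, batch_size=32)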

Alright y'all, I hope this article helps you. Let's connect on LinkedIn!


Contacts

If you want to keep up to date with my latest articles and projects, follow me on Medium. These are some of my contact details:

Happy Learning. :)


Originally published at: https://medium.com/swlh/deep-learning-for-dog-breed-classification-77ef182a2509
