Image classification database
This article is a tutorial on using the Monk library to classify the rooms and parts of a house, such as the living room, dining room, and so on.
A detailed tutorial is available on GitHub.
About the dataset
- The dataset contains a total of 145,436 images divided into seven classes: ‘Exterior’, ‘bedroom’, ‘kitchen’, ‘living_room’, ‘Interior’, ‘bathroom’, and ‘dining_room’.
It can be downloaded from here.
About the Monk library and how Monk makes image classification easy
- Write less code and create end-to-end applications.
- Learn only one syntax and create applications using any deep learning library (PyTorch, Mxnet, Keras, TensorFlow, etc.).
- Manage your entire project easily with multiple experiments.
Who this library is built for
- Students: Seamlessly learn computer vision using our comprehensive study road-maps.
- Researchers and developers: Create and manage multiple deep learning projects.
- Competition participants on Kaggle, Codalab, HackerEarth, AiCrowd, etc.
Table of contents
1. Install Monk
2. Using the pre-trained model for the part-of-house classification dataset
3. Training a classifier from scratch
- Train a part-of-house classifier using ResNet variants
- Understand what changes when switching between ResNet variants
- Understand that a bigger and deeper network does not always mean better results
- For this experiment, you will be using the mxnet backend
4. Conclusions
1. Installing Monk
# With CUDA 10.0, install Monk using the following command
pip install -U monk-gluon-cuda100
For more ways to install, visit the Monk Library.
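Before moving on, it can help to confirm that the package is importable. The sketch below uses only the Python standard library. The helper name `is_installed` is ours, not part of Monk, and we assume the pip package exposes a top-level module named `monk`, as the import used later in this tutorial suggests.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "monk" is the top-level package imported later in this tutorial.
    print("monk importable:", is_installed("monk"))
```

If this prints False, the install step probably did not complete in the environment the notebook is running in.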
2. Using a pre-trained model for the demo
First, download the pre-trained models from Google Drive.
! wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=10SrowcOJp8GWqEB21BfCIinqUCHS7PMv' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=10SrowcOJp8GWqEB21BfCIinqUCHS7PMv" -O cls_house_scene_trained.zip && rm -rf /tmp/cookies.txt
The command above downloads the zip file of pre-trained models and saves it as cls_house_scene_trained.zip. Now we have to unzip the file.
! unzip -qq cls_house_scene_trained.zip
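Outside a notebook, where the `!` shell escape is not available, the same extraction can be done from Python. This is a minimal sketch using the standard `zipfile` module; the function name `extract_archive` is our own and is not part of Monk.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract every member of a zip archive into dest_dir and return their names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Intended use for this tutorial (requires the downloaded file, so not executed here):
# extract_archive("cls_house_scene_trained.zip")
```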
Import prototype from monk.gluon_prototype to work with the Monk library.
# Using the mxnet-gluon backend
# When installed using pip
from monk.gluon_prototype import prototype
Let’s load the model in inference mode to classify the demo data.
# Load project in inference mode
gtf = prototype(verbose=1);
gtf.Prototype("Task", "gluon_resnet18_v1_train_all_layers", eval_infer=True);
Infer on some samples
img_name = "workspace/test/2.jpg"
predictions = gtf.Infer(img_name=img_name);
from IPython.display import Image
Image(filename=img_name, height=300, width=300)
img_name = "workspace/test/3.jpg"
predictions = gtf.Infer(img_name=img_name);
from IPython.display import Image
Image(filename=img_name, height=300, width=300)
img_name = "workspace/test/6.jpg"
predictions = gtf.Infer(img_name=img_name);
from IPython.display import Image
Image(filename=img_name, height=300, width=300)
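The three cells above repeat the same call with a different file name. A small helper can run the call over a list of paths instead. This is a sketch: `infer_many` is a hypothetical helper of ours, and it makes no assumption about what `gtf.Infer` returns, it simply stores each result.

```python
def infer_many(infer_fn, image_paths):
    """Call infer_fn on each path and return a dict mapping path -> result."""
    results = {}
    for path in image_paths:
        results[path] = infer_fn(path)
    return results


# Intended use with the Monk object loaded above (not executed here):
# paths = ["workspace/test/2.jpg", "workspace/test/3.jpg", "workspace/test/6.jpg"]
# all_predictions = infer_many(lambda p: gtf.Infer(img_name=p), paths)
```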
For more examples, visit the notebook.
3. Training a custom classifier from scratch
What is ResNet?
Key points, taken from https://towardsdatascience.com/an-overview-of-resnet-and-its-variants-5281e2f56035:
- The core idea of ResNet is the so-called “identity shortcut connection”, which skips one or more layers.
- With such shortcuts, a deeper model should not produce a higher training error than its shallower counterparts.
- This helps address the problem of vanishing gradients as network depth increases.
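To make the shortcut idea concrete, here is a toy residual block in plain Python. It is only an illustration: real ResNet blocks use convolutions and batch normalization, whereas this sketch uses small fully connected layers, and all function names are our own. The point it shows is that when the residual branch contributes nothing, the block passes its input through unchanged, which is why adding such blocks should not make a network worse than its shallower counterpart.

```python
def linear(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F(x) = w2 @ relu(w1 @ x)."""
    branch = linear(w2, relu(linear(w1, x)))
    return relu([b + v for b, v in zip(branch, x)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
# With an all-zero residual branch, the non-negative input passes straight through.
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 3.0]
```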
Upcoming contents
- Load the dataset.
- Train a model using the resnet18_v1 architecture (training only the last layer) for transfer learning.
- Train a model using the resnet18_v1 architecture (training all the layers) for transfer learning.
- Train a model using the resnet18_v2 architecture (training only the last layer) for transfer learning.
- Train a model using the resnet34_v1 architecture (training only the last layer) for transfer learning.
- Compare all the models.
1. Loading the dataset
Dataset credits: https://omidpoursaeed.github.io/publication/vision-based-real-estate-price-estimation/
Step-1: Downloading the dataset.
! wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=0BxDIywue_VABY1dRcFVvZ3BodnM' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=0BxDIywue_VABY1dRcFVvZ3BodnM" -O dataset.zip && rm -rf /tmp/cookies.txt
Step-2: Unzipping the dataset.
! unzip -qq dataset.zip
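After unzipping, a quick sanity check is to count the images in each class folder. The sketch below assumes the archive unpacks into a folder with one sub-folder per class, which is what the `dataset_path="Train"` argument used in the next steps suggests. The helper name and the list of image suffixes are our own choices.

```python
from pathlib import Path

# File extensions treated as images (an assumption; adjust to the actual data).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(dataset_dir):
    """Return {class_folder_name: number_of_image_files} for one dataset folder."""
    counts = {}
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


# Intended use (requires the extracted dataset, so not executed here):
# print(count_images_per_class("Train"))
```

Strongly unequal counts across the seven classes would be worth keeping in mind when reading the accuracy curves later.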
2. Train a model using the resnet18_v1 architecture (training only the last layer) for transfer learning
Step-1: Load the experiment and insert the data. For more details, visit the Monk Library.
# Load experiment
monk_gln = prototype(verbose=1);
monk_gln.Prototype("Task", "gluon_resnet18_v1");
# Insert data and set params in default mode
monk_gln.Default(dataset_path="Train",
                 model_name="resnet18_v1",
                 freeze_base_network=True,
                 num_epochs=10);
Step-2: Training the model.
# Start training
monk_gln.Train();
3. Train a model using the resnet18_v1 architecture (training all the layers) for transfer learning
Step-1: Load the experiment and insert the data. For more details, visit the Monk Library. The parameter freeze_base_network is set to False so that all the layers are trained.
# Load experiment
monk_gln = prototype(verbose=1);
monk_gln.Prototype("Task", "gluon_resnet18_v1_train_all_layers");
# Insert data and set params in default mode
monk_gln.Default(dataset_path="Train",
                 model_name="resnet18_v1",
                 freeze_base_network=False,
                 num_epochs=10);
Step-2: Training the model.
# Start training
monk_gln.Train();
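To get a feel for what `freeze_base_network` changes, the sketch below counts how many parameters would be updated in each mode. It is a back-of-the-envelope illustration, not Monk's implementation: the backbone figure is the approximate size of a standard ResNet-18 without its classifier, the classifier is modelled as a single fully connected layer on the 512-dimensional ResNet-18 feature vector, and the layers Monk actually attaches may differ.

```python
def trainable_parameters(layers, freeze_base_network):
    """Sum the parameter counts of the layers that would be updated in training.

    layers: list of (name, parameter_count, is_final_layer) tuples.
    """
    return sum(
        count
        for _, count, is_final in layers
        if is_final or not freeze_base_network
    )


NUM_CLASSES = 7   # the seven room classes in this dataset
FEATURES = 512    # width of ResNet-18's final feature vector

layers = [
    # Approximate backbone size, for illustration only.
    ("backbone", 11_176_512, False),
    # Weights plus biases of one fully connected layer.
    ("classifier", FEATURES * NUM_CLASSES + NUM_CLASSES, True),
]

print(trainable_parameters(layers, freeze_base_network=True))   # 3591
print(trainable_parameters(layers, freeze_base_network=False))  # 11180103
```

Under these assumptions, unfreezing the base network makes the optimizer update several thousand times more parameters, which is consistent with the longer training time reported in the conclusions.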
4. Train a model using the resnet18_v2 architecture (training only the last layer) for transfer learning
Step-1: Load the experiment and insert the data. For more details, visit the Monk Library.
# Load experiment
monk_gln = prototype(verbose=1);
monk_gln.Prototype("Task", "gluon-resnet18_v2");
# Insert data and set params in default mode
monk_gln.Default(dataset_path="Train",
                 model_name="resnet18_v2",
                 freeze_base_network=True,
                 num_epochs=10);
Step-2: Training the model.
# Start training
monk_gln.Train()
5. Train a model using the resnet34_v1 architecture (training only the last layer) for transfer learning
Step-1: Load the experiment and insert the data. For more details, visit the Monk Library.
# Load experiment
monk_gln = prototype(verbose=1);
monk_gln.Prototype("Task", "gluon-resnet34_v1");
# Insert data and set params in default mode
monk_gln.Default(dataset_path="Train",
                 model_name="resnet34_v1",
                 freeze_base_network=True,
                 num_epochs=10);
Step-2: Training the model.
# Start training
monk_gln.Train()
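The four experiments differ only in the experiment name, the model name, and the freeze flag, so they can be kept in a single table to reduce copy-and-paste mistakes. This is a sketch of one way to organize them. The experiment names are the ones passed to Add_Experiment in the comparison step, `default_kwargs` is a hypothetical helper of ours, and the loop at the end only uses the Monk calls already shown in this tutorial.

```python
EXPERIMENTS = [
    # (experiment name, model_name, freeze_base_network)
    ("gluon_resnet18_v1", "resnet18_v1", True),
    ("gluon_resnet18_v1_train_all_layers", "resnet18_v1", False),
    ("gluon-resnet18_v2", "resnet18_v2", True),
    ("gluon-resnet34_v1", "resnet34_v1", True),
]


def default_kwargs(model_name, freeze_base_network, num_epochs=10):
    """Build the keyword arguments passed to Default() in the sections above."""
    return {
        "dataset_path": "Train",
        "model_name": model_name,
        "freeze_base_network": freeze_base_network,
        "num_epochs": num_epochs,
    }


# Guard against two experiments accidentally sharing a name.
names = [name for name, _, _ in EXPERIMENTS]
assert len(names) == len(set(names)), "experiment names must be unique"

# Intended use with Monk (not executed here):
# for name, model_name, freeze in EXPERIMENTS:
#     monk_gln = prototype(verbose=1)
#     monk_gln.Prototype("Task", name)
#     monk_gln.Default(**default_kwargs(model_name, freeze))
#     monk_gln.Train()
```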
6. Comparing all the models
Step-1: Use the comparison class of the Monk library. For more details, visit the Monk Library.
# Invoke the comparison class
from monk.compare_prototype import compare
Step-2: Create and manage comparison experiments. Provide the project name.
gtf = compare(verbose=1);
gtf.Comparison("Comparison-1");
This creates files and directories with the following structure:
workspace
    |
    |--------comparison
                    |
                    |
                    |-----Comparison-1
                                |
                                |------stats_best_val_acc.png
                                |------stats_max_gpu_usage.png
                                |------stats_training_time.png
                                |------train_accuracy.png
                                |------train_loss.png
                                |------val_accuracy.png
                                |------val_loss.png
                    |-----comparison.csv (Contains necessary details of all experiments)
Step-3: Add the experiments to the comparison object.
- The first argument is the project name.
- The second argument is the experiment name.
# Add the models trained above.
gtf.Add_Experiment("Task", "gluon_resnet18_v1");
gtf.Add_Experiment("Task", "gluon_resnet18_v1_train_all_layers");
gtf.Add_Experiment("Task", "gluon-resnet18_v2");
gtf.Add_Experiment("Task", "gluon-resnet34_v1");
Step-4: Run the analysis. This step generates seven images.
gtf.Generate_Statistics();
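Once the statistics have been generated, a short check can confirm that all seven images are present before trying to display them. The file names below are taken from the directory listing above. The helper `missing_plots` is our own, and the path in the usage comment assumes the comparison was named "Comparison-1" as in Step-2.

```python
from pathlib import Path

# The seven plot files named in the directory listing.
EXPECTED_PLOTS = [
    "stats_best_val_acc.png",
    "stats_max_gpu_usage.png",
    "stats_training_time.png",
    "train_accuracy.png",
    "train_loss.png",
    "val_accuracy.png",
    "val_loss.png",
]


def missing_plots(comparison_dir):
    """Return the expected plot file names that are absent from comparison_dir."""
    folder = Path(comparison_dir)
    return [name for name in EXPECTED_PLOTS if not (folder / name).is_file()]


# Intended use (requires the generated files, so not executed here):
# print(missing_plots("workspace/comparison/Comparison-1"))
```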
Step-5: Visualize the comparison metrics.
- Training accuracy curve
from IPython.display import Image
Image(filename="workspace/comparison/Comparison-1/train_accuracy.png")
- Training loss curve
Image(filename="workspace/comparison/Comparison-1/train_loss.png")
- Validation accuracy curve
Image(filename="workspace/comparison/Comparison-1/val_accuracy.png")
- Validation loss curve
Image(filename="workspace/comparison/Comparison-1/val_loss.png")
- Training time taken
Image(filename="workspace/comparison/Comparison-1/stats_training_time.png")
- Max GPU usage
Image(filename="workspace/comparison/Comparison-1/stats_max_gpu_usage.png")
- Best validation accuracy
Image(filename="workspace/comparison/Comparison-1/stats_best_val_acc.png")
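Besides the plots, the directory listing above mentions a comparison.csv file with the details of all experiments. The sketch below reads such a file with the standard `csv` module without assuming any particular column names. The helper name is ours, and the exact location of the file under workspace/comparison is something to check on disk rather than take from this example.

```python
import csv


def load_comparison(csv_path):
    """Read a CSV file into a list of dicts, one per row, keyed by header name."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


# Intended use (path is an assumption, so not executed here):
# for row in load_comparison("workspace/comparison/comparison.csv"):
#     print(row)
```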
Conclusions
- The Monk library makes it very easy for students, researchers, and competitors to create deep learning models and try different hyper-parameter settings to increase the accuracy of the model.
- The pre-trained model can be downloaded and used directly, without going through the model creation part.
- From the above graphs, we can see that on this dataset the model in which all the layers are trainable is more accurate than the other models. This is not always true, however, so it is still advisable to try both.
- Also, training takes around 40% more time when all the layers are trainable than when only the last layer is trainable.
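For readers who want to reproduce a figure like the 40% above from their own runs, the calculation is a plain relative increase. The timings in this sketch are hypothetical and chosen only to illustrate the arithmetic; they are not measurements from this experiment.

```python
def percent_increase(baseline, value):
    """Return how much larger value is than baseline, as a percentage."""
    return (value - baseline) * 100.0 / baseline


# Hypothetical timings in minutes, chosen only to illustrate the calculation.
last_layer_only = 50.0
all_layers = 70.0
print(percent_increase(last_layer_only, all_layers))  # 40.0
```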
Thanks for reading. Please share your suggestions.
Rohit is a final-year BTech student with knowledge of machine learning and deep learning. He is interested in working in the field of AI and ML. He is currently a computer vision intern at Tessellate Imaging. Connect with him on LinkedIn.