Deep Learning Scene Recognition
Recognizing an environment at a single glance is one of the human brain's most impressive feats. Recent progress in object recognition owes much to large datasets such as COCO and to the rise of Convolutional Neural Networks (CNNs), which learn high-level features, yet scene recognition has not reached the same level of performance.
In this blog post, we will see how classification models perform when classifying images of scenes. For this task, we train the model on the Places365-Standard dataset. It has 1,803,460 training images across 365 classes, with the number of images per class ranging from 3,068 to 5,000, and each image is 256×256 pixels.
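To get a feel for how balanced the training set is, the figures quoted above can be turned into a few summary numbers. This is only a back-of-the-envelope sketch: the function name `class_balance_summary` and the dictionary keys are made up for illustration, and the inputs are just the totals stated in this post, not values read from the dataset itself.

```python
# Figures quoted in this post for the Places365-Standard training split.
TOTAL_TRAIN_IMAGES = 1_803_460
NUM_CLASSES = 365
MIN_PER_CLASS = 3_068
MAX_PER_CLASS = 5_000


def class_balance_summary(total, num_classes, min_per_class, max_per_class):
    """Summarize class balance from aggregate counts (hypothetical helper)."""
    return {
        # Average number of training images per class.
        "mean_per_class": total / num_classes,
        # How much larger the biggest class is than the smallest.
        "max_to_min_ratio": max_per_class / min_per_class,
        # Images missing compared with every class being at the maximum.
        "shortfall_from_cap": max_per_class * num_classes - total,
    }


summary = class_balance_summary(
    TOTAL_TRAIN_IMAGES, NUM_CLASSES, MIN_PER_CLASS, MAX_PER_CLASS
)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

The mean comes out just under 5,000 images per class, which suggests that most classes sit at or near the upper limit and only a minority are noticeably smaller.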
Installing and Downloading the Data
Let’s start by setting up Monk and its dependencies:
!git clone https://github.com/Tessellate-Imaging/monk_v1.git
!cd monk_v1/installation/Linux && pip install -r requirements_cu9.txt
After installing the dependencies, I downloaded the Places365-Standard dataset which is available to downl