by Harini Janakiraman

Day 24: How to build a Deep Learning Image Classifier for Game of Thrones dragons

Performance of most flavors of the old generations of learning algorithms will plateau. Deep learning, training large neural networks, is scalable and performance keeps getting better as you feed them more data. — Andrew Ng

Deep learning doesn't take a huge amount of time or computational resources. Nor does it require highly complex code, and in some cases not even a large amount of training data. Curated best practices are now available as libraries that make it easy to plug in and write your own neural network architectures using a minimal amount of code to achieve more than 90% prediction accuracy.

The two most popular deep learning libraries are: (1) pytorch created by Facebook (we will be using fastai today, which is built on top of pytorch) and (2) the keras-tensorflow framework created by Google.

The Project

We will build an image classifier using the Convolutional Neural Network (CNN) model to predict if a given image is that of Drogon or Viserion (any Game of Thrones fans here in the house? Clap to say yay!).

You can adapt this problem statement to any type of image classification that interests you. Here are some ideas: cat or dog (classic deep learning 101), whether a person is wearing glasses or not, bus or car, hot dog vs not-hot dog (Silicon Valley fans also say yay! ;) ).

Step 1: Installation

You can use any GPU-accelerated cloud computing platform to run your model on. For the purposes of this blog we will be using Paperspace (the most affordable). Complete instructions on how to get this up and running are available here.

Once set up, you can launch Jupyter Notebook on that machine using the following command:

jupyter notebook

This will give you a localhost URL; replace "localhost" with your machine's IP address and open it in your browser to launch your notebook.

Now you can copy the IPython notebook and dataset files from my github repo into the directory structure below.

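If you are organizing your own data, fastai's from_paths loader expects one subfolder per class under train/ and valid/. A rough sketch of such a layout (the folder names here are illustrative, not necessarily the exact ones in the repo):

    data/dragons/
        train/
            drogon/      # training images of Drogon
            viserion/    # training images of Viserion
        valid/
            drogon/      # validation images of Drogon
            viserion/    # validation images of Viserion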

Note: Do not forget to shut down the machine from the paperspace console once you are done to avoid getting accidentally charged.

Step 2: Training

Follow the instructions in the notebook to initialize the libraries needed for this exercise, and point to the location of the PATH to your data directory. Note that each block of code can be run using “shift+enter.” In case you need additional info on Jupyter notebook commands, you can read more here.

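As a minimal sketch of what that setup cell typically contains (assuming the fastai v0.7 library used by the course at the time; the data path is a hypothetical example):

    # fastai v0.7 imports used throughout the notebook
    from fastai.transforms import *
    from fastai.conv_learner import *
    from fastai.model import *
    from fastai.dataset import *

    PATH = "data/dragons/"   # hypothetical path; point this at your own data directory
    sz = 224                 # size images are resized to before training
    arch = resnet34          # pretrained CNN backbone (made available by the fastai imports)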

Now, coming to the part of training the image classifier, the following three lines of code form the core of building the deep learning model:

  1. data: represents the validation and training datasets.

  2. learn: contains the model

  3. learn.fit(learning_rate,epoch): Fit the model using two parameters — learning rate and epochs.

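As a sketch of those three lines (assuming the fastai v0.7 API and the PATH, sz, and arch variables from the setup sketch above):

    # training and validation sets built from the folder layout
    data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
    # model built on top of the pretrained backbone
    learn = ConvLearner.pretrained(arch, data, precompute=True)
    # fit with learning rate 0.01 for 3 epochs
    learn.fit(0.01, 3)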

We have set the learning rate to "0.01" here. The learning rate needs to be a small enough number that the model updates its weights in small incremental steps and learns accurately. But it shouldn't be too small, either, as that would mean too many steps and too long to learn. The library has a learning rate finder method "lr_find()" to find the optimal one.

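As a sketch of how the finder is used (again assuming the fastai v0.7 API):

    lrf = learn.lr_find()   # short trial run that increases the learning rate every mini-batch
    learn.sched.plot()      # plot loss vs. learning rate; pick a value where loss is still clearly falling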

The number of epochs is set to "3" in the code here; it represents how many passes the training makes over the data. We can run as many epochs as we want, but after a point accuracy will start to get worse due to overfitting.

Step 3: Prediction

We will now run prediction on the validation data using the trained model.

Pytorch gives log predictions, so to get the probability you have to raise e to that power using numpy. Follow the instructions step by step in the notebook in my github repo. A probability close to 0 implies it's an image of Drogon and a probability close to 1 implies it's an image of Viserion.

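A sketch of that step (assuming fastai v0.7, where learn.predict() returns log probabilities for the validation set and the second column corresponds to the Viserion class):

    import numpy as np   # already pulled in by the fastai imports, shown here for completeness

    log_preds = learn.predict()        # log probability per validation image, one column per class
    probs = np.exp(log_preds[:, 1])    # e to the power of the log probability gives P(Viserion)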

Step 4: Visualize

Plotting functions can be used to visualize the results of the prediction better. The below images show correctly classified validation data, with a probability of 0.2–0.3 indicating it's Drogon and a probability of 0.7–0.8 indicating it's Viserion.

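The plotting helpers in the notebook do essentially the following; here is a self-contained matplotlib sketch of the same idea (the helper name and arguments are hypothetical, not the notebook's own):

    import numpy as np
    import matplotlib.pyplot as plt
    from PIL import Image

    def plot_images_with_probs(filenames, probs):
        # Show a row of images, each titled with its predicted probability of being Viserion
        fig, axes = plt.subplots(1, len(filenames), figsize=(4 * len(filenames), 4))
        for ax, fname, p in zip(np.atleast_1d(axes), filenames, probs):
            ax.imshow(Image.open(fname))
            ax.set_title("P(Viserion) = {:.2f}".format(p))
            ax.axis("off")
        plt.show()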

You can also see some uncertain predictions, where the probability lingers closer to 0.5.

The image classifier can make uncertain predictions in some scenarios, for example with long, narrow images, as the model only looks at a square crop of the image at a time.

In those cases, enhancement techniques such as data augmentation, optimizing the learning rate, using differential learning rates for different layers, and test-time augmentation can be applied for better results. These advanced concepts will be explored in future posts.

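For a taste of what a few of these look like in fastai v0.7 (a sketch based on the course material, not code from this notebook):

    # Data augmentation: random flips and zooms suitable for side-on photos
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    data = ImageClassifierData.from_paths(PATH, tfms=tfms)
    learn = ConvLearner.pretrained(arch, data, precompute=False)

    # Differential learning rates: smaller updates for earlier layers after unfreezing
    learn.unfreeze()
    learn.fit(np.array([1e-4, 1e-3, 1e-2]), 3, cycle_len=1)

    # Test-time augmentation: average predictions over augmented copies of each validation image
    log_preds, y = learn.TTA()
    probs = np.mean(np.exp(log_preds), 0)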

This blog was inspired by the fastai CNN video. To get an in-depth understanding and continue your quest in deep learning, you can take the famous set of courses by Andrew Ng on Coursera.

If you enjoyed this, please clap so others can see it as well! Follow me on Twitter @HariniLabs or Medium to get new post updates or to just say hi :)

PS: Sign up for my newsletter here to be the first to get fresh new content. It's filled with doses of inspiration from the world of #WomenInTech, and yes, men can sign up too :)

Original article: https://www.freecodecamp.org/news/deep-learning-image-classifier-for-game-of-thrones-dragons-42cd59a1972d/
