Before we start: if you are reading this article, I am sure that we share similar interests and are, or will be, in similar industries. So let's connect via LinkedIn! Please do not hesitate to send a contact request! Orhan G. Yalçın on LinkedIn.
If you have been trying to build machine learning models with high accuracy but have never tried transfer learning, this article will change your life. At least, it did mine!
Most of us have already tried several machine learning tutorials to grasp the basics of neural networks. These tutorials were very helpful for understanding the basics of artificial neural networks such as recurrent neural networks, convolutional neural networks, GANs, and autoencoders. But their main function was to prepare you for real-world implementations.
Now, if you are planning to build an AI system that utilizes deep learning, you either (i) need a very large budget for training and excellent AI researchers at your disposal, or (ii) need to benefit from transfer learning.
What is Transfer Learning?
Transfer learning is a subfield of machine learning and artificial intelligence which aims to apply the knowledge gained from one task (source task) to a different but similar task (target task).
For example, the knowledge gained while learning to classify Wikipedia texts can be used to tackle legal text classification problems. Another example would be using the knowledge gained while learning to classify cars to recognize birds in the sky. As you can see, there is a relationship within each of these examples: we are not using a text classification model for bird detection.
In summary, transfer learning is a field that saves you from having to reinvent the wheel and helps you build AI applications in a very short amount of time.
History of Transfer Learning
To show the power of transfer learning, we can point to Andrew Ng's well-known prediction that, after supervised learning, transfer learning will be the next driver of machine learning's commercial success.
The history of transfer learning dates back to 1993. With her paper Discriminability-Based Transfer between Neural Networks, Lorien Pratt opened Pandora's box and introduced the world to the potential of transfer learning. In July 1997, the journal Machine Learning published a special issue on transfer learning papers. As the field advanced, adjacent topics such as multi-task learning were also included under the umbrella of transfer learning. Learning to Learn is one of the pioneering books in this field. Today, transfer learning is a powerful resource for tech entrepreneurs building new AI solutions and for researchers pushing the frontiers of machine learning.
How Does Transfer Learning Work?
There are three requirements to achieve transfer learning:
- Development of an Open Source Pre-trained Model by a Third Party
- Repurposing the Model
- Fine-Tuning for the Problem
Development of an Open Source Pre-trained Model
A pre-trained model is a model created and trained by someone else to solve a problem similar to ours. In practice, that someone is almost always a tech giant or a group of star researchers. They usually choose a very large dataset as their base dataset, such as ImageNet or the Wikipedia Corpus. Then, they create a large neural network (e.g., VGG19 has 143,667,240 parameters) to solve a particular problem (e.g., image classification, in the case of VGG19). Of course, this pre-trained model must be made public so that we can take it and repurpose it.
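As a quick sanity check, here is a minimal sketch, assuming a TensorFlow/Keras environment, that confirms the parameter count quoted above:

```python
import tensorflow as tf

# Build VGG19 with its original 1000-class classification head.
# weights=None skips the ~550 MB weight download, since we only
# want to count the parameters, not use the trained values.
model = tf.keras.applications.VGG19(weights=None)

print(f"{model.count_params():,}")  # 143,667,240
```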
Repurposing the Model
After getting our hands on these pre-trained models, we repurpose the learned knowledge, which includes the layers, features, weights, and biases. There are several ways to load a pre-trained model into our environment; in the end, it is just a file/folder containing the relevant information. However, deep learning libraries already host many of these pre-trained models, which makes them more accessible and convenient. You can use one of these libraries to load a trained model. It will usually come with all the layers and weights, and you can edit the network as you wish.
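For example, here is a minimal sketch, assuming a TensorFlow/Keras environment, of pulling one of these hosted models into your session together with its trained weights:

```python
import tensorflow as tf

# Download ResNet50 with the weights learned on ImageNet; the first
# call caches the ~98 MB weight file locally for later reuse.
model = tf.keras.applications.ResNet50(weights="imagenet")

# Inspect the layers and parameter counts that ship with the model.
model.summary()
```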
Fine-Tuning for the Problem
While the current model may already work for our problem, it is often better to fine-tune the pre-trained model, for two reasons:
- So that we can achieve even higher accuracy;
- Our fine-tuned model can generate the output in the correct format.
Generally speaking, in a neural network, the bottom and mid-level layers usually represent general features, while the top layers represent problem-specific features. Since our new problem is different from the original problem, we tend to drop the top layers. By adding layers specific to our problem, we can achieve higher accuracy.
After dropping the top layers, we need to place our own layers so that we can get the output we want. For example, a model trained with ImageNet can classify up to 1000 objects. If we are trying to classify handwritten digits (e.g., MNIST classification), it may be better to end up with a final layer with only 10 neurons.
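To illustrate, here is a minimal sketch, assuming TensorFlow/Keras and 224x224 RGB inputs (raw 28x28 grayscale MNIST images would first need resizing and channel replication), that drops the ImageNet head and adds a 10-neuron output layer:

```python
import tensorflow as tf

# Load VGG19 without its 1000-class ImageNet head.
base_model = tf.keras.applications.VGG19(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)

# Stack our own problem-specific layers on top: a pooling layer plus
# a 10-neuron softmax output, one neuron per digit class.
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```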
After we add our custom layers to the pre-trained model, we can configure it with an appropriate loss function and optimizer and fine-tune it with extra training.
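Continuing the sketch above (reusing base_model and model from the previous block), a common recipe is to freeze the pre-trained layers, train only the new head, and then optionally unfreeze everything for a final low-learning-rate pass. The learning rates and the train_images/train_labels names here are illustrative assumptions, not values from the original article:

```python
# Freeze the pre-trained layers so only our new head trains at first.
base_model.trainable = False
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_images, train_labels, epochs=5)

# Then unfreeze and fine-tune the whole network at a much lower rate.
base_model.trainable = True
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_images, train_labels, epochs=3)
```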
4 Pre-Trained Models for Computer Vision
Here are four pre-trained networks you can use for computer vision tasks ranging from image generation and neural style transfer to image classification, image captioning, and anomaly detection:
- VGG19
- Inceptionv3 (GoogLeNet)
- ResNet50
- EfficientNet
Let’s dive into them one-by-one.
VGG-19
VGG is a convolutional neural network with a depth of 19 layers. It was built and trained by Karen Simonyan and Andrew Zisserman at the University of Oxford in 2014, and you can find all the details in their paper, Very Deep Convolutional Networks for Large-Scale Image Recognition, published in 2015. The VGG-19 network is trained using more than 1 million images from the ImageNet database. Naturally, you can import the model with the ImageNet-trained weights. This pre-trained network can classify up to 1000 objects. The network was trained on 224x224-pixel colored images. Here is brief info about its size and performance:
Size: 549 MB
Top-1 Accuracy: 71.3%
Top-5 Accuracy: 90.0%
Number of Parameters: 143,667,240
Depth: 26
Inceptionv3 (GoogLeNet)
Inceptionv3 is a convolutional neural network that is 48 layers deep. It was built and trained by Google, building on the GoogLeNet architecture introduced in the paper Going Deeper with Convolutions; the Inceptionv3 design itself is described in Rethinking the Inception Architecture for Computer Vision. The pre-trained version of Inceptionv3 with the ImageNet weights can classify up to 1000 objects. The image input size of this network is 299x299 pixels, which is larger than that of the VGG19 network. While VGG19 was the runner-up in the 2014 ImageNet competition, Inception (GoogLeNet) was the winner. A brief summary of Inceptionv3's features is as follows:
Size: 92 MB
Top-1 Accuracy: 77.9%
Top-5 Accuracy: 93.7%
Number of Parameters: 23,851,784
Depth: 159
ResNet50 (Residual Network)
ResNet50 is a convolutional neural network with a depth of 50 layers. It was built and trained by Microsoft in 2015, and you can find the model performance results in their paper, Deep Residual Learning for Image Recognition. This model is also trained on more than 1 million images from the ImageNet database. Just like VGG-19, it can classify up to 1000 objects, and the network was trained on 224x224-pixel colored images. Here is brief info about its size and performance:
Size: 98 MB
Top-1 Accuracy: 74.9%
Top-5 Accuracy: 92.1%
Number of Parameters: 25,636,712
If you compare ResNet50 to VGG19, you will see that ResNet50 actually outperforms VGG19 even though it has lower complexity. ResNet50 has been improved several times, and you also have access to newer versions such as ResNet101, ResNet152, ResNet50V2, ResNet101V2, and ResNet152V2.
EfficientNet
EfficientNet is a state-of-the-art convolutional neural network that was trained and released to the public by Google in 2019 with the paper EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. There are 8 alternative implementations of EfficientNet (B0 to B7), and even the simplest one, EfficientNetB0, is outstanding: with 5.3 million parameters, it achieves a 77.1% top-1 accuracy.
A brief summary of EfficientNetB0's features is as follows:
Size: 29 MB
Top-1 Accuracy: 77.1%
Top-5 Accuracy: 93.3%
Number of Parameters: ~5,300,000
Depth: 132
Other Pre-Trained Models for Computer Vision Problems
We have listed four state-of-the-art, award-winning convolutional neural network models. However, there are dozens of other models available for transfer learning. All of them are hosted in Keras Applications, which also publishes a benchmark analysis comparing these models.
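To give a sense of how interchangeable these models are, here is a rough sketch, assuming TensorFlow 2.x (EfficientNetB0 requires version 2.3 or later), that instantiates the four networks covered above:

```python
import tensorflow as tf

# Each call downloads and caches the ImageNet weights on first use.
models = [
    tf.keras.applications.VGG19(weights="imagenet"),
    tf.keras.applications.InceptionV3(weights="imagenet"),      # 299x299 inputs
    tf.keras.applications.ResNet50(weights="imagenet"),         # 224x224 inputs
    tf.keras.applications.EfficientNetB0(weights="imagenet"),   # 224x224 inputs
]

for model in models:
    print(f"{model.name}: {model.count_params():,} parameters")
```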
Conclusion
In a world where we have easy access to state-of-the-art neural network models, trying to build your own model from scratch with limited resources is like trying to reinvent the wheel. It is pointless.
Instead, try working with these pre-trained models: add a couple of new layers on top tailored to your particular computer vision task, and then train. The results will be much more successful than those of a model you build from scratch.
Subscribe to the Mailing List for the Full Code
If you would like to have access to the full code on Google Colab and my latest content, subscribe to the mailing list: ✉️