Recreating the Keras Functional API with PyTorch

weixin_26632369

Original article: https://towardsdatascience.com/recreating-keras-functional-api-with-pytorch-cc2974f7143c

Introduction

The book “Deep Learning with Python” by Francois Chollet (creator of Keras) was what got me into the world of deep learning, and I have loved the Keras style ever since.

Keras was my first framework. I then dabbled a little in TensorFlow before moving to PyTorch, and the rest is history.

To be honest, I was really excited by the progress bar that shows up during model training in Keras. It’s just awesome :)

So, why not try to bring the Keras experience of training models to PyTorch?

That question got me started, and I ended up recreating the dense, convolutional and flatten layers of Keras in PyTorch, fancy progress bars included.

Models are created by stacking one layer on top of another and trained by simply calling the fit method, much as Keras does it.

Let’s build it

For those of you who haven’t worked with Keras, building and training a model in Keras looks something like this:
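The original listing did not survive here, so what follows is only a minimal sketch of the Keras functional style, assuming TensorFlow's bundled Keras is installed. The random arrays, layer sizes and hyperparameters are placeholders I picked for illustration, not the author's values.

```python
import numpy as np
from tensorflow import keras

# Random stand-in data: 256 flattened 28x28 images, 10 classes.
x_train = np.random.rand(256, 784).astype("float32")
y_train = np.random.randint(0, 10, size=(256,))

# Functional API: every layer is called on the output of the previous one.
inputs = keras.Input(shape=(784,))
x = keras.layers.Dense(64, activation="relu")(inputs)
outputs = keras.layers.Dense(10, activation="softmax")(x)

# The model is defined by its input and output, then compiled and fitted.
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(x_train, y_train, epochs=2, batch_size=32, verbose=1)
```

The three things to notice are the ones the rest of the article imitates: layers called on previous layers, a Model built from input and output, and a compile/fit pair with a progress bar.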

Training a fully-connected network in Keras

1. Import the required libraries

You may not be familiar with the pkbar library; it is used to display a Keras-like progress bar.

Importing required libraries

2. Input layer and dense layer

The input layer simply takes the shape of a single instance of the data that will be passed to the neural network and returns it. For fully-connected networks this will be something like (1, 784), and for convolutional neural networks it will be the dimensions of the image (height × width × channels).

Naming a Python function with a capital letter goes against the usual convention, but we will ignore that for the time being (some parts of the Keras source code do the same).
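A minimal sketch of such an input function, assuming it does nothing more than hand the shape back as a tuple (the variable names below are placeholders of mine):

```python
def Input(shape):
    # Capitalised to mirror Keras; returns the per-instance shape unchanged.
    return tuple(shape)


fc_shape = Input(shape=(1, 784))      # one flattened 28x28 image
cnn_shape = Input(shape=(32, 32, 3))  # height, width, channels
print(fc_shape, cnn_shape)
```

Returning a plain tuple is convenient because later layers can tell "the previous layer is the input" with a simple isinstance check.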

Input layer

The dense class is initialized with the number of output neurons and the activation function for that layer. When the dense layer is called, the previous layer is passed to it as input.

This gives us information about the previous layer. If the previous layer is the input layer, a PyTorch linear layer is created whose input size comes from the shape returned by the input layer and whose output size is the number of output neurons given when the dense class was initialized.

If the previous layer is a dense layer, we extend the network by adding a PyTorch linear layer and the activation layer that the user gave to the dense class.
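The two cases described so far (input layer first, dense layer first) could be sketched roughly as follows. This is my own reconstruction, not the article's listing: the attribute names `outputs` and `layers`, and the choice to keep the stack as a plain Python list, are assumptions. The convolution/flatten case is left out here.

```python
import torch
import torch.nn as nn


def Input(shape):
    return tuple(shape)


class Dense:
    def __init__(self, outputs, activation=None):
        self.outputs = outputs        # number of output neurons
        self.activation = activation  # an nn.Module such as nn.ReLU(), or None

    def __call__(self, previous):
        if isinstance(previous, tuple):
            # Previous layer is Input: in_features is the last entry of the shape.
            self.layers = [nn.Linear(previous[-1], self.outputs)]
        else:
            # Previous layer is another Dense: copy its stack and extend it.
            self.layers = list(previous.layers)
            self.layers.append(nn.Linear(previous.outputs, self.outputs))
        if self.activation is not None:
            self.layers.append(self.activation)
        return self


inputs = Input(shape=(1, 784))
x = Dense(64, activation=nn.ReLU())(inputs)
y = Dense(10)(x)

net = nn.Sequential(*y.layers)
out = net(torch.randn(8, 1, 784))
print(out.shape)  # torch.Size([8, 1, 10])
```

Because each call returns the layer object itself, the last layer always carries the whole stack, which is what makes the Keras-style chaining work.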

And if the previous layer is a convolution or flatten layer, we use a utility function called get_conv_output() to get the output shape of the image after it has passed through the convolution and flatten layers. This dimension is required because a PyTorch linear layer cannot be created without passing a value to its in_features argument.

The get_conv_output() function takes the image shape and the convolutional neural network model as input. It creates a dummy tensor with the same shape as the image, passes it through the convolutional network (including the flatten layer) and returns the size of the data coming out of it. This size is passed as the value of the in_features argument of PyTorch’s linear layer.
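A hedged sketch of the dummy-tensor trick. The function name comes from the article, but the signature and body are my guess, and I assume the shape has already been reordered to PyTorch's channels-first layout (channels, height, width).

```python
import torch
import torch.nn as nn


def get_conv_output(shape, model):
    """Number of features per sample that `model` yields for one input of `shape`."""
    dummy = torch.zeros(1, *shape)  # batch containing a single fake image
    with torch.no_grad():           # shape probing needs no gradients
        out = model(dummy)
    return out.reshape(1, -1).size(1)


conv_stack = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),  # 32x32 -> 30x30, 16 channels
    nn.ReLU(),
    nn.Flatten(),
)
in_features = get_conv_output((3, 32, 32), conv_stack)
head = nn.Linear(in_features, 10)
print(in_features)  # 14400
```

Here 16 channels of 30 × 30 pixels give 16 · 30 · 30 = 14400 features, which is exactly the value the linear layer needs.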

Dense layer

3. Flatten layer

To create a flatten layer, we define a custom layer class, the flattened layer, which takes a tensor as input and returns the flattened version of that tensor during forward propagation.

We then create another class called flatten. When this layer is called, the previous layers are passed to it as input, and the flatten class extends the network by adding our custom flattened layer on top of them.

Thus all the data reaching the flatten layer is flattened by our custom flattened layer class.
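The pair of classes might look something like this. The class names and the `layers` attribute are hypothetical, and a SimpleNamespace stands in for whatever convolutional layer object comes before.

```python
from types import SimpleNamespace

import torch
import torch.nn as nn


class FlattenedLayer(nn.Module):
    """Custom module: keeps the batch dimension and collapses the rest."""

    def forward(self, x):
        return torch.flatten(x, start_dim=1)


class Flatten:
    """Keras-style wrapper that appends FlattenedLayer to the previous stack."""

    def __call__(self, previous):
        # `previous.layers` is assumed to be a list of nn.Module objects.
        self.layers = list(previous.layers) + [FlattenedLayer()]
        return self


# Stand-in for a previous convolutional layer object.
prev = SimpleNamespace(layers=[nn.Conv2d(3, 8, kernel_size=3)])
flat = Flatten()(prev)

out = nn.Sequential(*flat.layers)(torch.randn(4, 3, 32, 32))
print(out.shape)  # torch.Size([4, 7200])
```

PyTorch also ships nn.Flatten, which does the same job as the custom module; the hand-written version is kept only to follow the text.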

Flatten layer

4. Convolutional layer

We will initialize the Conv2d layer by passing in the number of filters, kernel size, strides, padding, dilation and activation function.

When the Conv2d layer is called, the previous layers are passed to it. If the previous layer is the input layer, a single PyTorch Conv2d layer is created with the provided number of filters, kernel size, strides, padding and dilation, together with the activation function. The value of in_channels is taken from the number of channels in the input shape.

If the previous layer is a convolutional layer, it is extended by adding a PyTorch Conv2d layer and activation function, with the value of in_channels taken from the out_channels of the previous layer.

As for padding, if the user needs the data leaving the layer to keep the dimensions it had on entering, padding can be specified as ‘same’ instead of an integer.

If padding is specified as ‘same’, a utility function called same_pad() computes the padding value that preserves the dimensions for a given input size, kernel size, stride and dilation.
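One way to get such a value is to invert the output-size formula of PyTorch's Conv2d, out = floor((in + 2·pad − dilation·(kernel − 1) − 1) / stride) + 1, and solve for the padding that makes out equal to in. The sketch below does that for one spatial dimension; the signature is my assumption, not necessarily the author's.

```python
import math


def conv_out(size, kernel_size, stride, padding, dilation):
    # Output-size formula of a Conv2d layer along one spatial dimension.
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def same_pad(size, kernel_size, stride=1, dilation=1):
    # Total padding needed (both sides together) so that output size == input size.
    total = (size - 1) * stride - size + dilation * (kernel_size - 1) + 1
    # Conv2d pads both sides equally, so take half, rounded up, never negative.
    return max(math.ceil(total / 2), 0)


p = same_pad(32, kernel_size=3)
print(p, conv_out(32, 3, 1, p, 1))  # 1 32
```

With stride 1 this reduces to dilation·(kernel − 1)/2. When the total padding is odd at stride 1 (an even kernel size, for example), symmetric padding cannot preserve the size exactly, so a real implementation would need asymmetric padding for that case.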

The input size is obtained with the get_conv_output() utility function discussed earlier.
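Putting the section together, a convolutional wrapper could be sketched like this. I call it Conv2D to keep it apart from torch's own nn.Conv2d; that name, the `filters` and `layers` attributes, and the integer-only padding are all simplifying assumptions of mine (the ‘same’ option is left out to keep the sketch short).

```python
import torch
import torch.nn as nn


def Input(shape):
    return tuple(shape)


class Conv2D:
    def __init__(self, filters, kernel_size, strides=1, padding=0,
                 dilation=1, activation=None):
        self.filters = filters
        self.kernel_size = kernel_size
        self.strides = strides
        self.padding = padding      # integer padding only in this sketch
        self.dilation = dilation
        self.activation = activation

    def __call__(self, previous):
        if isinstance(previous, tuple):
            # Input shape is (height, width, channels): channels come last.
            in_channels, self.layers = previous[-1], []
        else:
            # Previous convolution: its filters become our in_channels.
            in_channels, self.layers = previous.filters, list(previous.layers)
        self.layers.append(nn.Conv2d(in_channels, self.filters, self.kernel_size,
                                     stride=self.strides, padding=self.padding,
                                     dilation=self.dilation))
        if self.activation is not None:
            self.layers.append(self.activation)
        return self


inputs = Input(shape=(32, 32, 3))
x = Conv2D(16, 3, padding=1, activation=nn.ReLU())(inputs)
y = Conv2D(32, 3, activation=nn.ReLU())(x)

net = nn.Sequential(*y.layers)
out = net(torch.randn(2, 3, 32, 32))  # PyTorch expects channels-first batches
print(out.shape)  # torch.Size([2, 32, 30, 30])
```

The first convolution keeps the 32 × 32 size thanks to padding 1, while the second one, unpadded, shrinks it to 30 × 30.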

5. Model class

After building the architecture of our model, the Model class is initialized by passing in the input layer and the output layer. I have added an extra argument called device, which is not present in Keras. It takes either ‘CPU’ or ‘CUDA’ and moves the entire model to the specified device.

The Model class’s parameters method returns the parameters of the model, which are to be given to a PyTorch optimizer.

The Model class has a compile method that takes the optimizer and loss function needed for training the model. Its summary method displays a summary of the created model with the help of the torchsummary library.

The fit method trains the model. It takes the input feature set, the target data set and the number of epochs as arguments, and uses the pkbar library to display the training progress along with the loss calculated by the loss function.

The evaluate method calculates the loss and accuracy on the test data.
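The methods described so far could be sketched as below. This is a simplified reconstruction under my own assumptions: the output layer is expected to expose a `layers` list, a plain print stands in for the pkbar progress bar, summary is omitted, and the batch size, data and layer sizes in the usage part are invented.

```python
from types import SimpleNamespace

import torch
import torch.nn as nn


class Model:
    def __init__(self, inputs, outputs, device="CPU"):
        self.input_shape = inputs
        self.device = torch.device(device.lower())  # accepts 'CPU' or 'CUDA'
        self.net = nn.Sequential(*outputs.layers).to(self.device)

    def parameters(self):
        return self.net.parameters()

    def compile(self, optimizer, loss):
        self.optimizer, self.loss_fn = optimizer, loss

    def fit(self, x, y, epochs=1, batch_size=32):
        self.net.train()
        history = []
        for epoch in range(epochs):
            running, steps = 0.0, 0
            for i in range(0, len(x), batch_size):
                xb = x[i:i + batch_size].to(self.device)
                yb = y[i:i + batch_size].to(self.device)
                self.optimizer.zero_grad()
                loss = self.loss_fn(self.net(xb), yb)
                loss.backward()
                self.optimizer.step()
                running, steps = running + loss.item(), steps + 1
            history.append(running / steps)
            # Plain stand-in for the pkbar progress bar.
            print(f"Epoch {epoch + 1}/{epochs} - loss: {history[-1]:.4f}")
        return history

    def evaluate(self, x, y):
        self.net.eval()
        x, y = x.to(self.device), y.to(self.device)
        with torch.no_grad():
            logits = self.net(x)
            loss = self.loss_fn(logits, y).item()
            acc = (logits.argmax(dim=1) == y).float().mean().item()
        return loss, acc


# Usage with a stand-in for the stacked layers and random data.
stack = SimpleNamespace(layers=[nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 3)])
model = Model(inputs=(20,), outputs=stack, device="CPU")
model.compile(optimizer=torch.optim.Adam(model.parameters(), lr=1e-2),
              loss=nn.CrossEntropyLoss())
x, y = torch.randn(128, 20), torch.randint(0, 3, (128,))
history = model.fit(x, y, epochs=3)
loss, acc = model.evaluate(x, y)
```

Note that, unlike Keras, the optimizer here has to be built from model.parameters() before compile is called, which is why the parameters method exists.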

The fit_generator, evaluate_generator and predict_generator methods are used when the data is loaded with a PyTorch data loader. fit_generator takes the training-set data loader and the number of epochs as arguments. evaluate_generator and predict_generator take the validation-set and test-set data loaders respectively, to measure how well the model performs on unseen data.
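For illustration, here are the three loader-based routines written as free functions rather than methods, so the sketch stays self-contained. The signatures are my assumption, and the tiny random dataset is only there to make it runnable.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def fit_generator(net, loader, optimizer, loss_fn, epochs=1, device="cpu"):
    net.to(device).train()
    history = []
    for _ in range(epochs):
        total, seen = 0.0, 0
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad()
            loss = loss_fn(net(xb), yb)
            loss.backward()
            optimizer.step()
            total, seen = total + loss.item() * len(xb), seen + len(xb)
        history.append(total / seen)  # mean loss per sample for this epoch
    return history


def evaluate_generator(net, loader, loss_fn, device="cpu"):
    net.to(device).eval()
    total, correct, seen = 0.0, 0, 0
    with torch.no_grad():
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            logits = net(xb)
            total += loss_fn(logits, yb).item() * len(xb)
            correct += (logits.argmax(dim=1) == yb).sum().item()
            seen += len(xb)
    return total / seen, correct / seen


def predict_generator(net, loader, device="cpu"):
    net.to(device).eval()
    with torch.no_grad():
        # Only the first element of each batch (the features) is used.
        return torch.cat([net(b[0].to(device)).argmax(dim=1).cpu() for b in loader])


data = TensorDataset(torch.randn(64, 20), torch.randint(0, 3, (64,)))
loader = DataLoader(data, batch_size=16)
net = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 3))
history = fit_generator(net, loader, torch.optim.Adam(net.parameters()),
                        nn.CrossEntropyLoss(), epochs=2)
val_loss, val_acc = evaluate_generator(net, loader, nn.CrossEntropyLoss())
preds = predict_generator(net, loader)
```

The only real difference from fit and evaluate is that batching is delegated to the DataLoader instead of being done by slicing tensors.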

Model class

Final thoughts

I’ve tested the code on the CIFAR100, CIFAR10 and MNIST data sets using both dense and convolutional neural networks. It works fine, but there is plenty of room for improvement.

This was a fun project that I worked on for 3–4 days, and it really pushed the limits of my PyTorch programming.

You can take a look at the complete code, with training done on the above-mentioned data sets, here, or you can freely tweak the code to your liking in Colab.