Aerial Cactus Identification Using Transfer Learning

weixin_26636643

Original article: https://medium.com/analytics-vidhya/aerial-cactus-identification-using-transfer-learning-49c7f98e2259

Transfer learning is a useful strategy for image tasks such as classification and detection when we don't have enough data to train a model from scratch. It lets us reuse a pre-trained architecture with little modification, for example by adding a layer on top of the model and training only that layer. Compared with training from scratch, far fewer parameters need to be trained, which lowers the computational cost. This blog applies transfer learning to detect whether an image contains a cactus, using a dataset from Kaggle.

Dataset:

The dataset comes from Kaggle's 'Aerial Cactus Identification' competition. We can read it with pandas; the following lines load the csv files.


import pandas as pd
df_train = pd.read_csv('/kaggle/input/aerial-cactus-identification/train.csv')
df_test = pd.read_csv('/kaggle/input/aerial-cactus-identification/test.csv')

The csv file holds each image id and a label saying whether the image contains a cactus, so this is a binary classification problem. To see how many examples each class has we can use the value_counts method.

df_train['has_cactus'].value_counts()

The snippet above outputs the number of images in each class.
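Raw counts are easier to judge next to class proportions. The sketch below uses a tiny made-up dataframe (the ids and labels are hypothetical, not taken from train.csv) to show value_counts with and without normalize=True:

```python
import pandas as pd

# Hypothetical stand-in for train.csv: 'id' is a file name, 'has_cactus' is 0 or 1.
toy = pd.DataFrame({
    "id": ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
    "has_cactus": [1, 1, 1, 0],
})

counts = toy["has_cactus"].value_counts()                # absolute count per class
shares = toy["has_cactus"].value_counts(normalize=True)  # fraction per class

print(counts)
print(shares)
```

If one class dominates, accuracy alone can look better than the model really is, so it is worth checking the proportions before training.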

To visualize an image from the dataset we can use the OpenCV library together with matplotlib. The following snippet shows a picture.

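import cv2
import matplotlib.pyplot as plt
img = cv2.imread('/home/aditya123/Downloads/aerial-cactus-identification/train/0a02cef73a1660adad8fc43b07fa9d23.jpg')
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB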

The output of the above snippet is the selected training image.
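One detail to keep in mind when mixing the two libraries: cv2.imread returns the colour channels in BGR order, while plt.imshow interprets them as RGB, so colours look swapped unless the channels are reordered. A minimal sketch on a made-up one-pixel image (no file is read):

```python
import numpy as np

# A 1x1 "image" in OpenCV's BGR order: pure blue.
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last axis swaps the channel order from BGR to RGB.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())
```

The same reordering can be done with cv2.cvtColor and the cv2.COLOR_BGR2RGB flag.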

Preprocessing:

The dataset provided by Kaggle keeps all the images in one folder and the corresponding labels in a separate csv file, which maps the filename of each training image to its class. To split the data into training and validation sets I have used numpy: the first 15000 records form the training data and the rest form the validation data. To feed the data to the model I have used the flow_from_dataframe method, which lets us use ImageDataGenerator for data augmentation. The snippet below shows how.

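from tensorflow.keras.preprocessing.image import ImageDataGenerator  # assumes the Keras bundled with TensorFlow
# train_dir is the folder that holds the training images.
# Recent Keras versions require string labels when class_mode='binary'.
df_train['has_cactus'] = df_train['has_cactus'].astype(str)
datagen = ImageDataGenerator(rescale=1/255)
batch_size = 150
# First 15000 rows for training (the slice end is exclusive), the rest for validation.
train_generator = datagen.flow_from_dataframe(
    dataframe=df_train[:15000], directory=train_dir, x_col='id', y_col='has_cactus',
    class_mode='binary', batch_size=batch_size, target_size=(299, 299))
validation_generator = datagen.flow_from_dataframe(
    dataframe=df_train[15000:], directory=train_dir, x_col='id', y_col='has_cactus',
    class_mode='binary', batch_size=batch_size, target_size=(299, 299))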

The only augmentation I apply is rescaling the pixel values to the range 0 to 1. flow_from_dataframe then yields batches of images, with pixel values between 0 and 1, together with their labels. Our dataset is now ready to be fed to the model.
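To make the rescaling concrete, here is a minimal sketch of the arithmetic behind rescale=1/255, applied to a made-up 2x2 patch rather than a real image. The array values are hypothetical; only the multiplication mirrors what the generator does.

```python
import numpy as np

# Hypothetical 2x2 grayscale patch with 8-bit pixel values.
patch = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Same arithmetic as rescale=1/255: multiply every pixel by 1/255.
scaled = patch.astype(np.float32) * (1 / 255)

print(scaled.min(), scaled.max())
```

Every value now lies between 0 and 1, which keeps the inputs in a small, consistent range for the network.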

Modelling:

With the dataset preprocessed, it is time to build the model. We could always create a model from scratch, but that is not a great idea here: training would take a long time because of the large amount of computation, and there is no guarantee that the model would converge well. Instead we reuse the weights of an existing model trained on the ImageNet dataset while keeping some parts of the network open for training.

For this problem we use the InceptionV3 architecture. A detailed analysis of the original Inception architecture, on which InceptionV3 builds, can be found at https://arxiv.org/pdf/1409.4842.pdf. Here we are concerned with the implementation of the model. The model is 48 layers deep and has been trained on the ImageNet dataset, which has 1000 classes. Because it has been trained on such a wide range of images, the features it has learned transfer well to many other image tasks.

To load the model we can use the Keras library. The following snippet loads it.

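from tensorflow.keras.applications import InceptionV3  # assumes the Keras bundled with TensorFlow
model = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))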

We set the input shape to (299, 299, 3) because that is the input size the original InceptionV3 model was trained with.

We can either freeze the weights of the original architecture or train part of it on our new dataset. Here I have chosen to train part of the original architecture. For example, the following snippet freezes the weights of the first 5 layers.

for layer in model.layers[:5]:
    layer.trainable = False

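The weights of all remaining layers will be updated using our training dataset. Note that Keras counts every convolution, batch normalization and activation as a separate entry in model.layers, so the number of layers left trainable is much larger than the 43 suggested by the 48-layer depth; len(model.layers) gives the exact figure.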

Next we create the rest of the architecture. We add dense layers so that the network can be used for classification. To pass the features produced by the earlier layers into a dense layer we first need to flatten them. These dense layers take the output of the Inception network as input.

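from tensorflow.keras.layers import Flatten, Dense, Dropout  # assumes the Keras bundled with TensorFlow
x = model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.2)(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.2)(x)
pred = Dense(1, activation='sigmoid')(x)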

We use a sigmoid activation on the output layer because this is a binary classification problem. Using the Keras functional API we can define the final model as

from tensorflow.keras.models import Model  # assumes the Keras bundled with TensorFlow
model_final = Model(inputs=model.input, outputs=pred)
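For a rough sense of how large the added classification head is, the parameter counts can be worked out by hand. This is a back-of-the-envelope sketch, assuming that InceptionV3 without its top layers turns a 299x299x3 input into an 8x8x2048 feature map; model.output_shape confirms the value on a given setup. The helper dense_params is introduced here purely for illustration.

```python
# Assumption: with include_top=False and a 299x299x3 input, InceptionV3 ends in
# an 8 x 8 x 2048 feature map. Check model.output_shape to confirm.
flat_units = 8 * 8 * 2048

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

head = [
    dense_params(flat_units, 256),  # Dense(256) after Flatten
    dense_params(256, 512),         # Dense(512)
    dense_params(512, 1),           # Dense(1) sigmoid output
]
print(flat_units, head, sum(head))
```

Almost all of the head's parameters sit in the first dense layer, because flattening produces a very wide input. Dropout layers add no parameters.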

To train the model later with fit_generator we need to specify a loss function, an optimizer and a metric to monitor. The following snippet does that.

model_final.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['acc'])
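To see why a sigmoid output and binary cross-entropy fit together, the two formulas can be written out in plain Python. This is an illustrative sketch of the maths for a single example, not the Keras implementation; the function names are introduced here for the example.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p):
    """Loss for one example; y_true is 0 or 1, p is the predicted probability of class 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

p = sigmoid(2.0)                            # a fairly confident "cactus" prediction
loss_if_cactus = binary_crossentropy(1, p)  # small: the prediction agrees with the label
loss_if_empty = binary_crossentropy(0, p)   # large: the prediction contradicts the label
print(p, loss_if_cactus, loss_if_empty)
```

The loss grows quickly when the model is confident and wrong, which is the behaviour we want the optimizer to penalize.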

Now let us look at the fit_generator method. It is well suited to very large datasets because it consumes data batch by batch instead of loading everything at once. It is also useful when image augmentation is applied: the augmented data is generated on the fly rather than stored as a static dataset, so a method that accepts batches from a generator is really handy. (In TensorFlow 2.1 and later, fit_generator is deprecated and fit accepts generators directly.)

Another parameter of fit_generator is steps_per_epoch. We define it as the number of training data points divided by the batch size. It is important because it tells the model when an epoch ends: the data generator loops indefinitely, so by itself it gives no signal that every sample has been seen once. validation_steps plays the same role for the validation generator.

# Integer division: the step counts must be whole numbers.
training_steps = train_generator.n // train_generator.batch_size
val_steps = validation_generator.n // validation_generator.batch_size
model_final.fit_generator(train_generator, steps_per_epoch=training_steps,
                          validation_data=validation_generator, validation_steps=val_steps)

The snippet above trains the model in the way we just discussed.
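The role of steps_per_epoch is easier to see with a toy generator. The sketch below is a plain-Python stand-in for a data generator that never stops (endless_batches and the 12-sample dataset are hypothetical); with the article's numbers, 15000 training images and a batch size of 150, the same arithmetic gives 100 steps.

```python
import itertools

def endless_batches(items, batch_size):
    """Yield batches forever, wrapping around to the start of the data."""
    stream = itertools.cycle(items)
    while True:
        yield [next(stream) for _ in range(batch_size)]

samples = list(range(12))                     # hypothetical dataset of 12 sample ids
batch_size = 4
steps_per_epoch = len(samples) // batch_size  # 3 batches cover the data once

gen = endless_batches(samples, batch_size)
epoch = [next(gen) for _ in range(steps_per_epoch)]
seen = sorted(x for batch in epoch for x in batch)
next_epoch_first = next(gen)                  # the generator simply keeps going
print(steps_per_epoch, seen == samples, next_epoch_first)
```

After steps_per_epoch batches every sample has been seen exactly once, and the next batch already belongs to the following epoch. The generator itself never signals that boundary, which is why the training loop has to be told the step count.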

Here I have trained the model for only one epoch, after which I got a validation accuracy of 77 percent. Trying different architectures may give better results.

The whole code can be found at https://github.com/mohantyaditya/Aerial-Cactus-Detection.
