Kaggle Invasive Species Detection with VGG16: A Keras Example

Latest recommended article published on 2024-07-31 17:37:25

qilixuening

Category: Self-Study Exercises. Tags: kaggle, VGG16, invasive species detection, keras

Article link: https://blog.csdn.net/qilixuening/article/details/77511146

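This blog post shows how to use the Keras VGG16 model to tackle Kaggle's Invasive Species Monitoring problem. It covers data preprocessing and the train/validation split, builds and trains a VGG16 model without its fully connected layers, then adds custom fully connected layers, ultimately reaching 86% accuracy on the test set. It also covers ImageDataGenerator for data augmentation and some caveats for test-set prediction.

Summary generated by CSDN using AI technology.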

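According to the description of the Kaggle: Invasive Species Monitoring problem, we need to decide whether an image contains an invasive species, that is, to perform binary classification on the images (0: the image contains no invasive species; 1: the image contains an invasive species). The data provided consists of a training set of 2295 images with labels and a test set of 1531 images. An image classification task like this is clearly well suited to a CNN. The Keras Applications module provides Keras models with pretrained weights, such as Xception, VGG16, VGG19, ResNet50, and InceptionV3 (some of these support only the TensorFlow backend). These models can be used for prediction, feature extraction, and fine-tuning. By using the "bottleneck" features of these models, we can load a pretrained model directly and reduce training cost with little effect on the CNN's accuracy, which is quick and convenient. For demonstration purposes, this article uses only the VGG16 model.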
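To make the bottleneck-feature idea concrete, here is a minimal sketch of a frozen VGG16 convolutional base with a small binary-classification head. The function name `build_bottleneck_classifier`, the 224x224 input size, the 256-unit dense layer, the dropout rate, and the optimizer are all illustrative assumptions, not the settings used later in this article.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16


def build_bottleneck_classifier(input_shape=(224, 224, 3), weights="imagenet"):
    """Frozen VGG16 convolutional base plus a small sigmoid head (hypothetical sizes)."""
    # include_top=False drops VGG16's own fully connected layers.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # keep the pretrained filters fixed

    inputs = layers.Input(shape=input_shape)
    x = base(inputs, training=False)       # bottleneck features, 7x7x512 for 224x224 input
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)   # assumed head width
    x = layers.Dropout(0.5)(x)                    # assumed dropout rate
    outputs = layers.Dense(1, activation="sigmoid")(x)  # P(image contains invasive species)

    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Calling `build_bottleneck_classifier()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is handy for checking shapes offline. Only the head's weights are trainable here, which is where the savings in training cost come from.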

First, import the libraries needed for preprocessing.

import os
import numpy as np
import pandas as pd
import h5py
import matplotlib.pyplot as plt
# Jupyter magic: render plots inline in the notebook
%matplotlib inline

# Every backslash is escaped so the Windows paths contain no invalid escape sequences
trainpath = 'E:\\kaggle\\invasive_species\\train\\'
testpath = 'E:\\kaggle\\invasive_species\\test\\'
n_tr = len(os.listdir(trainpath))
print('num of training files: ', n_tr)

num of training files: 2295

We can first take a look at train_labels.csv. As the table below shows, the data is already shuffled: samples labeled 0 and 1 appear in random order.

train_labels = pd.read_csv('E:\\kaggle\\invasive_species\\train_labels.csv')
train_labels.head()

	name	invasive
0	1	0
1	2	0
2	3	1
3	4	0
4	5	1
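Beyond eyeballing the first rows, the class balance can be checked directly. The sketch below uses a small stand-in DataFrame holding only the five rows shown above, so the printed numbers describe that sample and not the full training set; `class_balance` is a hypothetical helper name.

```python
import pandas as pd

# Illustrative stand-in: only the five rows shown by train_labels.head().
labels = pd.DataFrame({"name": [1, 2, 3, 4, 5], "invasive": [0, 0, 1, 0, 1]})


def class_balance(df, column="invasive"):
    """Return per-class row counts and the fraction of rows labeled 1."""
    counts = df[column].value_counts().sort_index()
    positive_fraction = float(df[column].mean())
    return counts, positive_fraction


counts, positive_fraction = class_balance(labels)
print(counts.to_dict(), positive_fraction)
```

Running the same helper on the real `train_labels` DataFrame would show how many of the 2295 training images belong to each class.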

Next, we can visualize one sample labeled 0 and one labeled 1 to see what they look like:

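from skimage import io, transform

# 1.jpg is labeled 0 (no invasive species) in train_labels.csv
sample_image = io.imread(trainpath + '1.jpg')
# shape is (height, width, channels); only the first two values are printed
print('Height:{0} Width:{1}'.format(*sample_image.shape))
plt.imshow(sample_image)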

Height:866 Width:1154
<matplotlib.image.AxesImage at 0x1f1b54f82b0>

# 3.jpg is labeled 1 (contains an invasive species) in train_labels.csv
sample_image = io.imread(trainpath + '3.jpg')
plt.imshow(sample_image)

# There is one image in the test set that has reversed dimensions.
# print(io.imread(testpath + '1068.jpg').shape)

<matplotlib.image.AxesImage at 0x25073e4d208>
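The comment in the code above mentions a test image with reversed dimensions. One possible way to handle such a case, sketched here with synthetic arrays, is to rotate any image that is taller than it is wide. The helper name `to_landscape` and the array sizes (the 866x1154 shape printed earlier and its swapped counterpart) are assumptions for illustration; the article does not say this is the approach it takes.

```python
import numpy as np


def to_landscape(image):
    """Rotate an H x W x C array by 90 degrees if it is taller than it is wide."""
    height, width = image.shape[:2]
    if height > width:
        # np.rot90 rotates in the plane of the first two axes, swapping H and W.
        return np.rot90(image)
    return image


# Synthetic stand-ins for a portrait and a landscape image.
portrait = np.zeros((1154, 866, 3), dtype=np.uint8)
landscape = np.zeros((866, 1154, 3), dtype=np.uint8)

print(to_landscape(portrait).shape)
print(to_landscape(landscape).shape)
```

If every image is later resized to a fixed square input, the orientation no longer affects the array shape, so a step like this matters mainly when the aspect ratio is preserved.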