Reading "Hands-On Transfer Learning with Python": A First Try at Unleashing the Power of Transfer Learning

阿尔法旺旺

Article link: https://blog.csdn.net/yingwei13mei/article/details/90145197

Main topics of this chapter
• The need for transfer learning
• Building Convolutional Neural Network (CNN) models from scratch:
  ◦ Building a basic CNN model
  ◦ Improving our CNN model with regularization
  ◦ Improving our CNN model with image augmentation
• Leveraging transfer learning with pretrained CNN models:
  ◦ Using a pretrained model as a feature extractor
  ◦ Improving our pretrained model with image augmentation
  ◦ Improving our pretrained model with fine-tuning
• Model performance evaluation
The code for this chapter will be available in the Chapter 5 folder in the GitHub repository at https://github.com/dipanjanS/handson-transfer-learning-with-python which you can refer to as needed to follow along with the chapter.

The need for transfer learning

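Note that transfer learning is not specific to deep learning; the idea existed long before deep learning.
The idea here is to treat models pretrained for image classification as experts and reuse their knowledge, so that we can solve our own problem despite having only a small number of training samples.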

Formulating our real-world problem

The dataset we will be using comes from the Dogs vs. Cats challenge (https://www.kaggle.com/c/dogs-vs-cats/data), and our primary objective is to build a model that can successfully recognize and categorize images as either a cat or a dog.
To start, download the train.zip file from the dataset page and store it on your local system. Once downloaded, unzip it into a folder. This folder will contain 25,000 images of dogs and cats; that is, 12,500 images per category.

Building our dataset

Datasets Builder.ipynb: we have 3,000 images for training, 1,000 images for validation, and 1,000 images for our test dataset.
Cat datasets: (1500,) (500,) (500,)
Dog datasets: (1500,) (500,) (500,)

Test Images	Validation Images	Training Images
1000	1000	3000
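
The split above can be sketched as follows. This is a minimal illustration and not the notebook's actual code: the helper split_files, the seed, and the generated file names are assumptions made here, and only the split sizes (1,500 / 500 / 500 per class) come from the post.

```python
import random

def split_files(files, n_train, n_val, n_test, seed=42):
    """Randomly split file names into disjoint train/validation/test lists."""
    if n_train + n_val + n_test > len(files):
        raise ValueError("not enough files for the requested split sizes")
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = sorted(files)       # sort first so the result does not depend on input order
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test

# Hypothetical file names standing in for the 12,500 images per class.
cat_files = ["cat.%d.jpg" % i for i in range(12500)]
dog_files = ["dog.%d.jpg" % i for i in range(12500)]

cat_train, cat_val, cat_test = split_files(cat_files, 1500, 500, 500)
dog_train, dog_val, dog_test = split_files(dog_files, 1500, 500, 500)

print("train:", len(cat_train) + len(dog_train),
      "validation:", len(cat_val) + len(dog_val),
      "test:", len(cat_test) + len(dog_test))
```

Slicing a single shuffled list keeps the three subsets disjoint, so no test image can leak into training.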

Formulating our approach

We will start by building simple CNN models from scratch, then try to improve using techniques such as regularization and image augmentation. Then, we will try and leverage pretrained models to unleash the true power of transfer learning!

Building CNN models from scratch

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.models import Sequential
from keras import optimizers

# 150x150 RGB input, consistent with the (None, 148, 148, 16) shape in the summary below
input_shape = (150, 150, 3)

model = Sequential()

# three convolution + max-pooling stages
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# dense classifier with a single sigmoid output (cat vs. dog)
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(),
              metrics=['accuracy'])

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 148, 148, 16)      448       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 74, 74, 16)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 72, 72, 64)        9280      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 36992)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               18940416  
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513       
=================================================================
Total params: 19,024,513
Trainable params: 19,024,513
Non-trainable params: 0
_________________________________________________________________
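
The shapes and parameter counts in this summary can be checked by hand. The sketch below redoes the arithmetic in plain Python, assuming a 150x150x3 input, 3x3 convolutions without padding, and 2x2 max pooling; the helper names are made up for this illustration.

```python
def conv2d_valid(h, w, c_in, filters, k=3):
    """Output shape and parameter count of a k x k convolution, stride 1, no padding."""
    params = (k * k * c_in + 1) * filters   # +1 is the bias of each filter
    return (h - k + 1, w - k + 1, filters), params

def max_pool(h, w, c, pool=2):
    """Output shape of non-overlapping max pooling (sizes are floored)."""
    return (h // pool, w // pool, c)

def dense_params(n_in, units):
    """Weights plus one bias per unit."""
    return (n_in + 1) * units

shape = (150, 150, 3)
total = 0
for filters in (16, 64, 128):
    shape, params = conv2d_valid(*shape, filters)
    total += params
    shape = max_pool(*shape)

flat = shape[0] * shape[1] * shape[2]
total += dense_params(flat, 512) + dense_params(512, 1)

print(shape, flat, total)   # (17, 17, 128) 36992 19024513
```

Almost all of the 19 million parameters sit in the first dense layer (36,992 x 512 weights), which is one reason this model overfits so easily on 3,000 images.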

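The basic model overfits, so we add dropout for regularization (this version also has one more convolution block and one more dense layer).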

model = Sequential()
# convolutional and pooling layers
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# dense classifier with dropout after each hidden layer
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(),
              metrics=['accuracy'])

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 148, 148, 16)      448       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 74, 74, 16)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 72, 72, 64)        9280      
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 6272)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 512)               3211776   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,706,113
Trainable params: 3,706,113
Non-trainable params: 0
_________________________________________________________________
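
To make the role of the Dropout(0.3) layers more concrete, here is a rough pure-Python sketch of inverted dropout on a list of activations. It illustrates the idea only and is not how Keras implements the layer; the function name and arguments are assumptions of this sketch.

```python
import random

def dropout(values, rate, training, rng=random):
    """Inverted dropout: zero each value with probability `rate` during training
    and scale the survivors by 1 / (1 - rate); do nothing at inference time."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

activations = [1.0] * 10000
dropped = dropout(activations, rate=0.3, training=True, rng=random.Random(0))

zero_fraction = dropped.count(0.0) / len(dropped)   # close to the dropout rate
mean_after = sum(dropped) / len(dropped)            # stays close to the original mean
print(round(zero_fraction, 2), round(mean_after, 2))
```

Scaling the surviving activations keeps their expected value unchanged, so the layer can simply be switched off at prediction time.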

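This improves things slightly: validation accuracy is around 78%.
A major cause of the overfitting is that, with so little data, the model sees exactly the same training samples in every epoch, so we turn to data augmentation.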

from keras.preprocessing.image import ImageDataGenerator

# random zoom, rotation, shift, shear and flip are applied to the training images only
train_datagen = ImageDataGenerator(rescale=1./255, zoom_range=0.3, rotation_range=50,
                                   width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,
                                   horizontal_flip=True, fill_mode='nearest')

# validation images are only rescaled
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow(train_imgs, train_labels_enc, batch_size=30)
val_generator = val_datagen.flow(validation_imgs, validation_labels_enc, batch_size=20)
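
As a rough intuition for two of these options, the sketch below applies a horizontal flip and a width shift with nearest-pixel filling to a tiny image stored as nested lists. The helpers are illustrative stand-ins written for this note, not the Keras implementation, and they ignore the random sampling of shift amounts that ImageDataGenerator performs.

```python
def horizontal_flip(img):
    """Mirror every row of a 2D image (list of rows)."""
    return [row[::-1] for row in img]

def shift_right_nearest(img, pixels):
    """Shift columns right by `pixels`, filling the gap by repeating the edge pixel."""
    if pixels <= 0:
        return [list(row) for row in img]
    shifted = []
    for row in img:
        fill = [row[0]] * min(pixels, len(row))
        kept = list(row[:max(len(row) - pixels, 0)])
        shifted.append(fill + kept)
    return shifted

image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]

print(horizontal_flip(image))         # [[4, 3, 2, 1], [8, 7, 6, 5]]
print(shift_right_nearest(image, 1))  # [[1, 1, 2, 3], [5, 5, 6, 7]]
```

Each epoch then sees a slightly different variant of every training image, which is what reduces the "same samples every epoch" problem.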

Leveraging transfer learning with pretrained CNN models

Pretrained models are used in the following two popular ways when building new models or reusing them:
• Using a pretrained model as a feature extractor
• Fine-tuning the pretrained model

Understanding the VGG-16 model


For the last model, we will apply fine-tuning to the VGG model, where we will unfreeze the last two blocks (Block 4 and Block 5) so that their weights get updated in each epoch (per batch of data) as we train our own model.

Pretrained CNN model as a feature extractor

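As the code below shows, we drop the final classifier part of the VGG-16 model (include_top=False), so we will build our own classifier and use VGG as a feature extractor.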

from keras.applications import vgg16
from keras.models import Model
import keras

vgg = vgg16.VGG16(include_top=False, weights='imagenet', 
                                     input_shape=input_shape)

output = vgg.layers[-1].output
output = keras.layers.Flatten()(output)

vgg_model = Model(vgg.input, output)
vgg_model.trainable = False

for layer in vgg_model.layers:
    layer.trainable = False

vgg_model.summary()

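Note that vgg = vgg16.VGG16(include_top=False, weights='imagenet', input_shape=input_shape) automatically downloads the VGG weights file vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5 from GitHub. The file is large, and the download may be interrupted by a slow connection or by network blocking, so you can download the file in advance and put it in the expected directory. The required location is:

On Linux:
cd ~/.keras/models
On Windows:
C:\Users\XXX\.keras\models\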
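
A small check like the following can confirm whether the file is already in place before building the model. It assumes the default cache location under the user's home directory described above; the helper name is made up for this sketch.

```python
import os

WEIGHTS_FILE = "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"

def keras_weights_path(filename=WEIGHTS_FILE):
    """Expected location of a cached weights file (assumes the default ~/.keras/models)."""
    return os.path.join(os.path.expanduser("~"), ".keras", "models", filename)

path = keras_weights_path()
status = "found" if os.path.isfile(path) else "missing"
print(path, "->", status)
```

os.path.expanduser("~") resolves to the home directory on both Linux and Windows, so the same check covers both paths listed above.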

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 8192)              0         
=================================================================
Total params: 14,714,688
Trainable params: 0
Non-trainable params: 14,714,688
_________________________________________________________________
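
The numbers in this summary can be reproduced with a few lines of arithmetic. The sketch below assumes what the summary itself shows: five blocks of 3x3 convolutions with 'same' padding, each followed by a 2x2 max pooling that halves the spatial size (rounding down). The function name is an assumption of this illustration.

```python
# (filters, number of conv layers) for the five VGG-16 convolution blocks
VGG16_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]

def vgg16_notop(h, w, c):
    """Output shape and parameter count of the VGG-16 convolutional base."""
    params = 0
    for filters, n_convs in VGG16_BLOCKS:
        for _ in range(n_convs):
            params += (3 * 3 * c + 1) * filters   # 'same' padding keeps h and w
            c = filters
        h, w = h // 2, w // 2                     # 2x2 max pooling, stride 2
    return (h, w, c), params

shape, params = vgg16_notop(150, 150, 3)
flat = shape[0] * shape[1] * shape[2]
print(shape, flat, params)   # (4, 4, 512) 8192 14714688
```

The flattened size of 8,192 is the length of the bottleneck feature vector that the classifier receives for each image.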

To confirm that the VGG-16 layers are really frozen, use the following code:

import pandas as pd
pd.set_option('max_colwidth', -1)

layers = [(layer, layer.name, layer.trainable) for layer in vgg_model.layers]
pd.DataFrame(layers, columns=['Layer Type', 'Layer Name', 'Layer Trainable'])


Next, extract the bottleneck features.

def get_bottleneck_features(model, input_imgs):
    
    features = model.predict(input_imgs, verbose=0)
    return features
  
train_features_vgg = get_bottleneck_features(vgg_model, train_imgs_scaled)
validation_features_vgg = get_bottleneck_features(vgg_model, validation_imgs_scaled)

print('Train Bottleneck Features:', train_features_vgg.shape, 
      '\tValidation Bottleneck Features:', validation_features_vgg.shape)

Train Bottleneck Features: (3000, 8192) Validation Bottleneck Features: (1000, 8192)

Now build the model.

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from keras.models import Sequential
from keras import optimizers

input_shape = vgg_model.output_shape[1]

model = Sequential()
model.add(InputLayer(input_shape=(input_shape,)))
model.add(Dense(512, activation='relu', input_dim=input_shape))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['accuracy'])

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 512)               4194816   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 513       
=================================================================
Total params: 4,457,985
Trainable params: 4,457,985
Non-trainable params: 0

Start training the model.

history = model.fit(x=train_features_vgg, y=train_labels_enc,
                    validation_data=(validation_features_vgg, validation_labels_enc),
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1)

Pretrained CNN model as a feature extractor with image augmentation

train_datagen = ImageDataGenerator(rescale=1./255, zoom_range=0.3, rotation_range=50,
                                   width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,
                                   horizontal_flip=True, fill_mode='nearest')

val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow(train_imgs, train_labels_enc, batch_size=30)
val_generator = val_datagen.flow(validation_imgs, validation_labels_enc, batch_size=20)

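This time we do not extract bottleneck features as before, because we will train on the data generators; instead, we pass the vgg_model object into our own model as its first layer.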

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from keras.models import Sequential
from keras import optimizers

model = Sequential()
model.add(vgg_model)
model.add(Dense(512, activation='relu', input_dim=input_shape))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['accuracy'])

model.summary()
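
The post does not show the training call for this generator-based model. When training from generators, the number of batches per epoch usually has to be stated explicitly. A minimal sketch of that arithmetic, assuming one full pass over the 3,000 training and 1,000 validation images per epoch with the batch sizes used above (the helper name and the commented-out call are assumptions, not the author's code):

```python
import math

def steps_for(n_samples, batch_size):
    """Number of batches needed to cover n_samples once."""
    return math.ceil(n_samples / batch_size)

steps_per_epoch = steps_for(3000, 30)    # training generator uses batch_size=30
validation_steps = steps_for(1000, 20)   # validation generator uses batch_size=20
print(steps_per_epoch, validation_steps)   # 100 50

# Assumed training call for the Keras version used in this post:
# history = model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch,
#                               epochs=epochs, validation_data=val_generator,
#                               validation_steps=validation_steps, verbose=1)
```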


Pretrained CNN model with finetuning and image augmentation

Keep the first three blocks frozen and unfreeze blocks 4 and 5.

vgg_model.trainable = True

set_trainable = False
for layer in vgg_model.layers:
    if layer.name in ['block5_conv1', 'block4_conv1']:
        set_trainable = True
    if set_trainable:
        layer.trainable = True
    else:
        layer.trainable = False

print("Trainable layers:", vgg_model.trainable_weights)

layers = [(layer, layer.name, layer.trainable) for layer in vgg_model.layers]
pd.DataFrame(layers, columns=['Layer Type', 'Layer Name', 'Layer Trainable'])
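
The effect of the set_trainable flag can be traced without loading VGG. The sketch below runs the same loop over stand-in objects carrying the layer names from the model summary; the Layer class and the names "input_1" and "flatten" are placeholders invented for this illustration.

```python
class Layer:
    """Hypothetical stand-in for a Keras layer: just a name and a trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

# Rebuild the layer names of the VGG-16 base in order.
names = ["input_1"]
for block, n_convs in enumerate([2, 2, 3, 3, 3], start=1):
    names += ["block%d_conv%d" % (block, i) for i in range(1, n_convs + 1)]
    names.append("block%d_pool" % block)
names.append("flatten")
layers = [Layer(name) for name in names]

# Same logic as above: everything from block4_conv1 onwards becomes trainable.
set_trainable = False
for layer in layers:
    if layer.name in ("block5_conv1", "block4_conv1"):
        set_trainable = True
    layer.trainable = set_trainable

frozen = [layer.name for layer in layers if not layer.trainable]
trainable = [layer.name for layer in layers if layer.trainable]
print("frozen up to:", frozen[-1])
print("trainable from:", trainable[0])
```

Because the flag is never reset, reaching block4_conv1 is enough to unfreeze every later layer; listing block5_conv1 as well does not change the result.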


train_datagen = ImageDataGenerator(rescale=1./255, zoom_range=0.3, rotation_range=50,
                                   width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,
                                   horizontal_flip=True, fill_mode='nearest')

val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow(train_imgs, train_labels_enc, batch_size=30)
val_generator = val_datagen.flow(validation_imgs, validation_labels_enc, batch_size=20)

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from keras.models import Sequential
from keras import optimizers

model = Sequential()
model.add(vgg_model)
model.add(Dense(512, activation='relu', input_dim=input_shape))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-5),
              metrics=['accuracy'])

model.summary()


Evaluating our deep learning models

Model predictions on a sample test image

One test image was run through the five models built above:
• Basic CNN
• CNN with Img Augmentation
• Pre-trained CNN (Transfer Learning)
• Pre-trained CNN with Img Augmentation (Transfer Learning)
• Pre-trained CNN with Fine-tuning & Img Augmentation (Transfer Learning)
Three of the five models classified it correctly and two got it wrong.

Evaluating model performance on test data

from keras.models import load_model

# meu and num2class_label_transformer are helpers defined elsewhere in the chapter's code

# load the five saved models
basic_cnn = load_model('cats_dogs_basic_cnn.h5')
img_aug_cnn = load_model('cats_dogs_cnn_img_aug.h5')
tl_cnn = load_model('cats_dogs_tlearn_basic_cnn.h5')
tl_img_aug_cnn = load_model('cats_dogs_tlearn_img_aug_cnn.h5')
tl_img_aug_finetune_cnn = load_model('cats_dogs_tlearn_finetune_img_aug_cnn.h5')

# Model 1: basic CNN
predictions = basic_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 2: CNN with image augmentation
predictions = img_aug_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 3: pretrained CNN as feature extractor (takes bottleneck features as input)
test_bottleneck_features = get_bottleneck_features(vgg_model, test_imgs_scaled)

predictions = tl_cnn.predict_classes(test_bottleneck_features, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 4: pretrained CNN with image augmentation
predictions = tl_img_aug_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 5: pretrained CNN with fine-tuning and image augmentation
predictions = tl_img_aug_finetune_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))
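
For readers without the book's helper module, the core of this evaluation can be sketched in plain Python. The example below assumes the models output one sigmoid probability per image and that class 1 is treated as the positive class; the function names and the toy labels are made up for illustration and are not results from the post.

```python
def to_class(probabilities, threshold=0.5):
    """Turn sigmoid outputs into 0/1 class labels."""
    return [1 if p > threshold else 0 for p in probabilities]

def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

# Toy example (not real model output).
y_true = [1, 0, 1, 1, 0, 0]
probabilities = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
y_pred = to_class(probabilities)
metrics = binary_metrics(y_true, y_pred)
print(y_pred, metrics)
```

In this toy case four of six predictions are correct, so accuracy is 4/6, while precision, recall and F1 all come out at 2/3.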

The worst and the best model are basic_cnn and the Pre-trained CNN with Fine-tuning & Img Augmentation, respectively.