Reading notes on "Hands-On Transfer Learning with Python": Unleashing the Power of Transfer Learning


Main topics of this chapter
  • The need for transfer learning
  • Building Convolutional Neural Network (CNN) models from scratch:
      • Building a basic CNN model
      • Improving our CNN model with regularization
      • Improving our CNN model with image augmentation
  • Leveraging transfer learning with pretrained CNN models:
      • Using a pretrained model as a feature extractor
      • Improving our pretrained model with image augmentation
      • Improving our pretrained model with fine-tuning
  • Model performance evaluation
The code for this chapter is available in the Chapter 5 folder of the GitHub repository at https://github.com/dipanjanS/handson-transfer-learning-with-python, which you can refer to as needed to follow along with the chapter.

The need for transfer learning

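Note that transfer learning is not limited to deep learning; the idea existed long before deep learning became popular.
The idea here is to take pretrained models that are already experts at image classification and use them to solve our problem under the constraint of having only a small number of data samples.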

Formulating our real-world problem

The dataset we will be using comes from the Dogs vs. Cats challenge (https://www.kaggle.com/c/dogs-vs-cats/data), and our primary objective is to build a model that can successfully classify each image as either a cat or a dog.
To start, download the train.zip file from the dataset page and store it on your local system. Once downloaded, unzip it into a folder. This folder will contain 25,000 images of dogs and cats; that is, 12,500 images per category.

Building our dataset

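Using the Datasets Builder.ipynb notebook, we build a smaller dataset with 3,000 images for training, 1,000 images for validation, and 1,000 images for testing. The shapes of the per-class training, validation, and test file lists are: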
Cat datasets: (1500,) (500,) (500,)
Dog datasets: (1500,) (500,) (500,)

Test Images    Validation Images    Training Images
1000           1000                 3000
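The split above can be sketched roughly as follows. This is a minimal illustration only, not the notebook's actual code: the `split_files` helper, the seeds, and the file names are hypothetical stand-ins for the 12,500 images per class.

```python
import random

def split_files(files, n_train=1500, n_val=500, n_test=500, seed=42):
    """Randomly pick disjoint train/validation/test subsets for one class."""
    needed = n_train + n_val + n_test
    if needed > len(files):
        raise ValueError("not enough files for the requested split")
    rng = random.Random(seed)
    picked = rng.sample(files, needed)  # sampling without replacement
    train = picked[:n_train]
    val = picked[n_train:n_train + n_val]
    test = picked[n_train + n_val:]
    return train, val, test

# Hypothetical file names standing in for the images of each class.
cat_files = [f"train/cat.{i}.jpg" for i in range(12500)]
dog_files = [f"train/dog.{i}.jpg" for i in range(12500)]

cat_train, cat_val, cat_test = split_files(cat_files)
dog_train, dog_val, dog_test = split_files(dog_files, seed=43)

print("Cat datasets:", len(cat_train), len(cat_val), len(cat_test))
print("Dog datasets:", len(dog_train), len(dog_val), len(dog_test))
```

Because the three subsets are slices of a single sample drawn without replacement, no image can end up in more than one of them.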

Formulating our approach

We will start by building simple CNN models from scratch, then try to improve using techniques such as regularization and image augmentation. Then, we will try and leverage pretrained models to unleash the true power of transfer learning!

Building CNN models from scratch

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.models import Sequential
from keras import optimizers

# 150x150 RGB input, consistent with the (None, 148, 148, 16) output
# of the first convolution layer in the summary below
input_shape = (150, 150, 3)

model = Sequential()

# three convolution + max-pooling blocks
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# dense classifier with a single sigmoid output (cat vs. dog)
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(),
              metrics=['accuracy'])

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 148, 148, 16)      448       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 74, 74, 16)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 72, 72, 64)        9280      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 36992)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               18940416  
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513       
=================================================================
Total params: 19,024,513
Trainable params: 19,024,513
Non-trainable params: 0
_________________________________________________________________
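The numbers in this summary can be checked by hand. The sketch below assumes a 150x150x3 input (inferred from the summary itself), 3x3 convolutions without padding, and 2x2 max pooling; the helper names are made up for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # one weight per kernel cell per input channel, plus one bias per filter
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    # one weight per input unit, plus one bias, for every output unit
    return (in_units + 1) * out_units

def conv_pool_output(size, kernel=3, pool=2):
    # an unpadded convolution shrinks the side by kernel - 1,
    # then max pooling floor-divides it by the pool size
    return (size - kernel + 1) // pool

size, channels = 150, 3  # assumed input: 150x150 RGB
total = 0
for filters in (16, 64, 128):
    total += conv2d_params(3, 3, channels, filters)
    size = conv_pool_output(size)
    channels = filters

flat_units = size * size * channels  # what Flatten() produces
total += dense_params(flat_units, 512) + dense_params(512, 1)

print("flatten units:", flat_units)
print("total params:", total)
```

Almost all of the parameters sit in the first dense layer, which is fed by the large flattened feature map.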

The basic model overfits, so we add one more convolution block and regularize the dense layers with dropout.

model = Sequential()

# convolutional and pooling layers
model.add(Conv2D(16, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# dense layers, each followed by dropout for regularization
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(),
              metrics=['accuracy'])

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 148, 148, 16)      448       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 74, 74, 16)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 72, 72, 64)        9280      
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 6272)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 512)               3211776   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,706,113
Trainable params: 3,706,113
Non-trainable params: 0
_________________________________________________________________

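This improves things slightly: validation accuracy is around 78%.
An important cause of the overfitting is that, with so little data, the model sees exactly the same training samples in every epoch. This is why we need data augmentation.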

from keras.preprocessing.image import ImageDataGenerator

# random zoom, rotation, shift, shear and flip for the training images only
train_datagen = ImageDataGenerator(rescale=1./255, zoom_range=0.3, rotation_range=50,
                                   width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,
                                   horizontal_flip=True, fill_mode='nearest')

# validation images are only rescaled, never augmented
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow(train_imgs, train_labels_enc, batch_size=30)
val_generator = val_datagen.flow(validation_imgs, validation_labels_enc, batch_size=20)
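To see why augmentation helps, here is a toy sketch in pure Python. It is not how ImageDataGenerator is implemented; the `random_augment` helper and the tiny 2x4 "image" are invented for illustration. It only covers a random horizontal flip and a horizontal shift whose gaps are filled with the nearest edge pixel, loosely mirroring horizontal_flip=True and fill_mode='nearest'.

```python
import random

def random_augment(image, rng, max_shift=1):
    """Return a randomly flipped and shifted copy of a 2D list of pixels."""
    rows = [row[:] for row in image]        # copy, so the original stays intact
    if rng.random() < 0.5:
        rows = [row[::-1] for row in rows]  # horizontal flip
    shift = rng.randint(-max_shift, max_shift)
    if shift > 0:    # shift right, fill the gap with the left edge pixel
        rows = [[row[0]] * shift + row[:-shift] for row in rows]
    elif shift < 0:  # shift left, fill the gap with the right edge pixel
        rows = [row[-shift:] + [row[-1]] * (-shift) for row in rows]
    return rows

image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
rng = random.Random(0)

# One augmented version of the same image per "epoch".
epochs_seen = [random_augment(image, rng) for _ in range(20)]
variants = {tuple(map(tuple, img)) for img in epochs_seen}
print("distinct variants seen over 20 epochs:", len(variants))
```

The same source image shows up in several different forms across epochs, so the model cannot simply memorize one fixed set of pixels.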

Leveraging transfer learning with pretrained CNN models

Pretrained models are used in the following two popular ways when building new models or reusing them:
  • Using a pretrained model as a feature extractor
  • Fine-tuning the pretrained model

Understanding the VGG-16 model

For the last model, we will apply fine-tuning to the VGG model, where we will unfreeze the last two blocks (Block 4 and Block 5) so that their weights get updated in each epoch (per batch of data) as we train our own model.

Pretrained CNN model as a feature extractor

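Note that we drop the final classifier part of the VGG-16 model (include_top=False), because we will build our own classifier on top and use VGG purely as a feature extractor.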

from keras.applications import vgg16
from keras.models import Model
import keras

vgg = vgg16.VGG16(include_top=False, weights='imagenet',
                  input_shape=input_shape)

output = vgg.layers[-1].output
output = keras.layers.Flatten()(output)

vgg_model = Model(vgg.input, output)
vgg_model.trainable = False

for layer in vgg_model.layers:
    layer.trainable = False

vgg_model.summary()

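Note that vgg16.VGG16(include_top=False, weights='imagenet', input_shape=input_shape) automatically downloads the weights file vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5 from GitHub. The file is fairly large, so the download may be interrupted by a slow connection or by the firewall. To avoid this, download the file in advance and place it in the Keras models directory:

  • On Linux:
    cd ~/.keras/models
  • On Windows:
    C:\Users\XXX\.keras\models\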
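A quick way to check whether the file is where Keras expects it is sketched below. It assumes the default cache location under the home directory described above; the `expected_weights_path` helper is hypothetical, and a customized Keras home directory would change the path.

```python
import os

WEIGHTS_FILE = "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"

def expected_weights_path(home=None):
    """Build the default cache path: <home>/.keras/models/<weights file>."""
    home = home or os.path.expanduser("~")
    return os.path.join(home, ".keras", "models", WEIGHTS_FILE)

path = expected_weights_path()
status = "found" if os.path.isfile(path) else "missing"
print(path, "->", status)
```

If the file is reported as missing, Keras will try to download it the first time VGG16 is created with weights='imagenet'.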

Output of vgg_model.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 8192)              0         
=================================================================
Total params: 14,714,688
Trainable params: 0
Non-trainable params: 14,714,688
_________________________________________________________________

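To confirm that the VGG-16 layers are really frozen, use the following code:

import pandas as pd
# None disables column truncation (the older -1 value is deprecated in recent pandas)
pd.set_option('display.max_colwidth', None)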

layers = [(layer, layer.name, layer.trainable) for layer in vgg_model.layers]
pd.DataFrame(layers, columns=['Layer Type', 'Layer Name', 'Layer Trainable'])

Next, we extract the bottleneck features.

def get_bottleneck_features(model, input_imgs):
    
    features = model.predict(input_imgs, verbose=0)
    return features
  
train_features_vgg = get_bottleneck_features(vgg_model, train_imgs_scaled)
validation_features_vgg = get_bottleneck_features(vgg_model, validation_imgs_scaled)

print('Train Bottleneck Features:', train_features_vgg.shape, 
      '\tValidation Bottleneck Features:', validation_features_vgg.shape)

Train Bottleneck Features: (3000, 8192) Validation Bottleneck Features: (1000, 8192)
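The 8192 in these shapes follows directly from the VGG-16 summary above: the convolutions keep the spatial size, each of the five pooling layers halves it (rounding down), and the last block has 512 channels. A small sketch of that arithmetic, with a made-up helper name:

```python
def vgg16_notop_feature_size(height, width, n_pools=5, channels=512):
    """Spatial size after the pooling layers and the flattened feature length."""
    for _ in range(n_pools):
        height //= 2  # 2x2 max pooling with stride 2, rounded down
        width //= 2
    return height, width, height * width * channels

h, w, n_features = vgg16_notop_feature_size(150, 150)
print((h, w, 512), "->", n_features)
```

With the standard 224x224 input the same function gives a 7x7x512 map, so the feature length depends on the input size chosen for this dataset.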

Now we build the classifier that is trained on these features.

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from keras.models import Sequential
from keras import optimizers

# length of one flattened bottleneck feature vector (8192 here)
input_shape = vgg_model.output_shape[1]

model = Sequential()
model.add(InputLayer(input_shape=(input_shape,)))
model.add(Dense(512, activation='relu', input_dim=input_shape))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

# note: newer Keras versions name the lr argument learning_rate
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['accuracy'])

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 512)               4194816   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 513       
=================================================================
Total params: 4,457,985
Trainable params: 4,457,985
Non-trainable params: 0

Train the model:

# batch_size and epochs are assumed to be defined earlier in the notebook
history = model.fit(x=train_features_vgg, y=train_labels_enc,
                    validation_data=(validation_features_vgg, validation_labels_enc),
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1)

Pretrained CNN model as a feature extractor with image augmentation

train_datagen = ImageDataGenerator(rescale=1./255, zoom_range=0.3, rotation_range=50,
                                   width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,
                                   horizontal_flip=True, fill_mode='nearest')

val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow(train_imgs, train_labels_enc, batch_size=30)
val_generator = val_datagen.flow(validation_imgs, validation_labels_enc, batch_size=20)

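This time we do not extract bottleneck features as before, because we train on data generators. Instead, we pass the vgg_model object as the first layer of our own model.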

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from keras.models import Sequential
from keras import optimizers

model = Sequential()
model.add(vgg_model)
model.add(Dense(512, activation='relu', input_dim=input_shape))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['accuracy'])

model.summary()

Pretrained CNN model with finetuning and image augmentation

We keep the first three blocks frozen and unfreeze Block 4 and Block 5.

vgg_model.trainable = True

set_trainable = False
for layer in vgg_model.layers:
    if layer.name in ['block5_conv1', 'block4_conv1']:
        set_trainable = True
    if set_trainable:
        layer.trainable = True
    else:
        layer.trainable = False

print("Trainable layers:", vgg_model.trainable_weights)

layers = [(layer, layer.name, layer.trainable) for layer in vgg_model.layers]
pd.DataFrame(layers, columns=['Layer Type', 'Layer Name', 'Layer Trainable'])
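The effect of the freezing loop can be reproduced without Keras. The sketch below replays the same "flip the switch at the first matching layer" logic over the layer names and parameter counts copied from the VGG-16 summary earlier in this post; the `trainable_flags` helper is hypothetical.

```python
# (layer name, parameter count), taken from the vgg_model summary above
VGG16_LAYERS = [
    ("block1_conv1", 1792), ("block1_conv2", 36928), ("block1_pool", 0),
    ("block2_conv1", 73856), ("block2_conv2", 147584), ("block2_pool", 0),
    ("block3_conv1", 295168), ("block3_conv2", 590080),
    ("block3_conv3", 590080), ("block3_pool", 0),
    ("block4_conv1", 1180160), ("block4_conv2", 2359808),
    ("block4_conv3", 2359808), ("block4_pool", 0),
    ("block5_conv1", 2359808), ("block5_conv2", 2359808),
    ("block5_conv3", 2359808), ("block5_pool", 0),
]

def trainable_flags(layer_names, unfreeze_from=("block4_conv1", "block5_conv1")):
    """Every layer from the first match onward becomes trainable."""
    flags = {}
    set_trainable = False
    for name in layer_names:
        if name in unfreeze_from:
            set_trainable = True
        flags[name] = set_trainable
    return flags

flags = trainable_flags([name for name, _ in VGG16_LAYERS])
trainable = sum(count for name, count in VGG16_LAYERS if flags[name])
frozen = sum(count for name, count in VGG16_LAYERS if not flags[name])
print("trainable params:", trainable, "frozen params:", frozen)
```

Because the switch is never turned off again, listing 'block5_conv1' next to 'block4_conv1' is redundant: everything after block4_conv1 is already trainable. Most of VGG-16's convolutional weights live in Blocks 4 and 5, so fine-tuning them updates the bulk of the network, which is one reason a small learning rate is used.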

train_datagen = ImageDataGenerator(rescale=1./255, zoom_range=0.3, rotation_range=50,
                                   width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,
                                   horizontal_flip=True, fill_mode='nearest')

val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow(train_imgs, train_labels_enc, batch_size=30)
val_generator = val_datagen.flow(validation_imgs, validation_labels_enc, batch_size=20)

from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from keras.models import Sequential
from keras import optimizers

model = Sequential()
model.add(vgg_model)
model.add(Dense(512, activation='relu', input_dim=input_shape))
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-5),
              metrics=['accuracy'])

model.summary()

Evaluating our deep learning models

Model predictions on a sample test image

We take a single test image and run it through the five models built above:
  • Basic CNN
  • CNN with Img Augmentation
  • Pre-trained CNN (Transfer Learning)
  • Pre-trained CNN with Img Augmentation (Transfer Learning)
  • Pre-trained CNN with Fine-tuning & Img Augmentation (Transfer Learning)
Three of the five models predict the image correctly and two get it wrong.

Evaluating model performance on test data

from keras.models import load_model

# meu (the book's model evaluation utilities) and num2class_label_transformer
# are helpers assumed to be defined or imported earlier in the notebook
basic_cnn = load_model('cats_dogs_basic_cnn.h5')
img_aug_cnn = load_model('cats_dogs_cnn_img_aug.h5')
tl_cnn = load_model('cats_dogs_tlearn_basic_cnn.h5')
tl_img_aug_cnn = load_model('cats_dogs_tlearn_img_aug_cnn.h5')
tl_img_aug_finetune_cnn = load_model('cats_dogs_tlearn_finetune_img_aug_cnn.h5')

# Model 1: basic CNN
predictions = basic_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 2: CNN with image augmentation
predictions = img_aug_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 3: pretrained CNN as feature extractor (expects bottleneck features as input)
test_bottleneck_features = get_bottleneck_features(vgg_model, test_imgs_scaled)

predictions = tl_cnn.predict_classes(test_bottleneck_features, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 4: pretrained CNN with image augmentation
predictions = tl_img_aug_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))

# Model 5: pretrained CNN with fine-tuning and image augmentation
predictions = tl_img_aug_finetune_cnn.predict_classes(test_imgs_scaled, verbose=0)
predictions = num2class_label_transformer(predictions)

meu.display_model_performance_metrics(true_labels=test_labels, predicted_labels=predictions,
                                      classes=list(set(test_labels)))
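For readers without the book's helper module, the kind of numbers such an evaluation reports can be sketched in plain Python. Everything below is an illustrative assumption rather than the book's implementation: the function names, the 0.5 threshold, the encoding cat=0 / dog=1, and the tiny made-up probability list. Thresholding the sigmoid output by hand is also a way around predict_classes, which newer Keras releases no longer provide.

```python
def probabilities_to_labels(probabilities, threshold=0.5, classes=("cat", "dog")):
    """Turn sigmoid outputs into class names (assumed encoding: cat=0, dog=1)."""
    return [classes[1] if p > threshold else classes[0] for p in probabilities]

def binary_metrics(true_labels, predicted_labels, positive="dog"):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

# Made-up sigmoid outputs and labels for five test images.
probabilities = [0.1, 0.8, 0.9, 0.7, 0.3]
true_labels = ["cat", "cat", "dog", "dog", "dog"]

predicted_labels = probabilities_to_labels(probabilities)
metrics = binary_metrics(true_labels, predicted_labels)
print(predicted_labels)
print(metrics)
```

On the balanced test set used here (500 cats, 500 dogs), accuracy alone is already informative, while precision and recall show whether a model favors one class.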

The worst and the best model are, respectively, the basic CNN (basic_cnn) and the pretrained CNN with fine-tuning and image augmentation.