Cats vs. Dogs with Transfer Learning (4)

In the previous posts we reached 70% recognition accuracy with TensorFlow 1.4 and 85% with Keras on TF 2.0, but that is still far from good enough for the cat-and-dog feeder product of the final project, so here we continue with VGG16-based transfer learning to push the accuracy up to 97%.

Transfer Learning using VGG16

Based on the accuracy we obtained, it is clear that the classic CNN model we trained cannot classify dogs and cats effectively. The main reason is that our CNN has too few convolutional layers, and each convolutional layer has too few filters, i.e. not enough thickness. In other words, our model is not deep enough to extract image features precisely. Of course we could build a deeper CNN, but training it would take too much time. Moreover, tuning hyperparameters such as the learning rate, activation functions, network structure, and number of training steps is difficult, because so many factors influence the quality of the network. So instead of training our own simple CNN, we can directly use a pretrained CNN such as VGG16 to extract features, then add fully connected layers that use the extracted features to classify dogs and cats. This is called transfer learning.
[Figure: transfer learning, reusing a pretrained convolutional base with a new classifier]
Transfer learning can be applied in three ways: plain transfer learning, feature-vector extraction, and fine-tuning. I will try all three. But first we need to know something about VGG16.

VGG16

VGG was proposed by Simonyan and Zisserman in Very Deep Convolutional Networks for Large-Scale Image Recognition. The model is named after the abbreviation of the Visual Geometry Group at Oxford. The key words in the paper's title are "very deep": as I mentioned before, deeper layers extract higher-level image features, so deep layers and more convolutional filters are very important. VGG placed first in the localization task (and second in classification) at the 2014 ImageNet competition, so its ability to extract features is very powerful. Let's look at the structure of VGG16.
[Figure: the structure of VGG16]
The input of VGG16 should be 224*224 with RGB channels. If the input is not of this form, VGG16 can still be used to extract features, but the results will not be as good. The network then has 13 convolutional layers, interleaved with 5 max-pooling layers, to extract features. Finally, three fully connected layers end in a softmax over 1000 different categories. We can also call model.summary() to inspect this structure in the terminal, as in the screenshot below.
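As a quick illustration of that input requirement, here is a minimal sketch of preparing a single image for VGG16 (the file name cat.0.jpg is only a hypothetical example):

from keras.preprocessing.image import load_img, img_to_array

img = load_img('cat.0.jpg', target_size=(224, 224))  #resize to 224x224, RGB
x = img_to_array(img) / 255.0                        #scale pixels to [0,1], matching the rescale=1./255 used later
x = x.reshape((1,) + x.shape)                        #add a batch dimension: (1, 224, 224, 3)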

The first line, 'Using TensorFlow backend', means that here Keras runs on top of TensorFlow. Keras can also run on Theano, as I mentioned before, but since I installed TensorFlow 2.0.0, TensorFlow is the backend.
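If you want to confirm which backend your Keras installation uses, a one-line check (a minimal sketch, not part of the training scripts) is:

from keras import backend as K
print(K.backend())  #prints 'tensorflow' here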

The second line, 'Model: vgg16', means the pretrained VGG16 weights are loaded into our model. Below it we can see the structure of the loaded VGG16 layer by layer with the corresponding parameter counts. From this output we know that our model loads only the convolutional and max-pooling layers of VGG16; the fully connected layers and the softmax layer are not loaded, so we must add those ourselves to achieve our own goal.
[Figure: terminal output of the VGG16 model summary]
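The summary above can be reproduced with a few lines; this is a minimal sketch rather than part of the training scripts:

from keras.applications import VGG16

conv_base = VGG16(weights='imagenet',
                  include_top=False,        #drop the fully connected and softmax layers
                  input_shape=(224, 224, 3))
conv_base.summary()                         #prints the layer-by-layer structure shown above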

Transfer Learning

Transfer learning here means freezing all layers of VGG16 and then adding fully connected layers on top to classify whatever we want. Let's look at the model structure:
[Figure: the transfer learning model, a frozen VGG16 base plus a new dense classifier]
We simply add one flatten layer to convert the image data from matrix form to a flat vector so that the fully connected network can take over. Then a dense (fully connected) layer learns to separate cats from dogs. To avoid overfitting, we add a dropout layer. Finally, a one-unit dense layer with sigmoid activation tells us whether the image is a cat or a dog.
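Before training, a quick sanity check that the base really is frozen is to count the trainable weight tensors. This check is not in the original script; it is a minimal sketch that builds the same model as the code further below:

from keras import models, layers
from keras.applications import VGG16

conv_base = VGG16(weights='imagenet', include_top=False,
                  input_shape=(224, 224, 3))
conv_base.trainable = False  #freeze the VGG16 convolutional base

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))

#only the two Dense layers keep trainable weights: 2 kernels + 2 biases
print(len(model.trainable_weights))      #expected: 4
print(len(conv_base.trainable_weights))  #expected: 0

With the base frozen, only these classifier weights are updated during training. Let's see the result when training: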
[Figure: training and validation accuracy of the transfer learning model]
Let’s see the real accuracy of this transfer learning model:
[Figure: measured accuracy of the transfer learning model]
This is what we really want, so we can say that the deeper network does work better. Here is the code.

'''Transfer learning: frozen VGG16 base with a new dense classifier'''
from keras import models
from keras import layers
from keras.applications import VGG16
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
import os

base_dir='/Users/cenjun/Desktop/Sensor/catsanddogs_big'

train_dir=os.path.join(base_dir,'train')
test_dir=os.path.join(base_dir,'test')
validation_dir=os.path.join(base_dir,'validation')

conv_base=VGG16(weights='imagenet',
                include_top=False,
                input_shape=(224,224,3))
conv_base.trainable=False #Freeze VGG net

#Adding a densely connected classifier on top of the convolutional base
model=models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256,activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1,activation='sigmoid'))

#Training the model end to end with a frozen convolutional base
#train_datagen=ImageDataGenerator(
#        rescale=1./255,
#        rotation_range=40,
#        width_shift_range=0.2,
#        height_shift_range=0.2,
#        shear_range=0.2,
#        zoom_range=0.2,
#        horizontal_flip=True,
#        fill_mode='nearest')
train_datagen=ImageDataGenerator(rescale=1./255)
test_datagen=ImageDataGenerator(rescale=1./255)

train_generator=train_datagen.flow_from_directory(
        train_dir,
        target_size=(224,224),
        batch_size=20,
        class_mode='binary')
validation_generator=test_datagen.flow_from_directory(
        validation_dir,
        target_size=(224,224),
        batch_size=20,
        class_mode='binary')

model.compile(loss='binary_crossentropy',
        optimizer=optimizers.RMSprop(lr=2e-5),
        metrics=['acc'])

history=model.fit_generator(
        train_generator,
        steps_per_epoch=100,
        epochs=100,
        validation_data=validation_generator,
        validation_steps=50)

model.save('cats_and_dogs_small_4_big.h5')

import matplotlib.pyplot as plt
acc=history.history['acc']
val_acc=history.history['val_acc']
loss=history.history['loss']
val_loss=history.history['val_loss']

epochs=range(1,len(acc)+1)

plt.plot(epochs,acc,'bo',label='Training acc')
plt.plot(epochs,val_acc,'b',label='Validation acc')
plt.title('Training and Validation acc')
plt.legend()

plt.figure()

plt.plot(epochs,loss,'go',label='Training loss')
plt.plot(epochs,val_loss,'g',label='Validation loss')
plt.title('Training and Validation loss')
plt.legend()

plt.show()

Extract Feature Vector

We can also use VGG16 to extract the feature vectors for all 25000 images first, and then train some fully connected layers whose input is the extracted feature vectors. This approach is functionally the same as transfer learning, but training is much faster, because the convolutional base runs over each image only once instead of once per epoch, so I recommend this way to train the model.

Let's see what the accuracy will be when a dropout layer is included:
[Figure: training result of the feature-vector model with a dropout layer]
It looks very good, so let's check the actual accuracy over the 25000 images:
[Figure: measured accuracy over the 25000 images]
The average accuracy is almost 0.97, which means the model is reliable, so we can use this trained model, cats_and_dogs_small_3_big.h5, as our final model.

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
from keras import models
from keras import layers
from keras import optimizers
from keras.applications import VGG16
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
import numpy as np

conv_base=VGG16(weights='imagenet',
                include_top=False,
                input_shape=(224,224,3))

conv_base.summary()

base_dir='/Users/cenjun/Desktop/Sensor/catsanddogs_big'

train_dir=os.path.join(base_dir,'train')
#test_dir=os.path.join(base_dir,'test')
validation_dir=os.path.join(base_dir,'validation')

datagen=ImageDataGenerator(rescale=1./255)
batch_size=20

#Extracting features using the pretrained convolutional network
def extract_feature(directory,sample_count):
    features=np.zeros(shape=(sample_count,7,7,512))
    labels=np.zeros(shape=(sample_count))
    generator=datagen.flow_from_directory(
        directory,
        target_size=(224,224),
        batch_size=batch_size,
        class_mode='binary')
    i=0
    for inputs_batch,labels_batch in generator:
        features_batch=conv_base.predict(inputs_batch)
        features[i*batch_size:(i+1)*batch_size]=features_batch
        labels[i*batch_size:(i+1)*batch_size]=labels_batch
        i+=1
        if i*batch_size >= sample_count:
            break
    return features,labels

train_features,train_labels=extract_feature(train_dir,25000)
validation_features,validation_labels=extract_feature(validation_dir,25000)
#test_features,test_labels=extract_feature(test_dir,2000)

train_features=np.reshape(train_features,(25000,7*7*512))
validation_features=np.reshape(validation_features,(25000,7*7*512))
#test_features=np.reshape(test_features,(2000,7*7*512))

#Defining and training the densely connected classifier
model=models.Sequential()
model.add(layers.Dense(256,activation='relu',input_dim=7*7*512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1,activation='sigmoid'))

model.compile(optimizer=optimizers.RMSprop(lr=2e-5),
                loss='binary_crossentropy',
                metrics=['acc'])

reduce_lr = ReduceLROnPlateau(monitor='val_loss', patience=5, mode='auto')

history=model.fit(train_features,train_labels,
                epochs=50,
                batch_size=20,
                validation_data=(validation_features,validation_labels),
                callbacks=[reduce_lr])
                
model.save('cats_and_dogs_small_3_big.h5')

#Plotting the results
#import matplotlib
#matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
acc=history.history['acc']
val_acc=history.history['val_acc']
loss=history.history['loss']
val_loss=history.history['val_loss']

epochs=range(1,len(acc)+1)

plt.plot(epochs,acc,'bo',label='Training acc')
plt.plot(epochs,val_acc,'b',label='Validation acc')
plt.title('Training and Validation acc')
plt.legend()

plt.figure()

plt.plot(epochs,loss,'go',label='Training loss')
plt.plot(epochs,val_loss,'g',label='Validation loss')
plt.title('Training and Validation loss')
plt.legend()

plt.show()

As this model will be my final model, I also put the test code here. The test procedure is a little different from the others. For the other models we can simply call the model.predict() function to get the result, but here, because we extracted features with VGG16 and trained only the fully connected layers, the test code must work the same way: first extract the features with VGG16, then use the fully connected classifier to decide dog or cat.

import matplotlib.pyplot as plt
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import numpy as np
from PIL import Image
from keras.applications import VGG16
from keras.models import load_model

model = load_model('cats_and_dogs_small_3_big.h5')
model.summary()

conv_base=VGG16(weights='imagenet',
                include_top=False,
                input_shape=(224,224,3))

def get_one_image(train):
    files = os.listdir(train)
    n = len(files)
    ind = np.random.randint(0,n)
    img_dir = os.path.join(train,files[ind])
    name_test = img_dir.split(sep='.')
    if 'cat' in name_test[0]:
        animal=0
    elif 'dog' in name_test[0]:
        animal=1
    else:
        animal=1  #fallback so animal is always defined; the folder holds only cat/dog files
    image = Image.open(img_dir)
#    plt.imshow(image)
#    plt.show()
    image = image.resize([224, 224])
    image = np.array(image)
    return image,animal




predict_dir = '/Users/cenjun/Desktop/Sensor/data/train'


right=0
totalnum=1000
for i in range(totalnum):
    image,mark=get_one_image(predict_dir)
    y=image.astype('float32')/255.0      #match the rescale=1./255 used during training
    y=y.reshape((1,)+y.shape)            #add batch dimension: (1,224,224,3)
    features=conv_base.predict(y)        #extract features with the VGG16 base
    features=np.reshape(features,(1,7*7*512))
    #predict with the trained dense classifier
    pre_res=model.predict(features)
    if pre_res>=0.5:
        pre_res=1
    else:
        pre_res=0
    print(mark)
    print(pre_res)
    if mark==pre_res:
        right+=1
print(right/totalnum)

Fine-tune

This is not like the two approaches above. Fine-tuning means unfreezing some layers of VGG16 and then training them together with the newly added fully connected layers. For example, we can unfreeze the last convolutional block (the code below unfreezes everything from block5_conv1 onward); its final layer, block5_conv3, outputs 14*14 feature maps with a thickness of 512 and holds 2,359,808 parameters. In theory the fine-tuned VGG16 then has a stronger ability to extract features from our particular images, but training takes even more time. Let's see the result:
[Figure: training and validation accuracy during fine-tuning]
The real accuracy:
[Figure: measured accuracy of the fine-tuned model]
So with fine-tuning the accuracy of our model actually became lower. Since the training and validation accuracy are both close to 100%, the model is also overfitting. The influence of a dropout layer is small when the training set is not small (we have 25000 images), and data augmentation would cost more time than we can stand, so we choose the transfer learning and feature-vector extraction results as our final model. Here is the fine-tuning code.

'''
Fine-tuning
    1.Add your custom network on top of an already-trained base network
    2.Freeze the base network
    3.Train the part you added
    4.Unfreeze some layers in the base network
    5.Jointly train both these layers and the part you added
'''
from keras import models
from keras import layers
from keras.applications import VGG16
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
import os
from keras.callbacks import ReduceLROnPlateau

base_dir='/Users/cenjun/Desktop/Sensor/catsanddogs_big'

train_dir=os.path.join(base_dir,'train')
#test_dir=os.path.join(base_dir,'test')
validation_dir=os.path.join(base_dir,'validation')

conv_base=VGG16(weights='imagenet',
                include_top=False,
                input_shape=(224,224,3))
conv_base.trainable=False #Freeze VGG net

#Adding a densely connected classifier on top of the convolutional base
model=models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256,activation='relu'))
model.add(layers.Dense(1,activation='sigmoid'))

#Training the model end to end with a frozen convolutional base
#train_datagen=ImageDataGenerator(
#        rescale=1./255,
#        rotation_range=40,
#        width_shift_range=0.2,
#        height_shift_range=0.2,
#        shear_range=0.2,
#        zoom_range=0.2,
#        horizontal_flip=True,
#        fill_mode='nearest')
train_datagen=ImageDataGenerator(rescale=1./255)
test_datagen=ImageDataGenerator(rescale=1./255)

train_generator=train_datagen.flow_from_directory(
        train_dir,
        target_size=(224,224),
        batch_size=20,
        class_mode='binary')
validation_generator=test_datagen.flow_from_directory(
        validation_dir,
        target_size=(224,224),
        batch_size=20,
        class_mode='binary')

model.compile(loss='binary_crossentropy',
        optimizer=optimizers.RMSprop(lr=2e-5),
        metrics=['acc'])
        
reduce_lr = ReduceLROnPlateau(monitor='val_loss', patience=5, mode='auto')

history=model.fit_generator(
        train_generator,
        steps_per_epoch=100,
        epochs=30,
        validation_data=validation_generator,
        validation_steps=50,
        callbacks=[reduce_lr])


'''PART TWO'''
#Freezing all layers up to a specific one
conv_base.trainable=True

set_trainable=False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable=True
    if set_trainable:
        layer.trainable=True
    else:
        layer.trainable=False

#Fine-tuning the model
model.compile(loss='binary_crossentropy',
                optimizer=optimizers.RMSprop(lr=1e-5),
                metrics=['acc'])
history=model.fit_generator(
                train_generator,
                steps_per_epoch=100,
                epochs=100,
                validation_data=validation_generator,
                validation_steps=50)

#Saving the model
model.save('cats_and_dogs_small_5_224.h5')

import matplotlib.pyplot as plt
#Display curves of loss and accuracy during training
acc=history.history['acc']
val_acc=history.history['val_acc']
loss=history.history['loss']
val_loss=history.history['val_loss']

epochs=range(1,len(acc)+1)

plt.plot(epochs,acc,'bo',label='Training acc')
plt.plot(epochs,val_acc,'b',label='Validation acc')
plt.title('Training and Validation acc')
plt.legend()

plt.figure()

plt.plot(epochs,loss,'go',label='Training loss')
plt.plot(epochs,val_loss,'g',label='Validation loss')
plt.title('Training and Validation loss')
plt.legend()

plt.show()
