Transfer Learning with Pre-trained Models in TensorFlow and Keras
1. What are transfer learning and pre-trained models?
Transfer learning: a neural network can learn knowledge from one task and apply that knowledge to a separate task. For example, you might have trained a network to recognize objects such as cats, and then use that knowledge, or part of it, to help you read X-ray scans better; this is what is called transfer learning.
Pre-training: if you have a small dataset, you train only the last layer before the output, or perhaps the last one or two layers. But if you have a lot of data, you can retrain all of the parameters in the network. If you retrain all of the parameters, the initial training phase on the image-recognition data is sometimes called pre-training, and the model saved from it is a pre-trained model.
**Fine tuning:** when we start from the weights of a pre-trained network and then update all of the weights afterwards, that process is called fine tuning.
2. Ways to implement transfer learning and when to use them
(1) Feature extraction:
We can use the pre-trained model as a feature extractor. Concretely: remove the output layer (or the fully connected layers), treat the remaining network as a fixed feature extractor, and feed its output into our own model.
**When to use:** small dataset with high similarity to the pre-training data. Because the data closely resembles what the pre-trained model was trained on, we do not need to retrain it; we simply use the pre-trained model as a feature extractor.
(2) Train specific layers and freeze the rest:
Concretely: keep the weights of the first few layers of the model unchanged, and retrain the later layers to obtain new weights.
**When to use:** small dataset with low similarity to the pre-training data. In this case we freeze the weights of the first k layers of the pre-trained model and retrain the remaining n-k layers; the last layer also has to be modified to match the required output format.
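The freeze-and-retrain idea can be sketched framework-free. Below is a minimal numpy illustration (all names are made up for this sketch, they are not from the code later in this post): a two-layer linear model where the first layer plays the role of the pre-trained, frozen layers and only the second layer receives gradient updates.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" first layer (frozen) and a freshly initialized head (trainable).
W1 = rng.normal(size=(4, 8))        # frozen pre-trained weights
W2 = rng.normal(size=(8, 3)) * 0.1  # trainable head

X = rng.normal(size=(32, 4))
Y = rng.normal(size=(32, 3))
W1_before = W1.copy()

def mse():
    return np.mean((X @ W1 @ W2 - Y) ** 2)

loss_start = mse()
lr = 0.01
for _ in range(200):
    H = X @ W1                             # features from the frozen layer
    grad_W2 = H.T @ (H @ W2 - Y) / len(X)  # gradient w.r.t. the head only
    W2 -= lr * grad_W2                     # W1 is never updated: it is "frozen"
loss_end = mse()

assert np.array_equal(W1, W1_before)  # frozen layer untouched
assert loss_end < loss_start          # the head still learns
```

In a real framework the same effect is achieved by excluding the frozen layers' variables from the optimizer (Keras: `layer.trainable = False`; TensorFlow: the `var_list` argument), as shown later in this post.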
3. Python implementation
We use VGG16 as the pre-trained model here.
3.1 Feature extraction
3.1.1 Keras implementation
The example is handwritten-digit recognition on the MNIST dataset.
(1) Feature extraction:
def extract_features(input_x):
    m = input_x.shape[0]
    # include_top=False drops the fully connected layers at the top of VGG16
    model = VGG16(weights='imagenet', include_top=False)
    # predict outputs the features extracted by the convolutional base
    features_train = model.predict(input_x)
    train_x = features_train.reshape(m, -1)
    return train_x
Here we load the VGG16 model through Keras. The on-the-fly download can be slow, so you can download the weights in advance and place them in the Keras cache directory, e.g. C:\Users\Administrator\.keras\models (the exact path varies by machine).
Download address: https://github.com/fchollet/deep-learning-models/releases
For comparison, here is the network with and without the fully connected layers:
from keras.applications.vgg16 import VGG16
# without the fully connected layers
model = VGG16(weights='imagenet', include_top=False)
model.summary()
# the complete VGG16 network
model = VGG16(weights='imagenet')
model.summary()
The output is too long to include here.
(2) Build our own network, taking the output of the VGG16 convolutional layers above as its input.
Note that when a fully connected layer is the first layer, the input dimension input_dim must be specified; it is determined by the output dimension of the feature-extraction step.
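The right input_dim is easy to compute by hand. VGG16's convolutional base contains five 2x2 max-pooling stages, each halving the spatial dimensions, and ends with 512 channels. The helper below (an illustrative sketch; the function name is ours, not from the code) computes the flattened feature size for a given input resolution:

```python
def vgg16_feature_dim(h, w):
    """Flattened output size of VGG16's conv base (include_top=False):
    each of the 5 max-pool stages halves the spatial dimensions."""
    return (h // 32) * (w // 32) * 512

print(vgg16_feature_dim(64, 64))    # 2 * 2 * 512 = 2048
print(vgg16_feature_dim(96, 96))    # 3 * 3 * 512 = 4608
print(vgg16_feature_dim(224, 224))  # 7 * 7 * 512 = 25088
```

So for the 64x64 MNIST images used below, input_dim should be 2048.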
def my_model(num_classes=10):
    model = Sequential()
    # input_dim matches the flattened feature size from extract_features
    model.add(Dense(1000, input_dim=2048, activation='relu', kernel_initializer='uniform'))
    model.add(Dropout(0.3))
    model.add(Dense(500, activation='sigmoid'))
    model.add(Dropout(0.4))
    model.add(Dense(150, activation='sigmoid'))
    model.add(Dropout(0.2))
    model.add(Dense(units=num_classes))
    model.add(Activation('softmax'))
    return model
Full code:
from keras.models import Sequential, load_model
import matplotlib.pyplot as plt
import numpy as np
import keras
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.layers import Dense, Activation, Dropout, Flatten, Input
from sklearn.model_selection import train_test_split
import cv2
def convert_to_one_hot(Y, C):
    # one-hot encode integer labels
    return np.eye(C)[Y.reshape(-1)]
def my_model(num_classes=10):
    model = Sequential()
    # 64x64 inputs give 2x2x512 = 2048 features after the VGG16 conv base
    model.add(Dense(1000, input_dim=2048, activation='relu'))
    model.add(Dropout(0.3))
    model.add(Dense(500, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(150, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(units=num_classes))
    model.add(Activation('softmax'))
    return model
def extract_features(input_x):
    # include_top=False drops the fully connected layers at the top
    m = input_x.shape[0]
    model = VGG16(weights='imagenet', include_top=False)
    features_train = model.predict(input_x)
    train_x = features_train.reshape(m, -1)
    return train_x
def predict(input_x):
    # input_x: one or more 64x64 RGB images
    input_x = input_x.reshape(-1, 64, 64, 3)
    input_x = extract_features(input_x)
    model = load_model("model/my_model.h5")
    preds_prob = model.predict(input_x)
    preds = np.argmax(preds_prob, 1)[0]
    print("Predicted digit:", preds)
# Load the training set
from keras.datasets import mnist
(train_x, train_y), (test_x, test_y) = mnist.load_data()
# Convert single-channel grayscale images to 3-channel RGB
train_x = np.array(train_x, dtype="uint8")
train_x = [cv2.cvtColor(cv2.resize(x, (64, 64)), cv2.COLOR_GRAY2BGR) for x in train_x]
train_x = np.concatenate([arr[np.newaxis] for arr in train_x]).astype('float32')
train_y = train_y.reshape(len(train_y), 1).astype(int)
train_y = convert_to_one_hot(train_y, 10)
# Extract features
train_x = extract_features(train_x)
print(train_x.shape)
# Train the model
model = my_model(num_classes=10)
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
model.fit(train_x, train_y, epochs=20, batch_size=64)
model.save("model/my_model.h5")
We enlarge the images to 64x64 here because 28x28 is too small: after passing through all of the pooling layers the spatial dimensions would shrink below 1.
Training output (a portion omitted here, it is long):
Epoch 16/20
1000/1000 [==============================] - 0s 83us/step - loss: 0.0739 - acc: 0.9800
Epoch 17/20
1000/1000 [==============================] - 0s 81us/step - loss: 0.0712 - acc: 0.9790
Epoch 18/20
1000/1000 [==============================] - 0s 83us/step - loss: 0.0698 - acc: 0.9820
Epoch 19/20
1000/1000 [==============================] - 0s 93us/step - loss: 0.0632 - acc: 0.9870
Epoch 20/20
1000/1000 [==============================] - 0s 90us/step - loss: 0.0514 - acc: 0.9860
3.1.2 TensorFlow implementation
Overall the Keras implementation is more concise, since it saves us from writing the forward-propagation code; we only need to assemble the network. The TensorFlow version takes more code, but both frameworks are very popular, so it is worth being comfortable with each.
First we need to download the VGG16 weight file, because TensorFlow does not offer a direct loading method the way Keras does.
Pre-trained model download link: (https://github.com/tensorflow/models/tree/1af55e018eebce03fb61bba9959a04672536107d/research/slim)
(1) First build the VGG16 network structure ourselves. There is no .meta file (the graph definition) for VGG16; the download contains only the .ckpt weight file. The exact structure can be checked against the source code at the link above.
def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        return net
Note 1:
tf.contrib.slim helps us build network structures more quickly. The VGG16 network above contains runs of consecutive convolutional layers; written out the ordinary way the code would be full of repetition, whereas slim keeps it concise.
For example, in net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1'), the first argument is the input, the second is the number of repetitions, the third is the layer function to repeat, the fourth and fifth are the number and size of the convolution kernels in each repeated layer, and scope names the variable scope of these layers.
Note 2:
How does the network we build pick up the matching weights from the checkpoint file? By variable name. The names can be checked against the source code at the link above; this is exactly why we specify scope when defining the model: so that the names match the corresponding weights in the checkpoint file.
slim.arg_scope sets default hyperparameters (activation_fn, etc.) for the target functions (here slim.conv2d and slim.fully_connected). Since the convolutional layers share many hyperparameters, slim.arg_scope removes the repetition; any layer that should not use the defaults simply passes its own arguments.
(2) Feature extraction:
def extract_features(inputs):
    input_image = tf.placeholder(tf.float32, shape=[None, None, None, 3], name='input_image')
    with tf.variable_scope('vgg_16', reuse=tf.AUTO_REUSE):
        model = vgg16(input_image)
    variable_restore_op = slim.assign_from_checkpoint_fn("pretrain_models/vgg_16.ckpt",
                                                         slim.get_trainable_variables(),
                                                         ignore_missing_vars=True)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        variable_restore_op(sess)
        feature_x = sess.run(model, feed_dict={input_image: inputs})
    return feature_x
This loads the weights of the pre-trained model. The first argument of slim.assign_from_checkpoint_fn is the path of the weight file; the second is the list of variables into which the checkpoint weights should be restored; setting the third argument, ignore_missing_vars, to True means that variables present in our model definition but absent from the pre-trained checkpoint are silently skipped.
We then run the graph in the session on the incoming dataset, producing the features extracted by the convolutional layers.
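This restore-by-name behaviour, including ignore_missing_vars, can be mimicked with plain dictionaries. The sketch below is illustrative only (the function and variable names are ours, not the slim implementation):

```python
import numpy as np

def restore_from_checkpoint(model_vars, checkpoint, ignore_missing_vars=False):
    """Copy checkpoint weights into model variables, matched purely by name."""
    for name in model_vars:
        if name in checkpoint:
            model_vars[name] = checkpoint[name].copy()
        elif not ignore_missing_vars:
            raise KeyError("variable %s not found in checkpoint" % name)
    return model_vars

# The model defines conv1 weights plus a brand-new output layer 'fc8/weights'.
model_vars = {"vgg_16/conv1/conv1_1/weights": np.zeros(3),
              "vgg_16/fc8/weights": np.zeros(2)}
ckpt = {"vgg_16/conv1/conv1_1/weights": np.ones(3)}

restored = restore_from_checkpoint(model_vars, ckpt, ignore_missing_vars=True)
# conv1 received the pre-trained values; the missing fc8 kept its initialization.
```

This is why the scope names in our model definition must match the checkpoint exactly: the match is purely by name.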
(3) With the features in hand, we build our own model. This is just the standard procedure for training a model with TensorFlow; we only need to feed the feature-extracted dataset in as the model's input. See the previous post on handwritten-digit recognition with TensorFlow:
def create_placeholders(n_H0, n_W0, n_C0, n_y):
    X = tf.placeholder(tf.float32, [None, n_H0, n_W0, n_C0])
    Y = tf.placeholder(tf.float32, [None, n_y])
    return X, Y
def forward_propagation(inputs):
    a = tf.contrib.layers.flatten(inputs)
    # fc1
    a1 = tf.contrib.layers.fully_connected(a, 4608)
    ad1 = tf.nn.dropout(a1, 0.5)
    # fc2
    a2 = tf.contrib.layers.fully_connected(ad1, 1000)
    ad2 = tf.nn.dropout(a2, 0.5)
    # outputs
    z3 = tf.contrib.layers.fully_connected(ad2, 10, activation_fn=None)
    return z3
def compute_cost(Z3, Y):
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Z3, labels=Y))
    return cost
def my_model(X_train, Y_train, X_test, Y_test, learning_rate=0.0001, num_epochs=30, minibatch_size=64, print_cost=True, isPlot=True):
    tf.reset_default_graph()
    tf.set_random_seed(1)
    seed = 3
    (m, n_H0, n_W0, n_C0) = X_train.shape
    n_y = Y_train.shape[1]
    costs = []
    X, Y = create_placeholders(n_H0, n_W0, n_C0, n_y)
    Z3 = forward_propagation(X)
    cost = compute_cost(Z3, Y)
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()
    total_time = 0
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(1, num_epochs + 1):
            # start time of this epoch
            start_time = time.clock()
            # total cost over all batches in this epoch
            minibatches_cost = 0
            num_minibatches = int(m / minibatch_size)
            # shuffle the data and split it into batches of minibatch_size
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)
            for minibatch in minibatches:
                (minibatch_X, minibatch_Y) = minibatch
                _, temp_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
                minibatches_cost += temp_cost / num_minibatches
            end_time = time.clock()
            total_time += (end_time - start_time)
            if print_cost and epoch % 10 == 0:
                print("Epoch " + str(epoch) + ", cost: " + str(minibatches_cost) + "; time for this epoch: " + str(
                    end_time - start_time) + " s, total over 10 epochs: " + str(total_time))
                total_time = 0
            if epoch % 10 == 0:
                costs.append(minibatches_cost)
        saver.save(sess, "model_tf/my-model")
        if isPlot:
            plt.plot(np.squeeze(costs))
            plt.ylabel("cost")
            plt.xlabel("iterations (per tens)")
            plt.title("Learning rate =" + str(learning_rate))
            plt.show()
        predict_op = tf.argmax(Z3, 1)
        correct_prediction = tf.equal(predict_op, tf.argmax(Y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        train_accuracy = accuracy.eval({X: X_train, Y: Y_train})
        test_accuracy = accuracy.eval({X: X_test, Y: Y_test})
        print("Training accuracy: " + str(train_accuracy))
        print("Test accuracy: " + str(test_accuracy))
        return (train_accuracy, test_accuracy)
3.2 Training specific layers and freezing the rest
3.2.1 Keras implementation
The full code first:
import matplotlib.pyplot as plt
import numpy as np
import keras
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions
from keras.utils.np_utils import to_categorical
from sklearn.preprocessing import LabelEncoder
from keras.models import Sequential
from keras.optimizers import SGD
from keras.layers import Input, Dense, Convolution2D, MaxPooling2D
from sklearn.metrics import log_loss
from keras.models import Model
import cv2
def convert_to_one_hot(Y, C):
    Y = np.eye(C)[Y.reshape(-1)]
    return Y
def vgg16_model(num_classes=None):
    model = VGG16(weights='imagenet', include_top=True)
    # drop the original 1000-way output layer
    model.layers.pop()
    # take the output of the new last layer, since the output layer was popped
    output = model.layers[-1].output
    # add a softmax layer with our number of classes
    x = Dense(num_classes, activation='softmax')(output)
    model = Model(inputs=model.input, outputs=x)
    # freeze the first 8 layers so they are not trained
    for layer in model.layers[:8]:
        layer.trainable = False
    sgd = SGD(lr=1e-3, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
# Load the dataset
from keras.datasets import mnist
(train_x, train_y), (test_x, test_y) = mnist.load_data()
train_x = train_x[:10000]
train_y = train_y[:10000]
train_x = np.array(train_x, dtype="uint8")
# Convert single-channel grayscale images to 3-channel RGB
train_x = [cv2.cvtColor(cv2.resize(x, (224, 224)), cv2.COLOR_GRAY2BGR) for x in train_x]
train_x = np.concatenate([arr[np.newaxis] for arr in train_x]).astype('float32')
train_y = train_y.reshape(len(train_y),1).astype(int)
train_y = convert_to_one_hot(train_y,10)
print("Training set:")
print(train_x.shape)
print(train_y.shape)
model = vgg16_model(num_classes=10)
model.fit(train_x, train_y, batch_size=64, epochs=20, shuffle=True)
# model.save("minist_model/mnist_model.h5")
Unlike in 3.1, here we load the complete VGG16 including the fully connected layers; we only freeze the first 8 layers, and the remaining layers are still initialized with the pre-trained weights.
One point to note:
Using the complete network and freezing some layers is different from feature extraction. With feature extraction the input image size can be anything (as long as the spatial dimensions stay positive through all the convolutional layers). With the full network, the layers we train are initialized from the pre-trained weights, so if our input dimensions differ from those of the pre-trained model, the input will fail to match the weight dimensions when it reaches the fully connected layers.
So here we must resize the images to 224x224x3.
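The mismatch is easy to see numerically (an illustrative sketch, the names are ours): VGG16's fc6 weights were trained on a 7x7x512 = 25088-dimensional flattened input, which only comes out of the conv base when the image is 224x224.

```python
def conv_base_output(h, w):
    # five 2x2 max-pools, each halving the spatial dimensions; 512 final channels
    return (h // 32, w // 32, 512)

FC6_EXPECTED = 7 * 7 * 512  # 25088, fixed by the pre-trained fc6 weights

for size in (224, 96):
    h, w, c = conv_base_output(size, size)
    flat = h * w * c
    print(size, flat, flat == FC6_EXPECTED)
# 224 -> 25088 matches; 96 -> 4608 would break the pre-trained fc layer
```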
3.2.2 TensorFlow implementation
The full code first:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import time
slim = tf.contrib.slim
import os
import h5py
import math
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input
import cv2
import random
def convert_to_one_hot(Y, C):
    Y = np.eye(C)[Y.reshape(-1)]
    return Y
def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    m = X.shape[0]  # number of training examples
    mini_batches = []
    np.random.seed(seed)
    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[permutation, :, :, :]
    shuffled_Y = Y[permutation, :]
    # Step 2: Partition (shuffled_X, shuffled_Y), minus the end case.
    num_complete_minibatches = math.floor(m / mini_batch_size)
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[k * mini_batch_size : (k + 1) * mini_batch_size, :, :, :]
        mini_batch_Y = shuffled_Y[k * mini_batch_size : (k + 1) * mini_batch_size, :]
        mini_batches.append((mini_batch_X, mini_batch_Y))
    # Handle the end case (last mini-batch < mini_batch_size)
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[num_complete_minibatches * mini_batch_size : m, :, :, :]
        mini_batch_Y = shuffled_Y[num_complete_minibatches * mini_batch_size : m, :]
        mini_batches.append((mini_batch_X, mini_batch_Y))
    return mini_batches
def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        # fc6 implemented as a 7x7 convolution, matching the checkpoint layout
        net = slim.conv2d(net, 4096, [7, 7], padding="VALID", scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        # fc7 as a 1x1 convolution
        net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        # new 10-way output layer
        net = slim.conv2d(net, 10, [1, 1], activation_fn=None, scope='fc8')
        net = tf.squeeze(net, [1, 2])
        return net
def load_weights(input_image):
    with tf.variable_scope('vgg_16', reuse=tf.AUTO_REUSE):
        net = vgg16(input_image)
    variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
    length = len(variables)
    # the last 6 variables are the weights and biases of fc6, fc7 and fc8:
    # these are the layers we will train
    parameters = variables[length - 6:]
    # restore every layer except the new output layer (its weights and bias)
    variable_restore_op = slim.assign_from_checkpoint_fn("pretrain_models/vgg_16.ckpt",
                                                         variables[:length - 2],
                                                         ignore_missing_vars=True)
    return net, parameters, variable_restore_op
def create_placeholders(n_H0, n_W0, n_C0, n_y):
    X = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
    Y = tf.placeholder(tf.float32, [None, n_y])
    return X, Y
def compute_cost(Z3, Y):
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Z3, labels=Y))
    return cost
def my_model(X_train, Y_train, X_test, Y_test, learning_rate=0.0001, num_epochs=30, minibatch_size=64, print_cost=True, isPlot=True):
    tf.reset_default_graph()
    tf.set_random_seed(1)
    seed = 3
    (m, n_H0, n_W0, n_C0) = X_train.shape
    n_y = Y_train.shape[1]
    costs = []
    X, Y = create_placeholders(n_H0, n_W0, n_C0, n_y)
    Z, parameters, variable_restore_op = load_weights(X)
    cost = compute_cost(Z, Y)
    # only the weights in var_list are trained; everything else stays frozen
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost, var_list=parameters)
    print("begin training...")
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()
    total_time = 0
    with tf.Session() as sess:
        sess.run(init)
        # restore the pre-trained weights after initialization so they are not overwritten
        variable_restore_op(sess)
        for epoch in range(1, num_epochs + 1):
            # start time of this epoch
            start_time = time.clock()
            # total cost over all batches in this epoch
            minibatches_cost = 0
            num_minibatches = int(m / minibatch_size)
            # shuffle the data and split it into batches of minibatch_size
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)
            for minibatch in minibatches:
                (minibatch_X, minibatch_Y) = minibatch
                _, temp_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
                minibatches_cost += temp_cost / num_minibatches
            end_time = time.clock()
            total_time += (end_time - start_time)
            if print_cost and epoch % 5 == 0:
                print("Epoch " + str(epoch) + ", cost: " + str(minibatches_cost) + "; time for this epoch: " + str(
                    end_time - start_time) + " s, total over 5 epochs: " + str(total_time))
                total_time = 0
            if epoch % 5 == 0:
                costs.append(minibatches_cost)
        saver.save(sess, "model_tf/my-model")
        if isPlot:
            plt.plot(np.squeeze(costs))
            plt.ylabel("cost")
            plt.xlabel("iterations (per tens)")
            plt.title("Learning rate =" + str(learning_rate))
            plt.show()
        predict_op = tf.argmax(Z, 1)
        correct_prediction = tf.equal(predict_op, tf.argmax(Y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        train_accuracy = accuracy.eval({X: X_train, Y: Y_train})
        test_accuracy = accuracy.eval({X: X_test, Y: Y_test})
        print("Training accuracy: " + str(train_accuracy))
        print("Test accuracy: " + str(test_accuracy))
        return (train_accuracy, test_accuracy)
from keras.datasets import mnist
(train_x, train_y), (test_x, test_y) = mnist.load_data()
train_x = train_x[:1000]
train_y = train_y[:1000]
test_x = test_x[:500]
test_y = test_y[:500]
train_x = np.array(train_x, dtype="uint8")
test_x = np.array(test_x, dtype="uint8")
# Convert single-channel grayscale images to 3-channel RGB
train_x = [cv2.cvtColor(cv2.resize(x, (224, 224)), cv2.COLOR_GRAY2BGR) for x in train_x]
train_x = np.concatenate([arr[np.newaxis] for arr in train_x]).astype('float32')
test_x = [cv2.cvtColor(cv2.resize(x, (224, 224)), cv2.COLOR_GRAY2BGR) for x in test_x]
test_x = np.concatenate([arr[np.newaxis] for arr in test_x]).astype('float32')
train_y = train_y.reshape(len(train_y),1).astype(int)
test_y = test_y.reshape(len(test_y),1).astype(int)
train_y = convert_to_one_hot(train_y,10)
test_y = convert_to_one_hot(test_y, 10)
print("Training set:")
print(train_x.shape)
print(train_y.shape)
print("Test set:")
print(test_x.shape)
print(test_y.shape)
# Train the model
my_model(train_x, train_y, test_x, test_y)
(1) Compared with the TensorFlow implementation in 3.1, the main difference is that the VGG16 structure we build here includes the fully connected layers; one change is that the number of neurons in the output layer is set to our number of classes.
(2) When restoring weights with slim.assign_from_checkpoint_fn, we restore every layer except the output layer, because we changed its number of neurons. The variables to restore are therefore variables[:length-2]: the last layer has both a weights and a bias variable, hence minus 2.
(3) Freezing layers: when creating the optimizer, pass only the variables of the layers we want to train as the var_list argument. In this example those are the fully connected layers at the end, which is why load_weights returns their variables.
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost, var_list=parameters)
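The variables[:length-2] slicing can be illustrated with a plain list (the names below are illustrative, not read from the real checkpoint): trainable variables appear in definition order, two per layer (weights, then bias), so dropping the last two excludes exactly the new output layer.

```python
# Trainable variables in definition order, two per layer (weights, bias).
variables = ["conv1/weights", "conv1/biases",
             "fc7/weights", "fc7/biases",
             "fc8/weights", "fc8/biases"]  # fc8 is the new output layer
length = len(variables)

to_restore = variables[:length - 2]  # everything except fc8's two variables
train_list = variables[length - 2:]  # what the optimizer's var_list would train

print(to_restore)  # fc8 is excluded from restoration
print(train_list)  # only fc8 is trained in this simplified picture
```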
Reference blogs:
An introduction to tf.contrib.slim
Using pre-trained models directly in TensorFlow (VGG16 as the example)
Understanding transfer learning: how to tackle deep learning with pre-trained models