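Following the Inception-v3 transfer learning example in the book 《TensorFlow实战Google深度学习框架》, I modified my own code to perform transfer learning.
What is transfer learning
Transfer learning takes a model that has been trained on one problem and, with simple adjustments, adapts it to a new problem. It is typically used when training data is scarce. In general, when enough data is available, transfer learning performs worse than training from scratch. However, it needs far less training time and far fewer training samples than training a complete model.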
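Implementing transfer learning
The book's example keeps the parameters of all convolutional layers in the trained Inception-v3 model and replaces only the final fully connected layer. The layers before that final fully connected layer are called the bottleneck layer. The model used in this post is a 1D convolutional neural network made of four 1D convolutional layers and two fully connected layers; transfer learning is done by replacing its final fully connected layer.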
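To make the idea concrete, here is a minimal NumPy sketch; it is not the code used in this post. A fixed random projection stands in for the pretrained layers, and the synthetic data and all names (`FROZEN_W`, `bottleneck`, and so on) are assumptions made up for illustration. Only the new softmax layer is trained, on top of bottleneck features that are computed once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen, pretrained layers: a fixed random
# projection followed by tanh. These weights are never updated below.
FROZEN_W = rng.normal(size=(8, 16))


def bottleneck(x):
    """Map raw inputs to bottleneck features using the frozen weights."""
    return np.tanh(x @ FROZEN_W)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(features, labels_onehot, weights, bias):
    probs = softmax(features @ weights + bias)
    return float(-np.mean(np.sum(labels_onehot * np.log(probs + 1e-12), axis=1)))


# Synthetic data for the "new" task (made up for illustration).
n_samples, n_classes = 200, 4
x = rng.normal(size=(n_samples, 8))
y = rng.integers(0, n_classes, size=n_samples)
y_onehot = np.eye(n_classes)[y]

# The frozen layers never change, so the features can be computed once.
features = bottleneck(x)

# The new last layer is the only trainable part.
weights = np.zeros((16, n_classes))
bias = np.zeros(n_classes)
frozen_before = FROZEN_W.copy()
loss_before = cross_entropy(features, y_onehot, weights, bias)

learning_rate = 0.1
for _ in range(200):
    probs = softmax(features @ weights + bias)
    grad_logits = (probs - y_onehot) / n_samples  # gradient of the mean loss w.r.t. logits
    weights -= learning_rate * (features.T @ grad_logits)
    bias -= learning_rate * grad_logits.sum(axis=0)

loss_after = cross_entropy(features, y_onehot, weights, bias)
```

Because the gradient only ever touches `weights` and `bias`, the pretrained part stays exactly as it was loaded, which is what keeps this kind of training cheap.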
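Saving the model
We only need forward propagation from the input layer to the final output of the bottleneck layer; auxiliary nodes such as variable initialization and model saving are not needed. TensorFlow provides the convert_variables_to_constants function, which saves the variables in the graph and their values as constants, so that the entire TensorFlow graph can be stored in a single file.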
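As a rough mental model of what freezing does, the toy sketch below works on a plain dictionary instead of a real graph. It is not TensorFlow's implementation; the graph layout, the op labels, and the function name `freeze_graph` are all hypothetical. It keeps only the nodes that the requested output depends on and turns every kept variable into a constant.

```python
def freeze_graph(nodes, variable_values, output_names):
    """Toy freezing: keep the nodes the outputs depend on and
    replace every kept variable with a constant holding its value."""
    # Walk backwards from the requested outputs to collect the needed nodes.
    needed = set()
    stack = list(output_names)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        stack.extend(nodes[name]["inputs"])

    frozen = {}
    for name in nodes:  # iterate in the original node order
        if name not in needed:
            continue
        node = dict(nodes[name])
        if node["op"] == "Variable":
            node = {"op": "Const", "inputs": [], "value": variable_values[name]}
        frozen[name] = node
    return frozen


# Hypothetical graph; every name here is made up for the example.
graph = {
    "inputs": {"op": "Placeholder", "inputs": []},
    "w": {"op": "Variable", "inputs": []},
    "init": {"op": "Init", "inputs": ["w"]},  # auxiliary node
    "save": {"op": "Save", "inputs": ["w"]},  # auxiliary node
    "hidden": {"op": "MatMul", "inputs": ["inputs", "w"]},
    "w_out": {"op": "Variable", "inputs": []},
    "logits": {"op": "MatMul", "inputs": ["hidden", "w_out"]},
}
values = {"w": [[1.0, 2.0]], "w_out": [[0.5], [0.5]]}

# Freeze up to the bottleneck-like node "hidden" only.
frozen = freeze_graph(graph, values, ["hidden"])
```

In this toy graph the last layer (`w_out`, `logits`) and the auxiliary nodes are dropped, which mirrors why a single output node name is enough to describe what should be saved.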
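# Save model
import os
import tensorflow as tf
from tensorflow.python.framework import graph_util

# Export the GraphDef of the current graph; this part alone is enough to compute from the input layer to the output layer
graph_def = tf.get_default_graph().as_graph_def()
# Convert the variables in the graph and their values to constants, keeping the nodes needed for the listed output node
# This post replaces the last fully connected layer, so the node to keep is the final node of the first fully connected layer; replacing more layers works the same way
# Use print(tensor.name) to look up a name; pass the node name here, without the ':0' output suffix
# sess must be the session that holds the trained variables
output_graph_def = graph_util.convert_variables_to_constants(sess, graph_def, ['dropout_1/mul'])
# Write the exported model to a file
with tf.gfile.GFile(os.path.join(model_dir, 'model.pb'), 'wb') as f:
    f.write(output_graph_def.SerializeToString())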
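One detail that is easy to trip over: `print(tensor.name)` shows a tensor name such as `dropout_1/mul:0`, while `convert_variables_to_constants` expects node names such as `dropout_1/mul`. The small helper below is a sketch of that conversion; `node_name_from_tensor_name` is a hypothetical name, not part of TensorFlow.

```python
def node_name_from_tensor_name(tensor_name):
    """Strip the ':<output index>' suffix from a tensor name, if present."""
    name, sep, index = tensor_name.rpartition(":")
    if sep and index.isdigit():
        return name
    return tensor_name


bottleneck_node = node_name_from_tensor_name("dropout_1/mul:0")
already_a_node = node_name_from_tensor_name("dropout_1/mul")
```

The suffixed form is still what `tf.import_graph_def` needs in `return_elements` when tensors, rather than operations, should be returned.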
Transfer learning
# -*- coding:utf-8 -*-
"""
@author: wang ziyang
@file: transfer_learning.py
@time: 2018/8/29
"""
from scipy.io import loadmat as load
import os.path
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import graph_util
from tensorflow.python.platform import gfile
train = load('/home/amax/Conv1d/spindle_data/train_unload_13k.mat')
test = load('/home/amax/Conv1d/spindle_data/test_unload_13k.mat')
trX = train['train_data']
trY = train['train_label']
teX = test['test_data']
teY = test['test_label']
#prepare dataset
trX = np.reshape(trX, [-1, 800, 1])
teX = np.reshape(teX, [-1, 800, 1])
BOTTLENECK_TENSOR_SIZE = 500
BOTTLENECK_TENSOR_NAME = 'dropout_1/mul:0'
DATA_TENSOR_NAME = 'inputs:0'
KEEP_PROB = 'keep_prob:0'
MODEL_DIR = 'model'
MODEL_FILE = 'model.pb'
# hyperparameters
batch_size = 128 # Batch size
seq_len = 800
learning_rate = 0.1
beta1 = 0.9
beta2 = 0.999
epochs = 5
n_classes = 4
n_channels = 1
# load model
with gfile.GFile(os.path.join(MODEL_DIR, MODEL_FILE), 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
# Import the frozen graph and fetch the bottleneck, input, and keep_prob tensors by name
bottleneck_tensor, data_tensor, keep_prob = tf.import_graph_def(
    graph_def,
    return_elements=[BOTTLENECK_TENSOR_NAME, DATA_TENSOR_NAME, KEEP_PROB])
# Rebuild the last layer
bottleneck_input = tf.placeholder(tf.float64, [None, BOTTLENECK_TENSOR_SIZE], name='bottleneck_input')
labels = tf.placeholder(tf.float64, [None, n_classes], name='labels')
weights = tf.Variable(tf.truncated_normal([BOTTLENECK_TENSOR_SIZE, n_classes], stddev= 0.05, dtype=tf.float64))
bias = tf.Variable(tf.constant(0.0, shape=[n_classes], dtype=tf.float64))
logits = tf.add(tf.matmul(bottleneck_input, weights), bias)
# loss
with tf.name_scope("loss"):
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels))
    tf.summary.scalar("loss", cross_entropy)
train_op = tf.train.AdamOptimizer(learning_rate, beta1, beta2, epsilon=1e-08).minimize(cross_entropy)
# calculate accuracy
with tf.name_scope("accuracy"):
    accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(labels, axis=1), tf.argmax(logits, axis=1)), tf.float32))
    tf.summary.scalar("accuracy", accuracy)
with tf.Session() as sess:
    train_writer = tf.summary.FileWriter('logs_TL/train', sess.graph)
    test_writer = tf.summary.FileWriter('logs_TL/test', sess.graph)
    merge = tf.summary.merge_all()
    # Initialize variables
    sess.run(tf.global_variables_initializer())
    for i in range(epochs):
        # Shuffle the training data
        permutation = np.random.permutation(trX.shape[0])
        trX_shuffled = trX[permutation, :, :]
        trY_shuffled = trY[permutation]
        # Compute bottleneck outputs without dropout (keep_prob=1); used to measure accuracy
        train_bottlenecks = sess.run(bottleneck_tensor, feed_dict={data_tensor: trX_shuffled, keep_prob: 1})
        test_bottlenecks = sess.run(bottleneck_tensor, feed_dict={data_tensor: teX, keep_prob: 1})
        # The stop value len(...) + 1 keeps the final full batch; a trailing partial batch is skipped
        for start, end in zip(
                range(0, len(trX_shuffled), batch_size),
                range(batch_size, len(trX_shuffled) + 1, batch_size)):
            # Compute bottleneck outputs with dropout (keep_prob=0.8); used to train the last layer
            train_batch_bottlenecks = sess.run(bottleneck_tensor, feed_dict={data_tensor: trX_shuffled[start:end], keep_prob: 0.8})
            # Train the last layer on the batch bottlenecks computed above
            sess.run(train_op, feed_dict={bottleneck_input: train_batch_bottlenecks, labels: trY_shuffled[start:end]})
        acc, summary = sess.run([accuracy, merge], feed_dict={bottleneck_input: train_bottlenecks, labels: trY_shuffled})
        print("Accuracy rating for epoch " + str(i) + ": " + str(acc))
        train_writer.add_summary(summary, i)
        acc1, summary1 = sess.run([accuracy, merge], feed_dict={bottleneck_input: test_bottlenecks, labels: teY})
        print("Test accuracy for epoch " + str(i) + ": " + str(acc1))
        test_writer.add_summary(summary1, i)
    train_writer.close()
    test_writer.close()
Accuracy rating for epoch 0: 0.925375
Test accuracy for epoch 0: 0.914125
Accuracy rating for epoch 1: 0.93225
Test accuracy for epoch 1: 0.916125
Accuracy rating for epoch 2: 0.935719
Test accuracy for epoch 2: 0.915125
Accuracy rating for epoch 3: 0.938156
Test accuracy for epoch 3: 0.914125
Accuracy rating for epoch 4: 0.933781
Test accuracy for epoch 4: 0.923
Process finished with exit code 0
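One detail of mini-batch loops built by zipping two `range` objects is worth spelling out. If the second range stops at the number of samples, the final full batch is silently dropped whenever the sample count is an exact multiple of the batch size; stopping at the sample count plus one includes it. The sketch below shows both variants; `batch_bounds` is a hypothetical helper name and the sizes are made up for the example.

```python
def batch_bounds(n_samples, batch_size):
    """Return (start, end) index pairs for every full batch."""
    starts = range(0, n_samples, batch_size)
    # Stop at n_samples + 1 so that end == n_samples is still produced.
    ends = range(batch_size, n_samples + 1, batch_size)
    return list(zip(starts, ends))


# Stopping the second range at n_samples loses the batch (128, 256).
short = list(zip(range(0, 256, 128), range(128, 256, 128)))
# With the helper, both full batches are covered.
full = batch_bounds(256, 128)
# A trailing partial batch (samples 256..299 here) is skipped either way.
partial = batch_bounds(300, 128)
```

Skipping the partial batch keeps every training step at the same batch size; slicing with `trX_shuffled[start:start + batch_size]` over `starts` alone would be one way to include it instead.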