The purpose of model persistence is to make the results of training reusable; saving a model and restoring it are the two main tasks of persistence.
The tf.train.Saver class is the API for saving and restoring models.
Code for a persistence example that adds two vectors:
import tensorflow as tf

# Declare two variables and compute their sum
a = tf.Variable(tf.constant([1.0, 2.0], shape=[2]), name="a")
b = tf.Variable(tf.constant([3.0, 4.0], shape=[2]), name="b")
result = a + b

# Define the operation that initializes all variables
init_op = tf.global_variables_initializer()
# Create a Saver object for saving the model
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init_op)
    # Save the model under the model directory with the file-name
    # prefix model.ckpt, which serves as the name of the model
    saver.save(sess, "model/model.ckpt")
    # The prototype of save() is
    # save(self, sess, save_path, global_step, latest_filename,
    #      meta_graph_suffix, write_meta_graph, write_state)
Files generated under the model directory after persistence:
checkpoint is a text file that records the list of model files in the directory. The other three are binary files: the .data file stores the value of each variable in the TensorFlow program, the .index file stores the variable names, and the .meta file stores the structure of the computation graph.
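As an aside, the contents of a checkpoint can be inspected programmatically. Here is a minimal sketch using tf.train.NewCheckpointReader, assuming the model/model.ckpt checkpoint saved above:

import tensorflow as tf

# Open the checkpoint saved above and list the variables it contains
reader = tf.train.NewCheckpointReader("model/model.ckpt")
# Maps each variable name to its shape, e.g. {'a': [2], 'b': [2]}
print(reader.get_variable_to_shape_map())
# Fetch the stored value of a single variable by name
print(reader.get_tensor("a"))  # expected: [1. 2.]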
Using the restore() function to load an already saved model:
import tensorflow as tf

# Declare two variables and compute their sum
a = tf.Variable(tf.constant([1.0, 2.0], shape=[2]), name="a")
b = tf.Variable(tf.constant([3.0, 4.0], shape=[2]), name="b")
result = a + b

# Create a Saver object for restoring the model
saver = tf.train.Saver()

with tf.Session() as sess:
    # Use restore() to load the already saved model
    saver.restore(sess, "model/model.ckpt")
    print(sess.run(result))
    # Output: [4. 6.]
    # The prototype of restore() is restore(self, sess, save_path)
Sometimes we do not want to define the operations on the computation graph all over again, because the process is tedious. The graph itself can also be recovered: the import_meta_graph() function loads an already persisted computation graph directly. Its input argument is the path to the .meta file, and it returns a Saver instance; calling that instance's restore() function then restores the parameters.
import tensorflow as tf

# The graph-building code is omitted; instead, the persisted graph
# is loaded directly from the .meta file
meta_graph = tf.train.import_meta_graph("model/model.ckpt.meta")

with tf.Session() as sess:
    # Use restore() to load the already saved model
    meta_graph.restore(sess, "model/model.ckpt")
    # Fetch the tensor at the named node of the default computation graph
    print(sess.run(tf.get_default_graph().get_tensor_by_name("add:0")))
    # Output: [4. 6.]
    # The prototype of import_meta_graph() is
    # import_meta_graph(meta_graph_or_file, clear_devices, import_scope, **kwargs)
    # The prototype of get_tensor_by_name() is get_tensor_by_name(self, name)
The .ckpt.meta file stores the structure of the computation graph. The import_meta_graph() function imports that graph into the program, after which restore() loads the values of the variables in the graph within a session.
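When you build the Saver yourself, variables can also be renamed at load time by passing a dictionary that maps names stored in the checkpoint to variables in the current program. A minimal sketch, assuming the model/model.ckpt checkpoint saved above (the names a2 and b2 are made up for illustration):

import tensorflow as tf

# New variables whose names differ from those stored in the checkpoint
a2 = tf.Variable(tf.constant([0.0, 0.0], shape=[2]), name="a2")
b2 = tf.Variable(tf.constant([0.0, 0.0], shape=[2]), name="b2")

# Map the checkpoint names "a" and "b" onto the new variables
saver = tf.train.Saver({"a": a2, "b": b2})
with tf.Session() as sess:
    saver.restore(sess, "model/model.ckpt")
    print(sess.run(a2))  # expected: [1. 2.]

This renaming mechanism is exactly what variables_to_restore() automates for moving-average variables in the MNIST example below.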
Persistent MNIST handwritten digit recognition
The mnist_train.py program:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("/home/jiangziyang/MNIST_data", one_hot=True)

batch_size = 100
learning_rate = 0.8
learning_rate_decay = 0.999
max_steps = 30000

# The definition of forward propagation is changed: obtaining the weight
# and bias parameters is encapsulated in a function
def hidden_layer(input_tensor, regularizer, name):
    # Take time to appreciate how convenient variable scopes are for managing
    # variables. get_variable() creates these variables when training the
    # network, while during testing their values are loaded from the saved
    # model. At test time the moving-average variables can be renamed when
    # they are loaded, so the test process uses the moving averages of the
    # variables.
    with tf.variable_scope("hidden_layer"):
        weights = tf.get_variable("weights", [784, 500],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        # If a regularization method was passed in, add the regularization
        # loss of the weights to the "losses" collection here
        if regularizer is not None:
            tf.add_to_collection("losses", regularizer(weights))
        biases = tf.get_variable("biases", [500], initializer=tf.constant_initializer(0.0))
        hidden_layer = tf.nn.relu(tf.matmul(input_tensor, weights) + biases)
    with tf.variable_scope("hidden_layer_output"):
        weights = tf.get_variable("weights", [500, 10],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection("losses", regularizer(weights))
        biases = tf.get_variable("biases", [10], initializer=tf.constant_initializer(0.0))
        hidden_layer_output = tf.matmul(hidden_layer, weights) + biases
    return hidden_layer_output

# The part defining the inputs and outputs is unchanged
x = tf.placeholder(tf.float32, [None, 784], name="x-input")
y_ = tf.placeholder(tf.float32, [None, 10], name="y-output")

# Defining the L2 regularization method is moved earlier
regularizer = tf.contrib.layers.l2_regularizer(0.0001)
# Pass the L2 regularization method into hidden_layer()
y = hidden_layer(x, regularizer, name="y")

training_step = tf.Variable(0, trainable=False)
averages_class = tf.train.ExponentialMovingAverage(0.99, training_step)
averages_op = averages_class.apply(tf.trainable_variables())

# average_y is no longer defined, because it is only useful when comparing
# accuracies; in this model-saving program we only print the loss
# average_y = hidden_layer(x, averages_class, name="average_y", reuse=True)

cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y,
                                labels=tf.argmax(y_, 1))
# Compute the total loss
loss = tf.reduce_mean(cross_entropy) + tf.add_n(tf.get_collection("losses"))
learning_rate = tf.train.exponential_decay(learning_rate, training_step,
                    mnist.train.num_examples / batch_size, learning_rate_decay)
train_step = tf.train.GradientDescentOptimizer(learning_rate).\
                    minimize(loss, global_step=training_step)

# Alternatively, train_op = tf.group(train_step, averages_op) could be used
with tf.control_dependencies([train_step, averages_op]):
    train_op = tf.no_op(name="train")

# Initialize the Saver persistence class
saver = tf.train.Saver()
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    # Run 30000 training steps
    for i in range(max_steps):
        x_train, y_train = mnist.train.next_batch(batch_size)
        _, loss_value, step = sess.run([train_op, loss, training_step],
                                       feed_dict={x: x_train, y_: y_train})
        # Every 1000 steps, print the loss on the current training batch
        # and save the model once
        if i % 1000 == 0:
            print("After %d training step(s), loss on training batch is "
                  "%g." % (step, loss_value))
            # The global_step argument is given when saving, so every model
            # file name carries a suffix representing the number of training
            # steps; this makes the files easier to look up
            saver.save(sess, "/home/jiangziyang/model/mnist_model/mnist_model.ckpt",
                       global_step=training_step)
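One practical detail not shown above: by default tf.train.Saver keeps only the five most recent checkpoint files and deletes older ones as new ones are written, so saving every 1000 steps does not flood the directory. The max_to_keep argument controls this. A minimal sketch:

# Keep the last 10 checkpoints instead of the default 5
saver = tf.train.Saver(max_to_keep=10)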
The mnist_evaluate.py program:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("/home/jiangziyang/MNIST_data", one_hot=True)

# Define the same forward propagation; the variable scopes and variable
# names must stay consistent with the training program
def hidden_layer(input_tensor, regularizer, name):
    with tf.variable_scope("hidden_layer"):
        weights = tf.get_variable("weights", [784, 500],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection("losses", regularizer(weights))
        biases = tf.get_variable("biases", [500], initializer=tf.constant_initializer(0.0))
        hidden_layer = tf.nn.relu(tf.matmul(input_tensor, weights) + biases)
    with tf.variable_scope("hidden_layer_output"):
        weights = tf.get_variable("weights", [500, 10],
                initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection("losses", regularizer(weights))
        biases = tf.get_variable("biases", [10], initializer=tf.constant_initializer(0.0))
        hidden_layer_output = tf.matmul(hidden_layer, weights) + biases
    return hidden_layer_output

x = tf.placeholder(tf.float32, [None, 784], name="x-input")
y_ = tf.placeholder(tf.float32, [None, 10], name="y-input")
# The regularization loss does not matter during testing, so no
# regularization method is passed in
y = hidden_layer(x, None, name="y")

# Computing the accuracy is essentially the same as the example in Chapter 6
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

variable_averages = tf.train.ExponentialMovingAverage(0.99)
# Load the model through variable renaming, using variables_to_restore()
# provided by the moving-average class; this removes the need to call the
# moving-average function in forward propagation to obtain the averaged values
saver = tf.train.Saver(variable_averages.variables_to_restore())

with tf.Session() as sess:
    validate_feed = {x: mnist.validation.images, y_: mnist.validation.labels}
    test_feed = {x: mnist.test.images, y_: mnist.test.labels}

    # get_checkpoint_state() uses the checkpoint file to automatically find
    # the file name of the latest model in the directory;
    # prototype: get_checkpoint_state(checkpoint_dir, latest_filename)
    ckpt = tf.train.get_checkpoint_state("/home/jiangziyang/model/mnist_model/")
    # Load the model
    saver.restore(sess, ckpt.model_checkpoint_path)
    # Recover the number of training steps at save time from the file name
    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
    print("The latest ckpt is mnist_model.ckpt-%s" % (global_step))
    # Output: The latest ckpt is mnist_model.ckpt-29001

    # Compute and print the accuracy on the validation set
    accuracy_score = sess.run(accuracy, feed_dict=validate_feed)
    print("After %s training step(s), validation accuracy = %g%%"
          % (global_step, accuracy_score * 100))
    # Output: After 29001 training step(s), validation accuracy = 98.62%

    # Compute and print the accuracy on the test set
    test_accuracy = sess.run(accuracy, feed_dict=test_feed)
    print("After %s training step(s), test accuracy = %g%%"
          % (global_step, test_accuracy * 100))
    # Output: After 29001 training step(s), test accuracy = 98.51%
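To see exactly what variables_to_restore() hands to the Saver, you can print the dictionary it builds: it maps the shadow-variable names stored in the checkpoint to the variables of the current graph. A minimal sketch, meant to be dropped into mnist_evaluate.py right after variable_averages is created (the printed names are indicative, not verified output):

renaming = variable_averages.variables_to_restore()
for checkpoint_name in renaming:
    print(checkpoint_name)
# Names like hidden_layer/weights/ExponentialMovingAverage are expected,
# one per trainable variable, alongside non-averaged variables under
# their own names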