A quick complete tutorial to save and restore Tensorflow models

In this Tensorflow tutorial, I shall explain:

  1. What does a Tensorflow model look like?
  2. How to save a Tensorflow model?
  3. How to restore a Tensorflow model for prediction/transfer learning?
  4. How to work with imported pre-trained models for fine-tuning and modification?

This tutorial assumes that you have some idea about training a neural network. Otherwise, please follow this tutorial and come back here.

1. What is a Tensorflow model?

After you have trained a neural network, you would want to save it for future use and for deployment to production. So, what is a Tensorflow model? A Tensorflow model primarily contains the network design (graph) and the values of the network parameters that we have trained. Hence, a Tensorflow model has two main files:

a) Meta graph:

This is a protocol buffer which saves the complete Tensorflow graph, i.e. all variables, operations, collections, etc. This file has a .meta extension.

b) Checkpoint file:

This is a binary file which contains all the values of the weights, biases, gradients and all the other variables saved. This file has an extension .ckpt. However, Tensorflow has changed this from version 0.11. Now, instead of a single .ckpt file, we have two files (for a model saved with the name my_test_model):

my_test_model.data-00000-of-00001
my_test_model.index

The .data file is the one that contains our training variables, and it is the file we shall go after. The .index file simply keeps track of which values are stored in which of the .data shards.

Along with this, Tensorflow also has a file named checkpoint which simply keeps a record of the latest checkpoint files saved.

So, to summarize, Tensorflow models for versions greater than 0.10 consist of four files:

my_test_model.data-00000-of-00001
my_test_model.index
my_test_model.meta
checkpoint

while a Tensorflow model before 0.11 contained only three files:

my_test_model.meta
my_test_model.ckpt
checkpoint
Now that we know what a Tensorflow model looks like, let’s learn how to save the model.

2. Saving a Tensorflow model:

Let’s say you are training a convolutional neural network for image classification. As a standard practice, you keep a watch on the loss and accuracy numbers. Once you see that the network has converged, you can stop the training manually, or run the training for a fixed number of epochs. After the training is done, we want to save all the variables and the network graph to a file for future use. So, in Tensorflow, to save the graph and the values of all the parameters, we create an instance of the tf.train.Saver() class.

saver = tf.train.Saver()

Remember that Tensorflow variables are only alive inside a session. So, you have to save the model inside a session by calling the save method on the saver object you just created.
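
The call itself is just one line:

saver.save(sess, 'my-test-model')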

Here, sess is the session object, while ‘my-test-model’ is the name you want to give your model. Let’s see a complete example:
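
Below is a minimal sketch of such an example; the two variables w1 and w2 simply stand in for the parameters of a real network:

import tensorflow as tf

# Two dummy variables standing in for the trained parameters of a real network
w1 = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2 = tf.Variable(tf.random_normal(shape=[5]), name='w2')

saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Saves the graph (.meta) and the variable values to files prefixed 'my_test_model'
saver.save(sess, 'my_test_model')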

If we are saving the model after 1000 iterations, we shall call save by passing the step count:

saver.save(sess, 'my_test_model', global_step=1000)

This will just append ‘-1000’ to the model name and the following files will be created:

my_test_model-1000.data-00000-of-00001
my_test_model-1000.index
my_test_model-1000.meta
checkpoint

Let’s say that, while training, we are saving our model after every 1000 iterations. The .meta file is created the first time (on the 1000th iteration) and we don’t need to recreate it each time (so we don’t save the .meta file at 2000, 3000 or any other iteration). We only save the model for the later iterations, as the graph will not change. Hence, when we don’t want to write the meta-graph, we use this:
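
The write_meta_graph argument of the save call controls this; the step value here is just illustrative:

# The graph has not changed, so skip re-writing the .meta file on later saves
saver.save(sess, 'my_test_model', global_step=2000, write_meta_graph=False)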

If you want to keep only the 4 latest models, and want to save one model after every 2 hours of training, you can use max_to_keep and keep_checkpoint_every_n_hours like this:
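
Both are arguments to the tf.train.Saver constructor:

# Keep only the 4 most recent checkpoints, plus one checkpoint for every 2 hours of training
saver = tf.train.Saver(max_to_keep=4, keep_checkpoint_every_n_hours=2)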

 

Note that if we don’t specify anything in tf.train.Saver(), it saves all the variables. What if we don’t want to save all the variables, but only some of them? We can specify the variables/collections we want to save: while creating the tf.train.Saver instance, we pass it a list or a dictionary of the variables we want to save. Let’s look at an example:
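
A minimal sketch, with w1 and w2 again standing in for real variables:

import tensorflow as tf

w1 = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2 = tf.Variable(tf.random_normal(shape=[5]), name='w2')

# Pass an explicit list (or a dict mapping names to variables) so only these are saved
saver = tf.train.Saver([w1, w2])
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.save(sess, 'my_test_model', global_step=1000)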

This can be used to save specific parts of the Tensorflow graph when required.

3. Importing a pre-trained model:

If you want to use someone else’s pre-trained model for fine-tuning, there are two things you need to do:

a) Create the network:

You can create the network by writing python code to create each and every layer manually, just like the original model. However, if you think about it, we saved the network in the .meta file, which we can use to recreate the network using the tf.train.import_meta_graph() function, like this:

saver = tf.train.import_meta_graph('my_test_model-1000.meta')

Remember, import_meta_graph appends the network defined in the .meta file to the current graph. So, this will create the graph/network for you, but we still need to load the values of the parameters that we had trained on this graph.

b) Load the parameters:

We can restore the parameters of the network by calling restore on this saver, which is an instance of the tf.train.Saver() class.
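
For example, assuming the checkpoint files from section 2 are in the current directory:

with tf.Session() as sess:
    new_saver = tf.train.import_meta_graph('my_test_model-1000.meta')
    # latest_checkpoint reads the 'checkpoint' file to find the most recent save
    new_saver.restore(sess, tf.train.latest_checkpoint('./'))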

After this, the values of tensors like w1 and w2 have been restored and can be accessed:
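
For instance (note the ':0' suffix that Tensorflow appends to tensor names):

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('my_test_model-1000.meta')
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    print(sess.run('w1:0'))   # prints the saved value of w1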

So, now you have understood how saving and importing works for a Tensorflow model. In the next section, I describe a practical usage of the above to load any pre-trained model.

4. Working with restored models

Now that you have understood how to save and restore Tensorflow models, let’s develop a practical guide to restore any pre-trained model and use it for prediction, fine-tuning or further training. Whenever you are working with Tensorflow, you define a graph which is fed examples (training data) and some hyperparameters like the learning rate, global step, etc. It’s a standard practice to feed all the training data and hyperparameters using placeholders. Let’s build a small network using placeholders and save it. Note that when the network is saved, the values of the placeholders are not saved.
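
Here is a small sketch of such a network; the names w1, w2, bias and op_to_restore are just illustrative choices that we will refer to again when restoring:

import tensorflow as tf

# Placeholders for the inputs; their values are fed at run time and are not saved
w1 = tf.placeholder("float", name="w1")
w2 = tf.placeholder("float", name="w2")
b1 = tf.Variable(2.0, name="bias")

# The operation we will want to run again after restoring, hence the explicit name
w3 = tf.add(w1, w2)
w4 = tf.multiply(w3, b1, name="op_to_restore")

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Run the graph once with some example inputs
feed_dict = {w1: 4, w2: 8}
print(sess.run(w4, feed_dict))        # prints 24.0, i.e. (4 + 8) * 2

# Save the graph and the variable values; the placeholder values are not saved
saver = tf.train.Saver()
saver.save(sess, 'my_test_model', global_step=1000)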

Now, when we want to restore it, we not only have to restore the graph and weights, but also prepare a new feed_dict that will feed the new training data to the network. We can get a reference to these saved operations and placeholder variables via the graph.get_tensor_by_name() method.

If you just want to run the same network with different data, you can simply pass the new data to the network via feed_dict, as in the sketch below.
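
This sketch restores the network built above and runs it on new inputs; the tensor names match the ones chosen when the graph was defined:

import tensorflow as tf

sess = tf.Session()

# Recreate the network from the meta graph and load the trained parameters
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

# Get handles to the saved placeholders and operations by name
graph = tf.get_default_graph()
w1 = graph.get_tensor_by_name("w1:0")
w2 = graph.get_tensor_by_name("w2:0")
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")

# Feed new data through the restored network
feed_dict = {w1: 13.0, w2: 17.0}
print(sess.run(op_to_restore, feed_dict))   # prints 60.0, i.e. (13 + 17) * 2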

What if you want to add more operations to the graph by adding more layers and then train it? Of course, you can do that too. See here:
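
This sketch restores the same network and stacks a new operation on top of the restored op:

import tensorflow as tf

sess = tf.Session()
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

graph = tf.get_default_graph()
w1 = graph.get_tensor_by_name("w1:0")
w2 = graph.get_tensor_by_name("w2:0")
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")

# Build new operations on top of the restored graph
add_on_op = tf.multiply(op_to_restore, 2)

feed_dict = {w1: 13.0, w2: 17.0}
print(sess.run(add_on_op, feed_dict))   # prints 120.0, i.e. ((13 + 17) * 2) * 2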

But can you restore only part of the old graph and add on to it for fine-tuning? Of course you can. Just access the appropriate operation via the graph.get_tensor_by_name() method and build your graph on top of it. Here is a real-world example: we load a VGG pre-trained network using its meta graph and change the number of outputs to 2 in the last layer for fine-tuning with new data.
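
The sketch below assumes the meta graph was saved as 'vgg.meta' and that the penultimate layer's output is exposed as a tensor named 'fc7:0'; both names are hypothetical and will differ for the checkpoint you actually use:

import tensorflow as tf

saver = tf.train.import_meta_graph('vgg.meta')
graph = tf.get_default_graph()

# Grab the output of the layer we want to build on (tensor name is hypothetical)
fc7 = graph.get_tensor_by_name('fc7:0')
# Stop gradients here so the pre-trained layers below are not updated during fine-tuning
fc7 = tf.stop_gradient(fc7)
fc7_shape = fc7.get_shape().as_list()

# New final layer with 2 outputs for the fine-tuning task
num_outputs = 2
weights = tf.Variable(tf.truncated_normal([fc7_shape[-1], num_outputs], stddev=0.05))
biases = tf.Variable(tf.constant(0.05, shape=[num_outputs]))
output = tf.matmul(fc7, weights) + biases
pred = tf.nn.softmax(output)

# From here, define a loss on 'output', an optimizer, and train with the new data;
# only the new weights and biases will be updated.

As before, the actual parameter values are loaded by calling restore on the saver inside a session before training the new layer.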

Hopefully, this gives you a very clear understanding of how Tensorflow models are saved and restored. Please feel free to share your questions or doubts in the comments section.

Original article: http://cv-tricks.com/tensorflow-tutorial/save-restore-tensorflow-models-quick-complete-tutorial/
