Translated from the [official tensorflow/contrib/slim tutorial](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim)
Tensorflow-Slim
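TF-Slim is a lightweight library for defining, training, and evaluating complex models in TensorFlow. Its components can be freely mixed with native TensorFlow as well as with other TensorFlow-related frameworks.
Usage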
import tensorflow.contrib.slim as slim
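Components of TF-Slim
TF-Slim is composed of several parts that were designed to exist independently. The main ones are:
- arg_scope: provides a new scope named arg_scope that lets users define default arguments for specific operations within that scope.
- data: contains TF-Slim's dataset definition, data providers, parallel_reader, and decoding utilities.
- evaluation: contains common routines for evaluating models.
- layers: contains high-level functions for building models with TensorFlow.
- learning: contains common routines for training models.
- losses: contains commonly used loss functions.
- metrics: contains commonly used metrics for evaluating model performance.
- nets: contains popular network definitions such as VGG and AlexNet.
- queues: provides a context manager for starting and closing QueueRunners easily and safely.
- regularizers: contains weight regularizers.
- variables: provides convenience wrappers for variable creation and manipulation.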
Variables
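In native TensorFlow, creating a variable requires either a predefined value or an initialization mechanism (for example, random sampling from a Gaussian). Furthermore, if a variable needs to be created on a specific device, such as a GPU, this must be stated explicitly in the program. (Translator's note: in plain TensorFlow each of these steps takes its own line of code and a different TensorFlow function.) To reduce the code needed for variable creation, TF-Slim provides a set of thin wrapper functions in variables.py that make defining variables easier. For example: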
weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')
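In native TensorFlow there are two types of variables: regular variables and local (transient) variables. The vast majority are regular variables: once created, they can be saved to disk with a saver. Local variables exist only for the duration of a session and are not saved to disk. TF-Slim further differentiates variables by defining model variables, which are the variables that represent the parameters of a model. Non-model variables are all other variables that are needed only during training or evaluation, such as global_step. Similarly, moving average variables may mirror model variables, but the moving averages themselves are not model variables.
Both model variables and regular variables can be easily created and retrieved via TF-Slim: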
# Model Variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()
# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()
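How does this work? When you create a model variable via TF-Slim's layers or directly via the slim.model_variable function, TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. If you have your own custom layers or variable creation routine and still want TF-Slim to track and manage those variables, you can register them with the following function: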
my_model_variable = CreateViaCustomCode()
# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)
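The bookkeeping described above can be pictured with a dependency-free Python sketch. It is not TF-Slim's implementation: the `COLLECTIONS` dict and the three helper functions are hypothetical stand-ins, assuming only that model variables are recorded in a second collection on top of the one that holds every variable.

```python
from collections import defaultdict

# Hypothetical stand-in for the graph's named collections.
COLLECTIONS = defaultdict(list)

def variable(name):
    """Register a regular variable (only in the 'variables' collection)."""
    COLLECTIONS["variables"].append(name)
    return name

def add_model_variable(name):
    """Mark an already created variable as a model variable."""
    if name not in COLLECTIONS["model_variables"]:
        COLLECTIONS["model_variables"].append(name)

def model_variable(name):
    """Register a variable and mark it as a model variable in one call."""
    variable(name)
    add_model_variable(name)
    return name

model_variable("weights")        # a model parameter
variable("global_step")          # needed for training, not part of the model
custom = variable("my_custom")   # created by custom code ...
add_model_variable(custom)       # ... then reported as a model variable

print(COLLECTIONS["model_variables"])  # ['weights', 'my_custom']
print(COLLECTIONS["variables"])        # ['weights', 'global_step', 'my_custom']
```

In this sketch, asking for the model variables is just a lookup in the second list, which is why a variable created by custom code becomes visible as soon as it is registered.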
Layers
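While the set of TensorFlow operations is quite extensive, developers typically think about neural networks in terms of higher-level concepts such as "layers", "losses", "metrics", and "networks". A layer, such as a convolutional layer, a fully connected layer, or a batch-norm layer, is more abstract than a single TensorFlow operation and typically involves several of them. Furthermore, a layer is often associated with variables (tunable parameters), unlike more primitive operations. For example, a convolutional layer in a neural network is composed of several low-level operations:
- Creating the weight and bias variables
- Convolving the weights with the input from the previous layer
- Adding the biases to the result of the convolution
- Applying an activation function
Using only plain TensorFlow code, this is rather laborious to write: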
input = ...
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)
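To avoid duplicating this code over and over, TF-Slim provides convenient operations defined at the more abstract level of neural network layers: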
input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
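To make the four steps of a convolutional layer concrete without any TensorFlow dependency, here is a toy sketch on a 1-D signal. The function `conv1d_layer` and its parameters are hypothetical and only illustrate the order of operations; like most deep-learning libraries it computes a cross-correlation, and it is in no way how slim.conv2d is implemented.

```python
import random

def conv1d_layer(inputs, kernel_size, seed=0):
    """Toy 1-D 'valid' convolution layer following the four steps of a layer."""
    rng = random.Random(seed)
    # 1. Create the weight and bias variables.
    weights = [rng.gauss(0.0, 0.1) for _ in range(kernel_size)]
    bias = 0.0
    outputs = []
    for start in range(len(inputs) - kernel_size + 1):
        window = inputs[start:start + kernel_size]
        # 2. Convolve the weights with the input from the previous layer.
        conv = sum(w * x for w, x in zip(weights, window))
        # 3. Add the bias to the result of the convolution.
        pre_activation = conv + bias
        # 4. Apply the activation function (ReLU here).
        outputs.append(max(0.0, pre_activation))
    return outputs

activations = conv1d_layer([1.0, 2.0, 3.0, 4.0, 5.0], kernel_size=3)
print(len(activations))  # 5 - 3 + 1 = 3 output positions
```

A layer function bundles all four steps behind one call, which is what makes the single-line slim.conv2d version possible.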
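TF-Slim provides standard implementations for many of the components needed to build neural networks, including the slim.conv2d, slim.fully_connected, slim.max_pool2d, and slim.dropout layers used throughout this article.
TF-Slim also provides two meta-operations, repeat and stack, that let users apply the same operation repeatedly. The snippets below show the same three convolutions followed by a pooling layer written out call by call, with a for loop, and with slim.repeat. Note that slim.repeat not only applies the same arguments on every call, it also unrolls the scopes automatically: in the slim.repeat call below, the three convolutions are given the scopes 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3'.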
##############################################
# Option 1: write out every call
##############################################
net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
##############################################
# Option 2: use a for loop
##############################################
net = ...
for i in range(3):
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')
##############################################
# Option 3: use slim.repeat
##############################################
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
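The scope unrolling performed by repeat can be sketched in a few lines of plain Python. This is a hypothetical re-creation of the naming scheme only, not TF-Slim's code: `fake_conv2d` is a made-up layer that simply records the scope it receives.

```python
def fake_conv2d(net, num_outputs, kernel_size, scope):
    """Hypothetical layer: records the scope it was called with."""
    return net + [scope]

def repeat(inputs, repetitions, layer, *args, scope):
    """Apply `layer` `repetitions` times with identical arguments."""
    net = inputs
    for i in range(repetitions):
        # The scope is unrolled as '<scope>/<scope>_<i+1>'.
        net = layer(net, *args, scope="%s/%s_%d" % (scope, scope, i + 1))
    return net

calls = repeat([], 3, fake_conv2d, 256, [3, 3], scope="conv3")
print(calls)  # ['conv3/conv3_1', 'conv3/conv3_2', 'conv3/conv3_3']
```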
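In addition, TF-Slim's slim.stack operator lets a caller apply the same operation repeatedly, with different arguments each time, to create a stack of several layers. slim.stack also creates a new tf.variable_scope for each operation. For example: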
# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')
# Equivalent, TF-Slim way using slim.stack:
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
Likewise, stack can be used to simplify a tower of several convolutions:
# Verbose way:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')
# Using stack:
x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
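How stack turns a list of per-layer arguments into a sequence of calls can be sketched without TensorFlow. The code below is a hypothetical illustration, assuming that a bare value is passed as a single positional argument and a tuple is unpacked into several; `fake_layer` is a made-up layer that records its scope and arguments.

```python
def fake_layer(net, *args, scope):
    """Hypothetical layer: records the scope and arguments it received."""
    return net + [(scope, args)]

def stack(inputs, layer, stack_args, scope):
    """Apply `layer` once per entry of `stack_args`, each in its own scope."""
    net = inputs
    for i, layer_args in enumerate(stack_args):
        # A tuple supplies several positional arguments, a bare value just one.
        if not isinstance(layer_args, (list, tuple)):
            layer_args = (layer_args,)
        net = layer(net, *layer_args, scope="%s/%s_%d" % (scope, scope, i + 1))
    return net

fc = stack([], fake_layer, [32, 64, 128], scope="fc")
core = stack([], fake_layer, [(32, [3, 3]), (32, [1, 1])], scope="core")
print([name for name, _ in fc])  # ['fc/fc_1', 'fc/fc_2', 'fc/fc_3']
print(core[1])                   # ('core/core_2', (32, [1, 1]))
```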
Scopes
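In addition to the scope mechanisms in TensorFlow (name_scope, variable_scope), TF-Slim adds a new scoping mechanism called arg_scope. This scope lets a user specify one or more operations together with a set of arguments that will be passed to each of those operations inside the arg_scope. This is best illustrated with an example: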
net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')
It is clear that these three convolutions share many of the same arguments. With arg_scope the code can be written more concisely:
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], 4, scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')
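As the snippet shows, arg_scope makes the code cleaner, simpler, and easier to maintain. Note that although argument values are specified in the arg_scope, they can still be overridden locally. For instance, the padding argument above defaults to 'SAME', and the second convolution overrides it with 'VALID'.
arg_scopes can also be nested, for example: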
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                          weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                          scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')
The VGG16 model definition further illustrates how arg_scope is used in practice:
def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
    return net
Training Models
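Training a TensorFlow model requires a model, a loss function, the gradient computation, and a training routine that iteratively computes the gradients of the loss with respect to the weights and updates the weights accordingly. TF-Slim provides common loss functions as well as helper functions for running the training and evaluation routines.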
Losses
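For classification problems, the loss function is typically the cross-entropy between the true distribution and the predicted probability distribution across classes. For regression problems, it is often the sum of squared differences between the predicted and true values.
Certain models, such as multi-task learning models, require several loss functions at once. In other words, the loss that is ultimately minimized is the sum of several different loss functions. For example, a model that predicts both the type of scene in an image and the depth from the camera has a loss that is the sum of the classification loss and the depth prediction loss. TF-Slim's losses module provides an easy-to-use mechanism for defining and keeping track of loss functions. For example: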
###############################
# Single loss function
###############################
import tensorflow as tf
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets as nets
vgg = nets.vgg
# Load the images and labels.
images, labels = ...
# Create the model.
predictions, _ = vgg.vgg_16(images)
# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)
#################################################
# Multiple loss functions
#################################################
# Load the images and labels.
images, scene_labels, depth_labels = ...
# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)
# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)
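In the code above, total_loss can be obtained either by adding the individual losses together or by calling slim.losses.get_total_loss(). How does this work? When you create a loss function via TF-Slim, TF-Slim adds the loss to a special TensorFlow collection of loss functions. This lets you either manage the total loss manually or let TF-Slim manage it for you.
If you want TF-Slim to manage a custom loss function of your own, you can do the following: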
# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...
# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)
# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.
# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss
# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()
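The idea of a loss collection can be tried out in plain Python. The sketch below is hypothetical and is not the slim.losses code: `LOSSES`, `add_loss`, `cross_entropy`, `squared_error_sum`, and `get_total_loss` are invented names, the losses are computed for a single example rather than a batch, and regularization losses are left out.

```python
import math

LOSSES = []  # hypothetical stand-in for the graph's collection of losses

def add_loss(value):
    """Record a loss so that it is included in the total."""
    LOSSES.append(value)
    return value

def cross_entropy(logits, onehot_labels):
    """Cross-entropy between the one-hot labels and softmax(logits)."""
    shift = max(logits)  # subtract the max for numerical stability
    log_norm = shift + math.log(sum(math.exp(z - shift) for z in logits))
    loss = -sum(y * (z - log_norm) for z, y in zip(logits, onehot_labels))
    return add_loss(loss)

def squared_error_sum(predictions, targets):
    """Sum of squared differences between predictions and targets."""
    return add_loss(sum((p - t) ** 2 for p, t in zip(predictions, targets)))

def get_total_loss():
    return sum(LOSSES)

classification_loss = cross_entropy([2.0, 0.5, -1.0], [1.0, 0.0, 0.0])
depth_loss = squared_error_sum([1.5, 2.0], [1.0, 2.5])
pose_loss = add_loss(0.25)  # a custom loss computed elsewhere

manual_total = classification_loss + depth_loss + pose_loss
print(depth_loss)                                      # 0.5
print(abs(manual_total - get_total_loss()) < 1e-12)    # True
```

Because every loss function registers its result as a side effect, the caller never has to pass the individual losses around to obtain the total.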
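Training Loop
TF-Slim provides a simple but powerful set of tools for training models, found in learning.py. They include a training function that repeatedly measures the loss, computes gradients, and saves the model to disk, as well as helpers for manipulating gradients. For example, once we have specified the model, the loss function, and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization: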
g = tf.Graph()
# Create the model and specify the losses...
...
total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.
slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)
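In this example, slim.learning.train is given train_op, an operation that computes the loss and applies the gradient step. logdir specifies the directory where checkpoints and event files are stored. number_of_steps limits the number of training steps, save_summaries_secs specifies how often summaries are recorded, and save_interval_secs specifies how often a checkpoint is saved. (Both intervals are given in seconds.)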
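The shape of such a training routine can be sketched on a toy problem. The code below is a hypothetical illustration, not slim.learning.train: the `train` function, `save_every_steps`, and `save_fn` are invented, and checkpoints are taken every fixed number of steps rather than every fixed number of seconds so that the behaviour is deterministic.

```python
def train(train_step, number_of_steps, save_every_steps, save_fn):
    """Run `train_step` repeatedly and checkpoint at a fixed step interval."""
    loss = None
    for step in range(1, number_of_steps + 1):
        loss = train_step()
        if step % save_every_steps == 0:
            save_fn(step, loss)
    return loss

# Toy problem: minimise (w - 3)^2 with plain gradient descent.
state = {"w": 0.0}
learning_rate = 0.1

def train_step():
    grad = 2.0 * (state["w"] - 3.0)      # compute the gradient
    state["w"] -= learning_rate * grad   # apply the update
    return (state["w"] - 3.0) ** 2       # loss after the update

checkpoints = []
final_loss = train(train_step, number_of_steps=100, save_every_steps=25,
                   save_fn=lambda step, loss: checkpoints.append(step))
print(checkpoints)        # [25, 50, 75, 100]
print(final_loss < 1e-6)  # True
```

Here `train_step` plays the role of the train op: a single call both applies the update and reports the loss.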
Working Example: Training the VGG16 Model
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets
slim = tf.contrib.slim
vgg = nets.vgg
...
train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
    tf.gfile.MakeDirs(train_log_dir)
with tf.Graph().as_default():
    # Set up the data loading:
    images, labels = ...
    # Define the model (vgg_16 returns the logits and a dict of end points):
    predictions, _ = vgg.vgg_16(images, is_training=True)
    # Specify the loss function:
    slim.losses.softmax_cross_entropy(predictions, labels)
    total_loss = slim.losses.get_total_loss()
    tf.summary.scalar('losses/total_loss', total_loss)
    # Specify the optimization scheme:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)
    # create_train_op ensures that when we evaluate it to get the loss,
    # the update_ops are done and the gradient updates are computed.
    train_tensor = slim.learning.create_train_op(total_loss, optimizer)
    # Actually runs training.
    slim.learning.train(train_tensor, train_log_dir)
Fine-Tuning Existing Models
Brief Recap on Restoring Variables from a checkpoint
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()
# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Do some work with the model
    ...
For more details, see "Restoring Variables" and "Choosing which Variables to Save and Restore".
Partially Restoring Models
When fine-tuning a pre-trained model, it is usually enough to restore only some of the parameters. In TF-Slim this can be done as follows:
# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...
# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])
# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)
with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Do some work with the model
    ...
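Selecting variables by scope amounts to filtering a list of names. The sketch below is a hypothetical approximation in plain Python, assuming that a scope matches a name when it equals the name or is one of its leading path components; the exact matching rules of the TF-Slim helpers may differ, and `select_variables` is an invented name.

```python
def select_variables(names, include=None, exclude=None):
    """Keep names under an `include` scope and drop those under `exclude`."""
    def in_scope(name, scope):
        return name == scope or name.startswith(scope + "/")

    selected = []
    for name in names:
        if include is not None and not any(in_scope(name, s) for s in include):
            continue
        if exclude is not None and any(in_scope(name, s) for s in exclude):
            continue
        selected.append(name)
    return selected

graph_variables = ["v1", "nested/v2"]
print(select_variables(graph_variables, include=["nested"]))  # ['nested/v2']
print(select_variables(graph_variables, exclude=["v1"]))      # ['nested/v2']
```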
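Restoring Models with Different Variable Names
When restoring variables from a checkpoint, the Saver locates the variable names in the checkpoint file and maps them to the variables in the current graph. In the code above, the Saver was created from a list of variables to restore. In that case, the names to look up in the checkpoint are obtained implicitly from the var.op.name of each variable passed to the Saver.
This only works when the variable names in the checkpoint match those in the graph. Sometimes, however, we want to restore a model whose checkpoint uses different variable names from the current graph. In that case, we must give the Saver a dictionary that maps each variable name in the checkpoint to the corresponding graph variable. For example: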
# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
    return 'vgg16/' + var.op.name
# Alternatively, assuming that 'conv1/weights' and 'conv1/bias' should be
# restored from 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):
    if "weights" in var.op.name:
        return var.op.name.replace("weights", "params1")
    if "bias" in var.op.name:
        return var.op.name.replace("bias", "params2")
variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var): var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)
with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
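The role of the name-mapping dictionary can be shown with ordinary dicts standing in for the checkpoint and the graph. This is a hypothetical sketch of the lookup only: the `checkpoint` contents and values are made up, and no TensorFlow Saver is involved.

```python
def name_in_checkpoint(graph_name):
    """Hypothetical rule: checkpoint names carry an extra 'vgg16/' prefix."""
    return "vgg16/" + graph_name

# Stand-ins for a checkpoint file and for the variables of the current graph.
checkpoint = {"vgg16/conv1/weights": [0.1, 0.2], "vgg16/conv1/biases": [0.0]}
graph_variables = {"conv1/weights": None, "conv1/biases": None}

# Same shape as the dict given to the Saver: checkpoint name -> graph variable.
restore_map = {name_in_checkpoint(name): name for name in graph_variables}

for ckpt_name, graph_name in restore_map.items():
    graph_variables[graph_name] = checkpoint[ckpt_name]

print(graph_variables["conv1/weights"])  # [0.1, 0.2]
```

The keys of the mapping decide where a value is read from and the values decide where it ends up, which is why the two sets of names no longer need to agree.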
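Fine-Tuning a Model on a Different Task
Consider a pre-trained VGG16 model. It was originally trained on ImageNet, which has 1000 classes, but we would like to apply it to the Pascal VOC dataset, which has only 20 classes. To do so, we restore only the convolutional layers from the pre-trained checkpoint and train the remaining layers from scratch: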
# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)
# Create the model (vgg_16 returns the logits and a dict of end points)
predictions, _ = vgg.vgg_16(images)
train_op = slim.learning.create_train_op(...)
# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'
# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'
# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)
# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)