Deep Learning Notes: The TensorFlow Deep Learning Framework (10) [Creating Estimators in tf.contrib.learn]

Creating Estimators in tf.contrib.learn

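The tf.contrib.learn framework makes it easy to construct and train machine learning models via its high-level Estimator API. Estimator offers classes you can instantiate to quickly configure common model types such as regressors and classifiers:

LinearClassifier: constructs a linear classification model
LinearRegressor: constructs a linear regression model
DNNClassifier: constructs a neural network classification model
DNNRegressor: constructs a neural network regression model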

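But what if none of tf.contrib.learn's predefined model types meets your needs? Perhaps you need finer-grained control over model configuration, such as the ability to customize the loss function used for optimization, or to specify a different activation function for each neural network layer. Or maybe you are implementing a ranking or recommendation system, and neither a classifier nor a regressor is suitable for generating predictions.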

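This tutorial covers how to create your own Estimator using the building blocks provided in tf.contrib.learn. The Estimator will predict the ages of abalones based on their physical measurements. You will learn how to do the following:

Instantiate an Estimator
Construct a custom model function
Configure a neural network using tf.contrib.layers
Choose an appropriate loss function from tf.contrib.losses
Define a training op for your model
Generate and return predictions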

Prerequisites

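This tutorial assumes you already know the basics of the tf.contrib.learn API, such as feature columns and fit() operations. If you have never used tf.contrib.learn before, or need a refresher, you should first review the following tutorials: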

tf.contrib.learn Quickstart: Quick introduction to training a neural network using tf.contrib.learn.
TensorFlow Linear Model Tutorial: Introduction to feature columns, and an overview on building a linear classifier in tf.contrib.learn.

An Abalone Age Predictor

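It is possible to estimate the age of an abalone (a sea snail) by the number of rings on its shell. However, because this task requires cutting, staining, and viewing the shell under a microscope, it is desirable to find other measurements that can predict age.

The abalone dataset contains the following feature data for abalone: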

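Brief description:
Length: length of the abalone
Diameter: diameter of the abalone
Height: height of the abalone
Whole Weight: weight of the whole abalone (grams)
Shucked Weight: weight of the abalone meat (grams)
Viscera Weight: gut weight of the abalone (grams), after bleeding
Shell Weight: weight of the dried abalone shell (grams)

The label to predict is the number of rings, as a proxy for abalone age.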

Data downloads:

abalone_train.csv: http://download.tensorflow.org/data/abalone_train.csv
abalone_test.csv: http://download.tensorflow.org/data/abalone_test.csv
abalone_predict.csv: http://download.tensorflow.org/data/abalone_predict.csv
Complete code: https://github.com/tensorflow/tensorflow/blob/r0.12/tensorflow/examples/tutorials/estimators/abalone.py

Loading Abalone CSV Data into TensorFlow Datasets

To feed the abalone dataset into the model, you need to download the CSVs and load them into TensorFlow Datasets. First, add some standard Python and TensorFlow imports:

import tempfile
import urllib
import numpy as np
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.INFO)

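Then define flags that let users optionally specify, via the command line, CSV files for the training, test, and prediction datasets (by default, the files are downloaded from tensorflow.org), and enable logging: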

flags = tf.app.flags
FLAGS = flags.FLAGS
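flags.DEFINE_string(
    "train_data",
    "",
    "Path to the training data.")
flags.DEFINE_string(
    "test_data",
    "",
    "Path to the test data.")
# maybe_download() reads FLAGS.predict_data, so that flag is defined as well.
flags.DEFINE_string(
    "predict_data",
    "",
    "Path to the prediction data.")
tf.logging.set_verbosity(tf.logging.INFO)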

Then define a function to load the CSVs (either from files specified in the command-line options or downloaded from tensorflow.org):

def maybe_download():
  """Maybe downloads training data and returns train and test file names."""
  if FLAGS.train_data:
    train_file_name = FLAGS.train_data
  else:
    train_file = tempfile.NamedTemporaryFile(delete=False)
    urllib.urlretrieve("http://download.tensorflow.org/data/abalone_train.csv", train_file.name)
    train_file_name = train_file.name
    train_file.close()
    print("Training data is downloaded to %s" % train_file_name)

  if FLAGS.test_data:
    test_file_name = FLAGS.test_data
  else:
    test_file = tempfile.NamedTemporaryFile(delete=False)
    urllib.urlretrieve("http://download.tensorflow.org/data/abalone_test.csv", test_file.name)
    test_file_name = test_file.name
    test_file.close()
    print("Test data is downloaded to %s" % test_file_name)

  if FLAGS.predict_data:
    predict_file_name = FLAGS.predict_data
  else:
    predict_file = tempfile.NamedTemporaryFile(delete=False)
    urllib.urlretrieve("http://download.tensorflow.org/data/abalone_predict.csv", predict_file.name)
    predict_file_name = predict_file.name
    predict_file.close()
    print("Prediction data is downloaded to %s" % predict_file_name)

  return train_file_name, test_file_name, predict_file_name
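
The reuse-or-download pattern that maybe_download() repeats three times can be factored into one helper. The sketch below is only an illustration, not part of the tutorial code: resolve_data_file and its fetch parameter are hypothetical names, the URL is a dummy, and it assumes Python 3, where urlretrieve lives in urllib.request rather than directly under urllib as in the Python 2 code above.

```python
import os
import tempfile
import urllib.request


def resolve_data_file(user_path, url, fetch=urllib.request.urlretrieve):
    """Return user_path if it is non-empty; otherwise fetch url into a temp file.

    resolve_data_file and fetch are hypothetical names for illustration.
    fetch must accept (url, local_filename), like urllib.request.urlretrieve.
    """
    if user_path:
        return user_path
    # delete=False keeps the file on disk after it is closed.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.close()
    fetch(url, tmp.name)
    return tmp.name


def fake_fetch(url, filename):
    """Stand-in downloader so the demo needs no network access."""
    with open(filename, "w") as f:
        f.write("downloaded from " + url)


# A user-supplied path is returned unchanged; an empty one triggers a fetch.
kept = resolve_data_file("my_train.csv", "http://example.invalid/train.csv", fake_fetch)
fetched = resolve_data_file("", "http://example.invalid/train.csv", fake_fetch)
print(kept)
print(os.path.exists(fetched))
```

Passing the downloader as an argument keeps the helper testable offline; in real use the default urlretrieve would be left in place.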

Finally, create main() and load the abalone CSVs into Datasets:

def main(unused_argv):
  # Load datasets
  abalone_train, abalone_test, abalone_predict = maybe_download()

  # Training examples
  training_set = tf.contrib.learn.datasets.base.load_csv_without_header(
      filename=abalone_train,
      target_dtype=np.int,
      features_dtype=np.float64)

  # Test examples
  test_set = tf.contrib.learn.datasets.base.load_csv_without_header(
      filename=abalone_test,
      target_dtype=np.int,
      features_dtype=np.float64)

  # Set of 7 examples for which to predict abalone ages
  prediction_set = tf.contrib.learn.datasets.base.load_csv_without_header(
      filename=abalone_predict,
      target_dtype=np.int,
      features_dtype=np.float64)

if __name__ == "__main__":
  tf.app.run()
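
The datasets loaded above expose the feature rows as data and the labels as target, which is how they are used later in fit() and evaluate(). The plain-Python sketch below mimics that split, assuming the label is the last column of each row. load_rows is a hypothetical helper, and the two CSV rows are made-up values, not taken from the abalone files.

```python
import collections
import csv
import io

Dataset = collections.namedtuple("Dataset", ["data", "target"])


def load_rows(csv_text):
    """Split header-less CSV text into float features and an int label per row."""
    data, target = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        data.append([float(v) for v in row[:-1]])  # all columns but the last
        target.append(int(row[-1]))                # last column: ring count
    return Dataset(data=data, target=target)


# Made-up rows: 7 measurements followed by a ring count.
sample = "0.4,0.3,0.1,0.5,0.2,0.1,0.15,9\n0.5,0.4,0.12,0.8,0.3,0.2,0.25,11\n"
ds = load_rows(sample)
print(len(ds.data), len(ds.data[0]), ds.target)
```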

Instantiating an Estimator

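When defining a model using one of the classes provided by tf.contrib.learn, such as DNNClassifier, you supply all the configuration parameters right in the constructor, for example: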

my_nn = tf.contrib.learn.DNNClassifier(feature_columns = [age,height,weight],
                                       hidden_units=[10,10,10],
                                       activation_fn=tf.nn.relu,
                                       dropout=0.2,
                                       n_classes = 3,
                                       optimizer = "Adam")

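You do not need to write any further code to instruct TensorFlow how to train the model, calculate loss, or return predictions; that logic is already baked into DNNClassifier.

By contrast, when you create your own Estimator from scratch, the constructor accepts just two high-level parameters for model configuration, model_fn and params: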

nn = tf.contrib.learn.Estimator(
    model_fn = model_fn,
    params = model_params)

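model_fn: a function object that contains all the aforementioned logic to support training, evaluation, and prediction. You are responsible for implementing that functionality. The next section, Constructing the model_fn, covers creating a model function in detail.

params: an optional dict of hyperparameters (e.g., learning rate, dropout) that will be passed into model_fn.

Note: just like tf.contrib.learn's predefined regressors and classifiers, the Estimator initializer also accepts the general configuration arguments model_dir and config.

For the abalone age predictor, the model accepts one hyperparameter: the learning rate. Define LEARNING_RATE as a constant at the beginning of your code, right after the logging configuration: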

tf.logging.set_verbosity(tf.logging.INFO)
# Learning rate for the model
LEARNING_RATE = 0.001

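Note: here, LEARNING_RATE is set to 0.001, but you can tune this value as needed to achieve the best results during model training.

Then add the following code to main(), which creates the dict model_params containing the learning rate and instantiates the Estimator: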

#Set model params
model_params = {"learning_rate":LEARNING_RATE}
nn = tf.contrib.learn.Estimator(
    model_fn = model_fn,params = model_params
)
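
To make the role of params concrete, here is a toy stand-in for an Estimator written in plain Python. ToyEstimator is a hypothetical class, not the tf.contrib.learn implementation; it only shows the idea that the constructor stores model_fn and params, and that a later call forwards the same params dict to model_fn.

```python
class ToyEstimator(object):
    """Hypothetical stand-in that stores model_fn and params."""

    def __init__(self, model_fn, params=None):
        self._model_fn = model_fn
        self._params = params or {}

    def fit(self, x, y):
        # Forward the data, the mode, and the stored hyperparameters.
        return self._model_fn(x, y, "train", self._params)


def toy_model_fn(features, targets, mode, params):
    """Report which mode and hyperparameters the model function received."""
    return {"mode": mode, "learning_rate": params["learning_rate"]}


LEARNING_RATE = 0.001
toy = ToyEstimator(model_fn=toy_model_fn, params={"learning_rate": LEARNING_RATE})
result = toy.fit(x=[[0.5]], y=[9])
print(result)
```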

Constructing the model_fn

The basic skeleton for an Estimator API model function looks like this:

def model_fn(features,targets,mode,params):
    #Logic to do the following:
    #1. Configure the model via TensorFlow operations
    #2. Define the loss function for training/evaluation
    #3. Define the training operation/optimizer
    #4. Generate predictions
    return predictions,loss,train_op

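model_fn must accept three arguments:

features: a dict containing the features passed to the model via fit(), evaluate(), or predict().
targets: a Tensor containing the labels passed to the model via fit(), evaluate(), or predict(). It is empty for predict() calls, as these are the values the model will infer.
mode: one of the following ModeKeys string values, indicating the context in which model_fn was invoked:
- tf.contrib.learn.ModeKeys.TRAIN: model_fn was invoked in training mode, e.g., via a fit() call.
- tf.contrib.learn.ModeKeys.EVAL: model_fn was invoked in evaluation mode, e.g., via an evaluate() call.
- tf.contrib.learn.ModeKeys.INFER: model_fn was invoked in inference mode, e.g., via a predict() call.

model_fn may also accept a params argument containing a dict of hyperparameters used for training (as shown in the skeleton above).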

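The body of the function performs the following tasks (described in detail in the sections that follow):

Configuring the model: here, for the abalone predictor, this will be a neural network.
Defining the loss function used to calculate how closely the model's predictions match the target values.
Defining the training operation that specifies the optimization algorithm used to minimize the loss values calculated by the loss function.

Finally, depending on the mode in which model_fn is run, it must return one or more of the following three values:

predictions (required in INFER and EVAL modes): a dict that maps key names of your choice to Tensors containing the predictions from the model, e.g.: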

predictions = {"results":tensor_of_predictions}

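    - In INFER mode, the dict that you return from model_fn is then returned by predict(), so you can construct it in the format in which you would like to consume it.
    - In EVAL mode, the dict is used by metric functions to compute metrics. Any MetricSpec objects passed to the metrics argument of evaluate() must have a prediction_key that matches the key name of the corresponding predictions in predictions.

loss (required in EVAL and TRAIN modes): a Tensor containing a scalar loss value, i.e., the output of the model's loss function (discussed in more depth later in Defining loss for the model) calculated over all the input examples. It is used in TRAIN mode for error handling and logging, and is automatically included as a metric in EVAL mode.
train_op (required only in TRAIN mode): an Op that runs one step of training.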
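
The per-mode requirements above can be summarized in a small plain-Python model function. This is a sketch under simplifying assumptions: the "model" is a hypothetical constant predictor, the modes are the plain strings "train", "eval", and "infer", and train_op is represented by a placeholder string rather than a TensorFlow op.

```python
def sketch_model_fn(features, targets, mode, params):
    """Return (predictions, loss, train_op), leaving unneeded values as None."""
    # Hypothetical model: predict the same constant for every example.
    outputs = [params["constant"] for _ in features]
    predictions = {"results": outputs}

    loss = None
    train_op = None
    if mode in ("train", "eval"):
        # Loss needs targets, which are unavailable at inference time.
        squared = [(p - t) ** 2 for p, t in zip(outputs, targets)]
        loss = sum(squared) / len(squared)
    if mode == "train":
        train_op = "one-training-step"  # placeholder for a real training op
    return predictions, loss, train_op


features = [[0.1], [0.2]]
train_result = sketch_model_fn(features, [9, 11], "train", {"constant": 10})
eval_result = sketch_model_fn(features, [9, 11], "eval", {"constant": 10})
infer_result = sketch_model_fn(features, None, "infer", {"constant": 10})
print(train_result)
print(eval_result)
print(infer_result)
```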

Configuring a neural network with tf.contrib.layers

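Constructing a neural network entails creating and connecting the input layer, the hidden layers, and the output layer.

The input layer is a series of nodes (one for each feature in the model) that accepts the feature data passed to model_fn in the features argument. If features contains an n-dimensional Tensor with all your feature data (which is the case when the x and y datasets are passed directly to fit(), evaluate(), and predict()), then it can serve as the input layer. If features contains a dict of feature columns passed to the model via an input function, you can convert it to an input-layer Tensor with the input_from_feature_columns() function in tf.contrib.layers: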

input_layer = tf.contrib.layers.input_from_feature_columns(columns_to_tensors=features,feature_columns = [age,height,weight])

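As shown above, input_from_feature_columns() takes two required arguments:

columns_to_tensors: a mapping of the model's FeatureColumns to the Tensors containing the corresponding feature data. This is exactly what is passed to model_fn in the features argument.
feature_columns: a list of all the FeatureColumns in the model (age, height, and weight in the example above).

The input layer of the neural network must then be connected to one or more hidden layers via an activation function that performs a nonlinear transformation on the data from the previous layer. The last hidden layer is then connected to the output layer, the final layer in the model. tf.contrib.layers provides the following convenience functions for constructing fully connected layers:

relu(inputs, num_outputs): creates a layer of num_outputs nodes fully connected to the previous layer inputs, with a ReLU activation function (tf.nn.relu):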

hidden_layer = tf.contrib.layers.relu(inputs = input_layer,num_outputs = 10)

relu6(inputs, num_outputs): creates a layer of num_outputs nodes fully connected to the previous layer hidden_layer, with a ReLU 6 activation function (tf.nn.relu6):

second_hidden_layer = tf.contrib.layers.relu6(inputs = hidden_layer,num_outputs=20)

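linear(inputs, num_outputs): creates a layer of num_outputs nodes fully connected to the previous layer second_hidden_layer, with no activation function, just a linear transformation:

output_layer = tf.contrib.layers.linear(inputs=second_hidden_layer, num_outputs=3)

All of these functions are partials of the more general fully_connected() function, which can be used to add fully connected layers with other activation functions, e.g.: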

output_layer = tf.contrib.layers.fully_connected(inputs = second_hidden_layer,
                                                num_outputs=10,
                                                activation_fn = tf.sigmoid)

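The code above creates the neural network layer output_layer, which is fully connected to second_hidden_layer with a sigmoid activation function (tf.sigmoid). For a list of the predefined activation functions available in TensorFlow, see the API docs.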
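
The relationship between relu, relu6, linear, and fully_connected can be imitated in plain Python with functools.partial. The sketch below is not the tf.contrib.layers implementation: dense is a hypothetical helper that takes explicit weights and biases instead of creating variables, and the weight values are made up.

```python
import functools


def relu(x):
    return max(0.0, x)


def relu6(x):
    return min(max(0.0, x), 6.0)


def dense(inputs, weights, biases, activation_fn=None):
    """Fully connected layer: one output per column of weights.

    inputs: list of n floats; weights: n rows of m floats; biases: m floats.
    """
    outputs = []
    for j, bias in enumerate(biases):
        total = bias + sum(x * row[j] for x, row in zip(inputs, weights))
        outputs.append(activation_fn(total) if activation_fn else total)
    return outputs


# Each convenience layer is the general layer with the activation fixed.
dense_relu = functools.partial(dense, activation_fn=relu)
dense_relu6 = functools.partial(dense, activation_fn=relu6)
dense_linear = functools.partial(dense, activation_fn=None)

x = [1.0, 2.0]
w = [[1.0, -1.0], [4.0, -1.0]]  # made-up weights: 2 inputs, 2 outputs
b = [0.0, 0.0]
print(dense_linear(x, w, b))
print(dense_relu(x, w, b))
print(dense_relu6(x, w, b))
```

With these inputs the pre-activation values are 9 and -3, so the three layers differ only in how they clip them: not at all, at zero, and to the range from 0 to 6.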

Putting it all together, the following code constructs a full neural network for the abalone predictor and captures its predictions:

def model_fn(features, targets, mode, params):
  """Model function for Estimator."""
  # Connect the first hidden layer to input layer with relu activation
  first_hidden_layer = tf.contrib.layers.relu(features, 10)
  # Connect the second hidden layer to first hidden layer with relu
  second_hidden_layer = tf.contrib.layers.relu(first_hidden_layer, 10)
  # Connect the output layer to second hidden layer (no activation fn)
  output_layer = tf.contrib.layers.linear(second_hidden_layer, 1)
  # Reshape output layer to 1-dim Tensor to return predictions
  predictions = tf.reshape(output_layer, [-1])
  predictions_dict = {"ages": predictions}

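Here, because you will be passing the abalone Datasets directly to fit(), evaluate(), and predict() via the x and y arguments, the input layer is the features Tensor passed to model_fn. The network contains two hidden layers, each with 10 nodes and a ReLU activation function. The output layer contains no activation function, and is reshaped to a one-dimensional Tensor to capture the model's predictions, which are stored in predictions_dict.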
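
As a quick check on the size of this network, the number of trainable values can be counted by hand. The sketch assumes 7 input features (the measurements listed earlier) and one bias per output node in every layer; both are assumptions about the setup rather than figures stated in the tutorial, and dense_param_count is a hypothetical helper name.

```python
def dense_param_count(num_inputs, num_outputs):
    """Weights plus one bias per output node of a fully connected layer."""
    return num_inputs * num_outputs + num_outputs


layer_sizes = [7, 10, 10, 1]  # input, two hidden layers, output
counts = [dense_param_count(n_in, n_out)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
print(counts, sum(counts))
```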

Defining loss for the model

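model_fn must return a Tensor that contains the loss value, which quantifies how well the model's predictions reflect the target values during training and evaluation runs. The tf.contrib.losses module provides convenience functions for calculating loss using a variety of metrics, including:

absolute_difference(predictions, targets): calculates loss using the absolute-difference formula (also known as L1 loss).
log_loss(predictions, targets): calculates loss using the logistic loss formula (typically used in logistic regression).
mean_squared_error(predictions, targets): calculates loss using the mean squared error (MSE; also known as L2 loss).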
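
The three formulas are easy to verify on small inputs. The plain-Python sketch below averages over all examples, which is an assumption about the reduction, and adds a small epsilon inside the logarithms for numerical safety. It illustrates the formulas and is not the tf.contrib.losses code; the input values are made up.

```python
import math


def absolute_difference(predictions, targets):
    """Mean absolute error (L1 loss)."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


def log_loss(predictions, targets, epsilon=1e-7):
    """Mean logistic loss for predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(predictions, targets):
        total += -t * math.log(p + epsilon) - (1 - t) * math.log(1 - p + epsilon)
    return total / len(targets)


def mean_squared_error(predictions, targets):
    """Mean squared error (L2 loss)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


l1 = absolute_difference([1.0, 2.0, 3.0], [1.0, 3.0, 5.0])
l2 = mean_squared_error([1.0, 2.0, 3.0], [1.0, 3.0, 5.0])
ll = log_loss([0.5, 0.5], [1, 0])
print(l1, l2, ll)
```

Squaring penalizes the error of 2 four times as much as the error of 1, which is why the L2 value here is larger than the L1 value on the same inputs.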

The following example adds a definition for loss to the abalone model_fn using mean_squared_error():

def model_fn(features, targets, mode, params):
  """Model function for Estimator."""
  # Connect the first hidden layer to input layer with relu activation
  first_hidden_layer = tf.contrib.layers.relu(features, 10)
  # Connect the second hidden layer to first hidden layer with relu
  second_hidden_layer = tf.contrib.layers.relu(first_hidden_layer, 10)
  # Connect the output layer to second hidden layer (no activation fn)
  output_layer = tf.contrib.layers.linear(second_hidden_layer, 1)
  # Reshape output layer to 1-dim Tensor to return predictions
  predictions = tf.reshape(output_layer, [-1])
  predictions_dict = {"ages": predictions}
  # Calculate loss using mean squared error
  loss = tf.contrib.losses.mean_squared_error(predictions, targets)

For a full list of loss functions, and more details on supported arguments and usage, see the API docs for tf.contrib.losses.

Defining the training op for the model

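The training op defines the optimization algorithm TensorFlow will use when fitting the model to the training data. Typically when training, the goal is to minimize loss. The tf.contrib.layers API provides the function optimize_loss, which returns a training op that will do just that. optimize_loss has four required arguments:

loss: the loss value calculated by model_fn (see Defining loss for the model).
global_step: an integer Variable representing the step counter to increment for each model training run. It can easily be created/incremented in TensorFlow via the get_global_step() function.
learning_rate: the learning rate (also known as step size) hyperparameter that the optimization algorithm uses when training.
optimizer: the optimization algorithm to use during training. optimizer can accept any of the following string values, each representing an optimization algorithm predefined in tf.contrib.layers.optimizers:
- SGD: implementation of stochastic gradient descent (tf.train.GradientDescentOptimizer)
- Adagrad: implementation of the AdaGrad optimization algorithm (tf.train.AdagradOptimizer)
- Adam: implementation of the Adam optimization algorithm (tf.train.AdamOptimizer)
- Ftrl: implementation of the FTRL-Proximal algorithm (tf.train.FtrlOptimizer)
- Momentum: implementation of stochastic gradient descent with momentum (tf.train.MomentumOptimizer)
- RMSProp: implementation of the RMSprop algorithm (tf.train.RMSPropOptimizer)
  Note: the optimize_loss function supports additional optional arguments to further configure the optimizer, such as implementing decay. See the API docs for more information.

The following code defines a training op for the abalone model_fn using the loss value calculated in Defining loss for the model, the learning rate passed to the function in params, and the SGD optimizer. For global_step, the convenience function get_global_step() in tf.contrib.framework takes care of generating an integer variable: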

train_op = tf.contrib.layers.optimize_loss(
                    loss = loss,
                    global_step = tf.contrib.framework.get_global_step(),
                    learning_rate=params["learning_rate"],
                    optimizer="SGD")
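
What an SGD training op does on each step can be sketched without TensorFlow: move every parameter against its gradient, scaled by the learning rate, and increment the step counter. The example minimizes the hypothetical one-parameter loss (w - 3)^2; sgd_step is an illustrative name, and the learning rate of 0.1 is chosen only so that the toy problem converges quickly.

```python
def sgd_step(w, grad, learning_rate, global_step):
    """Apply one gradient-descent update and advance the step counter."""
    return w - learning_rate * grad, global_step + 1


w, global_step = 0.0, 0
for _ in range(100):
    grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2 with respect to w
    w, global_step = sgd_step(w, grad, learning_rate=0.1, global_step=global_step)

print(global_step, round(w, 4))
```

Each update shrinks the distance to the minimum by a constant factor, so after 100 steps w is very close to 3 and the counter reads 100.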

The complete abalone model_fn

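Here is the final, complete model_fn for the abalone age predictor. The following code configures the neural network; defines loss and the training op; and returns predictions_dict, loss, and train_op: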

def model_fn(features, targets, mode, params):
  """Model function for Estimator."""
  # Connect the first hidden layer to input layer with relu activation
  first_hidden_layer = tf.contrib.layers.relu(features, 10)
  # Connect the second hidden layer to first hidden layer with relu
  second_hidden_layer = tf.contrib.layers.relu(first_hidden_layer, 10)
  # Connect the output layer to second hidden layer (no activation fn)
  output_layer = tf.contrib.layers.linear(second_hidden_layer, 1)
  # Reshape output layer to 1-dim Tensor to return predictions
  predictions = tf.reshape(output_layer, [-1])
  predictions_dict = {"ages": predictions}
  # Calculate loss using mean squared error
  loss = tf.contrib.losses.mean_squared_error(predictions, targets)
  # Define the training op: SGD with the learning rate passed in params
  train_op = tf.contrib.layers.optimize_loss(
      loss=loss,
      global_step=tf.contrib.framework.get_global_step(),
      learning_rate=params["learning_rate"],
      optimizer="SGD")
  return predictions_dict, loss, train_op

Running the Abalone Model

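You have instantiated an Estimator for the abalone predictor and defined its behavior in model_fn; all that is left to do is train, evaluate, and make predictions.

Add the following code to the end of main() to fit the neural network to the training data and evaluate accuracy: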

#Fit
nn.fit(x=training_set.data,y=training_set.target,steps = 5000)
#Score accuracy
ev = nn.evaluate(x = test_set.data,y = test_set.target,steps = 1)
loss_score = ev["loss"]
print("Loss: %s" % loss_score)

You should then see output like the following:

...
INFO:tensorflow:loss = 4.86658, step = 4701
INFO:tensorflow:loss = 4.86191, step = 4801
INFO:tensorflow:loss = 4.85788, step = 4901
...
INFO:tensorflow:Saving evaluation summary for 5000 step: loss = 5.581
Loss: 5.581

The loss score reported is the mean squared error returned from model_fn when run on the ABALONE_TEST dataset.
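
Because the reported loss is a mean squared error, its square root gives a typical error in the same unit as the label (rings). A short calculation with the loss value from the sample output above:

```python
import math

mse = 5.581  # evaluation loss from the sample output
rmse = math.sqrt(mse)
print("RMSE: %.2f rings" % rmse)
```

Under this reading, the sample model's ring predictions are off by a bit more than two rings in the root-mean-square sense.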

To predict ages for the ABALONE_PREDICT dataset, add the following to main():

#Print out predictions
predictions = nn.predict(x=prediction_set.data,as_iterable = True)
for i,p in enumerate(predictions):
    print("Prediction %s: %s"%(i+1,p["ages"]))

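Here, the predict() function returns results in predictions as an iterable. The for loop enumerates and prints out the results. Rerun the code, and you should see output similar to the following: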

Prediction 1: 4.92229
Prediction 2: 10.3225
Prediction 3: 7.384
Prediction 4: 10.6264
Prediction 5: 11.0862
Prediction 6: 9.39239
Prediction 7: 11.1289