Learning tf.contrib.slim: Evaluating Models

Fang Suk

Article link: https://blog.csdn.net/MrR1ght/article/details/81110616

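After a model has been trained, or while it is still training, we want to evaluate how it performs in practice. Model evaluation consists of two parts:

Defining the evaluation metrics that measure model performance (e.g. accuracy, Recall@5)
Writing the evaluation code that reads the data, runs inference, scores the predictions against the ground truth (GT), and records the scores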

I. Metrics

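(1) Metrics: the criteria used to evaluate model performance, such as the F1 score or IoU (intersection over union).

TF-slim provides a set of metric ops that make model evaluation easy. Computing a metric value in TF-slim can be divided into three steps:

Initialization: initialize the variables used to compute the metric
Aggregation: perform the operations (sums, etc.) used to compute the metric
Finalization (optional): perform any final operation needed to compute the metric value, for example computing a mean, min, or max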

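For example, to compute mean_absolute_error, TF-slim goes through these steps:

Initialization: set the variables count = 0 and total = 0
Aggregation: for each set of predictions and labels (e.g. one batch), compute the absolute errors, add their sum to total, and increase count by the number of values observed
Finalization: divide total by count to obtain mean_absolute_error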
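
The three steps above can be sketched in plain Python. This is only an illustration of the bookkeeping, not TF-slim's implementation; the class name `StreamingMeanAbsoluteError` and its methods are hypothetical.

```python
class StreamingMeanAbsoluteError:
    """Pure-Python illustration of the initialize / aggregate / finalize steps."""

    def __init__(self):
        # Initialization: both accumulators start at zero.
        self.total = 0.0
        self.count = 0

    def update(self, predictions, labels):
        # Aggregation: add this batch's absolute errors to the running totals.
        for p, y in zip(predictions, labels):
            self.total += abs(p - y)
            self.count += 1

    def result(self):
        # Finalization: divide the accumulated error by the number of values seen.
        return self.total / self.count if self.count else 0.0


metric = StreamingMeanAbsoluteError()
metric.update([1.0, 2.0], [1.5, 2.0])   # batch 1: absolute errors 0.5 and 0.0
metric.update([3.0], [1.0])             # batch 2: absolute error 2.0
mae = metric.result()                   # (0.5 + 0.0 + 2.0) / 3
```

Because only total and count are kept between batches, the metric can be computed over a dataset that does not fit in memory at once.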

(2) An example of defining metrics

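images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
# streaming_mean_relative_error also takes a normalizer; using the labels here is an assumption.
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels, labels)
# mean_relative_errors is not defined in the original snippet; it is assumed to hold per-example relative errors.
mean_relative_errors = tf.abs(predictions - labels) / tf.abs(labels)
pl_value_op, pl_update_op = slim.metrics.streaming_percentage_less(mean_relative_errors, 0.3)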

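Creating a metric returns two values: a value_op and an update_op.

value_op: an idempotent operation that returns the current value of the metric
update_op: performs the aggregation step described above and then returns the value of the metric (e.g. it is run at every step of a loop to accumulate the metric)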
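
The difference between the two ops can be illustrated without TensorFlow. In the minimal sketch below, `make_streaming_mean`, `value_fn` and `update_fn` are hypothetical stand-ins for a metric constructor and its (value_op, update_op) pair: reading the value never changes the state, while each update aggregates new data and returns the refreshed value.

```python
def make_streaming_mean():
    """Return (value_fn, update_fn), loosely mirroring a (value_op, update_op) pair."""
    state = {"total": 0.0, "count": 0}

    def value_fn():
        # Idempotent: reads the accumulators without changing them.
        return state["total"] / state["count"] if state["count"] else 0.0

    def update_fn(values):
        # Aggregates a new batch, then returns the updated metric value.
        state["total"] += sum(values)
        state["count"] += len(values)
        return value_fn()

    return value_fn, update_fn


value_fn, update_fn = make_streaming_mean()
after_first = update_fn([1.0, 3.0])    # mean of [1, 3]
read_once = value_fn()
read_twice = value_fn()                # same as read_once: reading does not aggregate
after_second = update_fn([5.0])        # mean of [1, 3, 5]
```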

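(3) Two convenience functions for managing metrics

When several metrics are defined, keeping track of each metric's value_op and update_op becomes tedious. TF-slim provides two functions to manage them; they simply collect the value_ops and the update_ops of all metrics into two lists or into two dictionaries: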

# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})
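
To make the regrouping concrete, here is a plain-Python sketch of what the two helpers do with the (value_op, update_op) tuples. The functions `aggregate_metrics_sketch` and `aggregate_metric_map_sketch` are hypothetical re-implementations written for illustration, not the library source, and strings stand in for TensorFlow ops.

```python
def aggregate_metrics_sketch(*value_update_tuples):
    """Split (value, update) pairs into a list of values and a list of updates."""
    if not value_update_tuples:
        raise ValueError("Expected at least one (value, update) tuple.")
    value_ops, update_ops = zip(*value_update_tuples)
    return list(value_ops), list(update_ops)


def aggregate_metric_map_sketch(names_to_tuples):
    """Split {name: (value, update)} into {name: value} and {name: update}."""
    names_to_values = {name: pair[0] for name, pair in names_to_tuples.items()}
    names_to_updates = {name: pair[1] for name, pair in names_to_tuples.items()}
    return names_to_values, names_to_updates


# Strings stand in for the ops a real metric would return.
mae = ("mae_value", "mae_update")
mse = ("mse_value", "mse_update")

value_ops, update_ops = aggregate_metrics_sketch(mae, mse)
names_to_values, names_to_updates = aggregate_metric_map_sketch(
    {"eval/mean_absolute_error": mae, "eval/mean_squared_error": mse})
```

The dictionary form is usually the more convenient of the two, since the metric names can be reused when printing results or writing summaries.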

(4) An example with multiple metrics

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg


# Load the data
images, labels = load_data(...)

# Define the network
predictions, end_points = vgg.vgg_16(images)  # vgg_16 returns (logits, end_points)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(tf.local_variables_initializer())

  for batch_id in range(num_batches):
    sess.run(list(names_to_updates.values()))

  metric_values = sess.run(list(names_to_values.values()))
  for metric, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (metric, value))

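II. Evaluation Loop

To simplify the evaluation workflow, TF-slim provides an evaluation module (evaluation.py). It contains helper functions for writing evaluation scripts that use the metrics defined in metric_ops.py. One of these functions runs the evaluation periodically: it computes the metrics over batches of data, prints the metric values to standard output, and saves them as summaries: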

import math

import tensorflow as tf

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
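    # aggregate_metric_map expects (value_op, update_op) tuples, so the streaming variants are used.
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),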
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = int(math.ceil(num_examples / float(batch_size)))

# Setup the global step.
slim.get_or_create_global_step()

checkpoint_dir = ... # Where the model checkpoints are stored.
log_dir = ... # Where the summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)
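
As a rough mental model of a single evaluation pass, the sketch below computes the batch count used above and then runs an update step that many times before producing one summary. It leaves out checkpoint restoring and the periodic wait between passes, and every name in it (`num_batches_for`, `run_one_evaluation`, `fake_update`, `fake_summary`) is a hypothetical stand-in rather than part of TF-slim.

```python
import math


def num_batches_for(num_examples, batch_size):
    """Number of batches needed to cover every example at least once."""
    return int(math.ceil(num_examples / float(batch_size)))


def run_one_evaluation(update_fn, summary_fn, num_evals):
    """One evaluation pass: run the update step num_evals times, then summarize once."""
    for _ in range(num_evals):
        update_fn()
    return summary_fn()


# Hypothetical stand-ins for eval_op and summary_op.
state = {"batches_seen": 0}


def fake_update():
    state["batches_seen"] += 1


def fake_summary():
    return "evaluated %d batches" % state["batches_seen"]


num_batches = num_batches_for(10000, 32)   # 10000 / 32 = 312.5, rounded up
summary = run_one_evaluation(fake_update, fake_summary, num_batches)
```

Rounding up means the last batch may be only partially filled, which is the price of making sure no example is skipped.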