TensorFlow Debugger (tfdbg) - First

Yongqiang Cheng

Article link: https://blog.csdn.net/chengyq116/article/details/97289973

https://tensorflow.google.cn/guide/debugger

TensorFlow Guide - How TensorFlow works
https://tensorflow.google.cn/guide/

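tfdbg is a specialized debugger for TensorFlow. It lets you view the internal structure and state of a running TensorFlow graph during training and inference. Because TensorFlow executes computation graphs, this is hard to do with a general-purpose debugger such as Python's pdb.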
https://tensorflow.google.cn/api_docs/python/tfdbg

This guide covers the command-line interface (CLI) of tfdbg. For the graphical user interface (GUI) of tfdbg, the TensorBoard Debugger Plugin, see its README.
https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/debugger/README.md

1. Wrapping TensorFlow sessions with tfdbg

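To add tfdbg support to the example, we only need to add the following lines of code and wrap the Session object with a debugger wrapper. This code has already been added to debug_mnist.py, so you can activate the tfdbg CLI with the --debug flag on the command line.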
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/debug_mnist.py

# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
# (You don't need to worry about the BUILD dependency if you are using a pip
#  install of open-source TensorFlow.)
from tensorflow.python import debug as tf_debug

sess = tf_debug.LocalCLIDebugWrapperSession(sess)

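This wrapper has the same interface as Session, so enabling debugging requires no other changes to the code. The wrapper also provides additional features, including:

Bringing up a CLI before and after each Session.run() call, so that you can control the execution and inspect the graph's internal state.
Allowing you to register special filters for tensor values, to help diagnose issues.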
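
A tensor filter is simply a callable that tfdbg applies to each dumped tensor. The sketch below is a NumPy-only approximation of the kind of check performed by the built-in tf_debug.has_inf_or_nan filter. The function name contains_inf_or_nan is hypothetical and chosen for illustration; the (datum, tensor) signature is assumed to match the filter callables accepted by recent TensorFlow 1.x releases.

```python
import numpy as np


def contains_inf_or_nan(datum, tensor):
    """Return True if `tensor` holds any inf or nan value.

    `datum` is the debug metadata object tfdbg passes to filters; this
    sketch ignores it. `tensor` is expected to be a numpy.ndarray.
    """
    if not isinstance(tensor, np.ndarray):
        return False  # uninitialized or non-array dumps are skipped
    if not np.issubdtype(tensor.dtype, np.floating):
        return False  # integer tensors cannot hold inf or nan
    return bool(np.any(np.isnan(tensor)) or np.any(np.isinf(tensor)))


# With a wrapped session (TensorFlow 1.x), registration would look like:
#   sess.add_tensor_filter("has_inf_or_nan", tf_debug.has_inf_or_nan)
#   sess.add_tensor_filter("my_filter", contains_inf_or_nan)
```

Once a filter is registered, entering run -f has_inf_or_nan at the tfdbg prompt keeps running until some dumped tensor passes the filter.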

1.1 Source code

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

'''
Basic Operations example using TensorFlow library.
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''

import tensorflow as tf

# Basic Operations with variable as graph input
# The value returned by the constructor represents the output of the Variable op. (define as input when running session)

# tf Graph input
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)

# Define some operations
add = tf.add(a, b)
mul = tf.multiply(a, b)

# Launch the default graph.
with tf.Session() as sess:
    # Run every operation with variable input
    print("Addition with variables: %i" % sess.run(add, feed_dict={a: 2, b: 3}))
    print("Multiplication with variables: %i" % sess.run(mul, feed_dict={a: 2, b: 3}))

/usr/bin/python2.7 /home/strong/tensorflow_work/R2CNN_Faster-RCNN_Tensorflow/yongqiang.py
2019-07-25 11:39:07.997548: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-07-25 11:39:08.079679: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-25 11:39:08.079919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.43GiB
2019-07-25 11:39:08.079929: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Addition with variables: 5
Multiplication with variables: 6

Process finished with exit code 0

1.2 Adding tfdbg support

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

'''
Basic Operations example using TensorFlow library.
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''

import tensorflow as tf
# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
# (You don't need to worry about the BUILD dependency if you are using a pip install of open-source TensorFlow.)
from tensorflow.python import debug as tf_debug

# Basic Operations with variable as graph input
# The value returned by the constructor represents the output of the Variable op. (define as input when running session)

# tf Graph input
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)

# Define some operations
add = tf.add(a, b)
mul = tf.multiply(a, b)

# Launch the default graph.
with tf.Session() as sess:
    sess = tf_debug.LocalCLIDebugWrapperSession(sess)
    # Run every operation with variable input
    print("Addition with variables: %i" % sess.run(add, feed_dict={a: 2, b: 3}))
    print("Multiplication with variables: %i" % sess.run(mul, feed_dict={a: 2, b: 3}))
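
The script above wraps the session unconditionally. To mirror the --debug flag of debug_mnist.py, the wrapping can be made optional. The following is a minimal sketch, assuming TensorFlow 1.x; the helper names parse_args and maybe_wrap are hypothetical, and tf_debug is imported lazily so that nothing debugger-related is loaded when the flag is off.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --debug switches the tfdbg CLI on."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--debug", action="store_true",
                        help="Wrap the session with the tfdbg CLI.")
    return parser.parse_args(argv)


def maybe_wrap(sess, debug):
    """Return `sess` wrapped by tfdbg when `debug` is set, else unchanged."""
    if not debug:
        return sess
    # Imported only when needed (assumes a TensorFlow 1.x installation).
    from tensorflow.python import debug as tf_debug
    return tf_debug.LocalCLIDebugWrapperSession(sess)


# Intended use inside the script (not executed here):
#   args = parse_args()
#   with tf.Session() as sess:
#       sess = maybe_wrap(sess, args.debug)
#       print(sess.run(add, feed_dict={a: 2, b: 3}))
```

Keeping the decision in one helper means the rest of the script calls sess.run() the same way whether or not the debugger is active.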

2. Debugging model training with tfdbg

Let's run the script again. In the official debug_mnist.py example, tfdbg is activated with the --debug flag; the script in section 1.2 wraps the session unconditionally, so no flag is required:

python yongqiang.py

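The debug wrapper session prompts you when it is about to execute the first Session.run() call, and the screen shows information about the fetched tensors and the feed dictionary.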

This screen is what we refer to as the run-start CLI. It lists the feeds and fetches of the current Session.run() call before anything is executed.
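
Conceptually, the run-start prompt is possible because the wrapper exposes the same run() method as the session and intercepts each call. The toy classes below sketch that idea in plain Python. FakeSession and InterceptingSession are hypothetical stand-ins and do not reflect how tfdbg is actually implemented.

```python
class FakeSession(object):
    """Stand-in for a session: 'runs' a fetch by calling it on the feeds."""

    def run(self, fetches, feed_dict=None):
        return fetches(feed_dict or {})


class InterceptingSession(object):
    """Same run() interface as the wrapped session, plus a pre-run hook."""

    def __init__(self, sess):
        self._sess = sess
        self.run_log = []  # one entry per run() call, recorded before it starts

    def run(self, fetches, feed_dict=None):
        # "run-start": note the fetch and the feeds before anything executes.
        fetch_name = getattr(fetches, "__name__", repr(fetches))
        feed_keys = sorted((feed_dict or {}).keys())
        self.run_log.append((fetch_name, feed_keys))
        # Delegate to the real session; the result is passed through untouched.
        return self._sess.run(fetches, feed_dict=feed_dict)


def add(feeds):
    return feeds["a"] + feeds["b"]


sess = InterceptingSession(FakeSession())
result = sess.run(add, feed_dict={"a": 2, "b": 3})
print(result, sess.run_log)
```

Because the caller only ever uses run(), swapping the plain session for the wrapper requires no other change, which is the property the real wrapper relies on.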

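If the screen is too small to display the whole message, you can resize it.

Use the PageUp/PageDown/Home/End keys to navigate the screen output. On most keyboards that lack those keys, Fn + Up/Fn + Down/Fn + Right/Fn + Left will work.

Enter the run command (or just r) at the command prompt: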

tfdbg> run

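The run command causes tfdbg to execute until the end of the next Session.run() call. In the debug_mnist.py example, that call calculates the model's accuracy on the test data set; in the script from section 1.2, it evaluates the add operation. tfdbg augments the runtime graph to dump all intermediate tensors. After the run ends, tfdbg displays all the dumped tensor values in the run-end CLI.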
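
To make the idea of dumping intermediate tensors concrete, here is a toy evaluator in plain Python. It is not tfdbg code: the graph encoding (a dict mapping each node name to an operation and its input names) and the function run_and_dump are assumptions made for this illustration only.

```python
import operator

# Hypothetical graph: node name -> (operation, names of input nodes).
GRAPH = {
    "add": (operator.add, ("a", "b")),
    "mul": (operator.mul, ("a", "b")),
    "out": (operator.sub, ("mul", "add")),
}


def run_and_dump(graph, fetch, feed_dict):
    """Evaluate `fetch` and return (value, dumps of every node evaluated)."""
    dumps = dict(feed_dict)  # fed values are known from the start

    def evaluate(name):
        if name not in dumps:
            op, inputs = graph[name]
            dumps[name] = op(*[evaluate(i) for i in inputs])
        return dumps[name]

    return evaluate(fetch), dumps


value, dumps = run_and_dump(GRAPH, "out", {"a": 2, "b": 3})
for name in sorted(dumps):
    print(name, dumps[name])
```

Fetching only out still records add and mul along the way, which is the kind of per-node record a debugger needs in order to list intermediate values after a run.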

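This list of tensors can also be obtained by running the command lt after run has been executed.

exit: leave the debugger.
invoke_stepper: enter single-step debugging mode.

In addition to the commands listed above, the tfdbg CLI provides the following features: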

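To navigate through previous tfdbg commands, type in a few characters and then press the Up or Down arrow key. tfdbg shows you the history of commands that start with those characters.
To navigate through the history of screen outputs, do either of the following:
- Use the prev and next commands.
- Click the underlined <-- and --> links near the top-left corner of the screen.
Tab completion of commands and some command arguments.
To redirect the screen output to a file instead of the screen, end the command with bash-style redirection. For example, the following command redirects the output of the pt command to the /tmp/xent_value_slices.txt file: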

tfdbg> pt cross_entropy/Log:0[:, 0:10] > /tmp/xent_value_slices.txt
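
The slice in the pt command uses NumPy-style indexing. The sketch below reproduces the same selection and the redirection to a file in plain NumPy. The array xent is made-up data standing in for the dumped tensor, and the output goes to a temporary directory rather than the path used in the command above.

```python
import os
import tempfile

import numpy as np

# Made-up stand-in for a dumped 2-D tensor (4 rows, 12 columns).
xent = np.arange(48, dtype=np.float32).reshape(4, 12)

# Same selection as [:, 0:10]: every row, first ten columns.
sliced = xent[:, 0:10]

# Write the slice to a text file, analogous to "> file" at the tfdbg prompt.
out_path = os.path.join(tempfile.mkdtemp(), "xent_value_slices.txt")
np.savetxt(out_path, sliced, fmt="%.1f")

print(sliced.shape, out_path)
```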
