Speeding up code with tf.function

Article link: https://blog.csdn.net/chary8088/article/details/89924630

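In TensorFlow 2.0, eager execution is turned on by default. This gives you an intuitive and flexible interface in which running one-off operations is easier and faster, but it can come at the expense of performance and deployability.

To get peak performance and to make your model deployable anywhere, TensorFlow provides tf.function, a tool that builds graphs out of your programs.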

from __future__ import absolute_import, division, print_function, unicode_literals

# Notebook shell command; in a terminal, run it without the leading "!"
!pip install -q tensorflow==2.0.0-alpha0
import tensorflow as tf

# A function is like an op

@tf.function
def add(a, b):
    return a + b

add(tf.ones([2, 2]), tf.ones([2, 2]))  # [[2., 2.], [2., 2.]]

<tf.Tensor: id=16, shape=(2, 2), dtype=float32, numpy=
array([[2., 2.],
       [2., 2.]], dtype=float32)>
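To make the "graph" behind the add example above more concrete, the following is a minimal sketch that traces a function for one input signature and lists the operation types recorded in the resulting graph. It assumes TensorFlow 2.x; the name `add_fn` is introduced here only for illustration, and the exact op names (for example `Add` versus `AddV2`) may differ between TensorFlow versions.

```python
import tensorflow as tf

@tf.function
def add_fn(a, b):  # hypothetical name, mirrors the add example
    return a + b

# Describe the inputs symbolically: two float32 tensors of shape [2, 2].
spec = tf.TensorSpec(shape=[2, 2], dtype=tf.float32)

# Trace the Python function once for this signature.
concrete = add_fn.get_concrete_function(spec, spec)

# Collect the type of every op that tracing recorded in the graph.
op_types = [op.type for op in concrete.graph.get_operations()]
print(op_types)
```

The list typically contains placeholder ops for the two inputs plus an addition op, which is what TensorFlow runs on later calls with the same signature instead of re-executing the Python body.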

A tf.function you define behaves like a core TensorFlow operation: you can execute it eagerly, you can use it inside a graph, it has gradients, and so on.

# Functions have gradients

@tf.function
def add(a, b):
    return a + b

v = tf.Variable(1.0)
with tf.GradientTape() as tape:
    result = add(v, 1.0)
tape.gradient(result, v)

<tf.Tensor: id=44, shape=(), dtype=float32, numpy=1.0>
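The gradient of 1.0 above is easy to mistake for a placeholder value, so here is a small sketch with a function whose derivative depends on the input. It assumes TensorFlow 2.x; `square_plus_one` is a hypothetical function chosen for illustration and does not appear in the original tutorial.

```python
import tensorflow as tf

@tf.function
def square_plus_one(x):  # hypothetical example function
    return x * x + 1.0

v = tf.Variable(3.0)
with tf.GradientTape() as tape:
    # The tape records the call to the traced function like any other op.
    result = square_plus_one(v)

# d/dv (v*v + 1) = 2*v, so the gradient at v = 3.0 is 6.0
grad = tape.gradient(result, v)
print(grad.numpy())
```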

# You can use functions inside functions

@tf.function
def dense_layer(x, w, b):
    return add(tf.matmul(x, w), b)

dense_layer(tf.ones([3, 2]), tf.ones([2, 2]), tf.ones([2]))

<tf.Tensor: id=74, shape=(3, 2), dtype=float32, numpy=
array([[3., 3.],
       [3., 3.],
       [3., 3.]], dtype=float32)>

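Polymorphism

tf.function tries to be as generic as a Python function. You can call Python functions with all sorts of signatures, and Python will usually do something reasonable. tf.function handles this kind of polymorphism for you, even though the underlying TensorFlow graphs it generates are specific to the particular types in its signature.

You can call the function with arguments of different types to see what happens.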

# Functions are polymorphic

@tf.function
def add(a):
    return a + a

print("add 1", add(1))
print("add 1.1", add(1.1))
print("add string tensor", add(tf.constant("a")))
# Request the concrete function (one traced graph) for string tensors
c = add.get_concrete_function(tf.TensorSpec(shape=None, dtype=tf.string))
c(a=tf.constant("a"))  # aa

add 1 tf.Tensor(2, shape=(), dtype=int32)
add 1.1 tf.Tensor(2.2, shape=(), dtype=float32)
add string tensor tf.Tensor(b'aa', shape=(), dtype=string)
<tf.Tensor: id=104, shape=(), dtype=string, numpy=b'aa'>
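One way to observe this polymorphism is to count how often the Python body actually runs. The sketch below assumes TensorFlow 2.x and relies on the fact that the Python body of a tf.function executes only while a new graph is being traced; the names `double` and `trace_count` are made up for this illustration.

```python
import tensorflow as tf

trace_count = 0  # incremented only when the Python body runs, i.e. on tracing

@tf.function
def double(a):  # hypothetical name, same body as the add example
    global trace_count
    trace_count += 1
    return a + a

double(tf.constant(1))    # new signature (int32 scalar): traces a graph
double(tf.constant(2))    # same dtype and shape: reuses the int32 graph
double(tf.constant(1.5))  # new signature (float32 scalar): traces again
double(tf.constant("a"))  # new signature (string scalar): traces again

print(trace_count)  # 3: one trace per distinct input signature
```

Tensors are used as arguments here on purpose. Plain Python numbers are treated as constants, so calling with a different Python value can trigger another trace even when the type is the same.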

# Functions can be faster than eager code for graphs with many small ops

import timeit
conv_layer = tf.keras.layers.Conv2D(100, 3)

@tf.function
def conv_fn(image):
    return conv_layer(image)

image = tf.zeros([1, 200, 200, 100])
# Warm up
conv_layer(image); conv_fn(image)
print("Eager conv:", timeit.timeit(lambda: conv_layer(image), number=10))
print("Function conv:", timeit.timeit(lambda: conv_fn(image), number=10))
print("Note how there's not much difference in performance for convolutions")

lstm_cell = tf.keras.layers.LSTMCell(10)

@tf.function
def lstm_fn(input, state):
    return lstm_cell(input, state)

input = tf.zeros([10, 10])
state = [tf.zeros([10, 10])] * 2
# Warm up
lstm_cell(input, state); lstm_fn(input, state)
print("eager lstm:", timeit.timeit(lambda: lstm_cell(input, state), number=10))
print("function lstm:", timeit.timeit(lambda: lstm_fn(input, state), number=10))
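The warm-up-then-measure pattern used in both benchmarks can be folded into a small helper. This is only a sketch under the assumption of TensorFlow 2.x: `time_per_call` is a hypothetical helper, not part of TensorFlow, and the numbers it reports depend on hardware and version, so no timings are shown.

```python
import timeit
import tensorflow as tf

def time_per_call(fn, *args, repeats=10):
    """Average wall-clock seconds per call, measured after one warm-up call."""
    fn(*args)  # warm-up: builds layer weights and triggers tf.function tracing
    return timeit.timeit(lambda: fn(*args), number=repeats) / repeats

cell = tf.keras.layers.LSTMCell(10)

@tf.function
def cell_fn(inputs, state):
    return cell(inputs, state)

inputs = tf.zeros([10, 10])
state = [tf.zeros([10, 10])] * 2

eager_s = time_per_call(cell, inputs, state)
graph_s = time_per_call(cell_fn, inputs, state)
print("eager lstm (s/call):", eager_s)
print("function lstm (s/call):", graph_s)
```

Keeping the warm-up call outside the timed region matters because the first call to a tf.function includes the one-time cost of tracing, which would otherwise distort the comparison.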