In the article tensorflow(5): converting ckpt to pb and deploying/serving models with tensorflow/serving, I used a simple example to show how to convert a TensorFlow checkpoint into a pb file and serve it with tensorflow/serving. This article covers single-model deployment, multi-model deployment, model version control, and model prediction with tensorflow/serving.
We will run tensorflow/serving with Docker, so Docker must be installed in your environment. We use tensorflow/serving:1.14.0 as the example, so first pull the image:
docker pull tensorflow/serving:1.14.0
The project structure used in this article is as follows:
Single-Model Deployment and Prediction
Creating the Model
We first create a model with TensorFlow: z = x * y + t, where x and y both equal 2.0, t is the input variable, and z is the output. The full model-creation script (single_model.py) is as follows:
# -*- coding: utf-8 -*-
import tensorflow as tf
g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 1.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/add/add.ckpt')
The output is:
result: 5.0
Generating the pb File
The full script (single_ckpt_2_pb.py) that converts the checkpoint above into a pb file is:
# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python import saved_model
export_path = "pb_models/add/1"
graph = tf.Graph()
saver = tf.train.import_meta_graph("./ckpt_models/add/add.ckpt.meta", graph=graph)
with tf.Session(graph=graph) as sess:
    saver.restore(sess, tf.train.latest_checkpoint("./ckpt_models/add"))
    saved_model.simple_save(session=sess,
                            export_dir=export_path,
                            inputs={"t": graph.get_operation_by_name('t').outputs[0]},
                            outputs={"z": graph.get_operation_by_name('z').outputs[0]})
Running this script produces the following files under the pb_models directory:
$ tree pb_models/
pb_models/
├── add
│ └── 1
│ ├── saved_model.pb
│ └── variables
│ ├── variables.data-00000-of-00001
│ └── variables.index
Model Deployment
Next we deploy the model with Docker:
docker run -t --rm -p 8551:8501 -v "absolute_path_to_pb_models/pb_models/add:/models/add" -e MODEL_NAME=add tensorflow/serving:1.14.0
where absolute_path_to_pb_models is the full path to the pb_models directory. The output looks like this:
2021-01-06 04:14:32.790794: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config: model_name: add model_base_path: /models/add
2021-01-06 04:14:32.810015: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
2021-01-06 04:14:32.810054: I tensorflow_serving/model_servers/server_core.cc:561] (Re-)adding model: add
2021-01-06 04:14:32.910293: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: add version: 1}
2021-01-06 04:14:32.910317: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: add version: 1}
2021-01-06 04:14:32.910327: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: add version: 1}
2021-01-06 04:14:32.910373: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/add/1
2021-01-06 04:14:32.910385: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/add/1
2021-01-06 04:14:32.910573: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-01-06 04:14:32.916524: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-01-06 04:14:32.956639: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2021-01-06 04:14:33.075419: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 165028 microseconds.
2021-01-06 04:14:33.075456: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:103] No warmup data file found at /models/add/1/assets.extra/tf_serving_warmup_requests
2021-01-06 04:14:33.075528: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: add version: 1}
2021-01-06 04:14:33.084877: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2021-01-06 04:14:33.117161: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 239] RAW: Entering the event loop ...
Output like the above means the model has been deployed successfully. Inside the container, port 8500 serves gRPC and port 8501 serves HTTP; we mapped host port 8551 to 8501, and only the HTTP port is used in this article.
Model Prediction
The curl command to check the model's deployment status:
curl http://192.168.1.193:8551/v1/models/add
The output is:
{
  "model_version_status": [
    {
      "version": "1",
      "state": "AVAILABLE",
      "status": {
        "error_code": "OK",
        "error_message": ""
      }
    }
  ]
}
The curl command to view the deployed model's metadata:
curl http://192.168.1.193:8551/v1/models/add/metadata
The output is:
{
  "model_spec": {
    "name": "add",
    "signature_name": "",
    "version": "1"
  },
  "metadata": {
    "signature_def": {
      "signature_def": {
        "serving_default": {
          "inputs": {
            "t": {
              "dtype": "DT_FLOAT",
              "tensor_shape": {
                "dim": [],
                "unknown_rank": true
              },
              "name": "t:0"
            }
          },
          "outputs": {
            "z": {
              "dtype": "DT_FLOAT",
              "tensor_shape": {
                "dim": [],
                "unknown_rank": true
              },
              "name": "z:0"
            }
          },
          "method_name": "tensorflow/serving/predict"
        }
      }
    }
  }
}
From this metadata we can read off the model's inputs and outputs.
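To work with this metadata programmatically, here is a small sketch (the helper name extract_signature is mine, and the JSON below is abbreviated from the response shown above) that pulls the input and output tensor names out of the serving_default signature:

```python
import json

# metadata JSON, abbreviated from the /metadata response above
metadata_json = '''
{
  "metadata": {
    "signature_def": {
      "signature_def": {
        "serving_default": {
          "inputs": {"t": {"dtype": "DT_FLOAT", "name": "t:0"}},
          "outputs": {"z": {"dtype": "DT_FLOAT", "name": "z:0"}},
          "method_name": "tensorflow/serving/predict"
        }
      }
    }
  }
}
'''

def extract_signature(metadata):
    # note: the "signature_def" key is nested twice in the REST metadata response
    sig = metadata["metadata"]["signature_def"]["signature_def"]["serving_default"]
    return list(sig["inputs"].keys()), list(sig["outputs"].keys())

inputs, outputs = extract_signature(json.loads(metadata_json))
print(inputs, outputs)  # ['t'] ['z']
```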
Use curl to run a prediction:
curl --location --request POST 'http://192.168.1.193:8551/v1/models/add:predict' \
--header 'Content-Type: application/json' \
--data-raw '{
"instances": [{"t": 2.0}]
}'
The output is:
{
"predictions": [
6.0
]
}
A Python version of the prediction (single_tf_serving.py):
# -*- coding: utf-8 -*-
import requests
# predict via tensorflow/serving's HTTP REST API
t = 2.0
tensor = {"instances": [{"t": t}]}
url = "http://192.168.1.193:8551/v1/models/add:predict"
req = requests.post(url, json=tensor)
if req.status_code == 200:
    z = req.json()['predictions'][0]
    print("model_add:", z)
Multi-Model Deployment and Prediction
Creating the Models
We first create several models with TensorFlow: z = x * y + t, z = x * y - t, z = x * y * t, and z = x * y / t, where x and y both equal 2.0, t is the input variable, and z is the output. The full model-creation script (multi_model.py) is as follows:
# -*- coding: utf-8 -*-
import tensorflow as tf
# add model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 3.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/add/add.ckpt')

# subtract model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.subtract(xy, t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 3.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/subtract/subtract.ckpt')

# multiply model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.multiply(xy, t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 3.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/multiply/multiply.ckpt')

# divide model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.divide(xy, t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 3.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/divide/divide.ckpt')
The output is:
result: 7.0
result: 1.0
result: 12.0
result: 1.3333334
Generating the pb Files
The full script (multi_ckpt_2_pb.py) that converts the checkpoints above into pb files is below. (Note: delete the add directory under pb_models before running it, or the export will fail because the directory already exists.)
# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python import saved_model
# change ckpt file to pb file
def model_export(model_name):
    export_path = "pb_models/{}/1".format(model_name)
    graph = tf.Graph()
    saver = tf.train.import_meta_graph("./ckpt_models/{}/{}.ckpt.meta".format(model_name, model_name),
                                       graph=graph)
    with tf.Session(graph=graph) as sess:
        saver.restore(sess, tf.train.latest_checkpoint("./ckpt_models/{}".format(model_name)))
        saved_model.simple_save(session=sess,
                                export_dir=export_path,
                                inputs={"t": graph.get_operation_by_name('t').outputs[0]},
                                outputs={"z": graph.get_operation_by_name('z').outputs[0]})

model_export("add")
model_export("subtract")
model_export("multiply")
model_export("divide")
Running this script produces the following files under the pb_models directory:
$ tree pb_models/
pb_models/
├── add
│ └── 1
│ ├── saved_model.pb
│ └── variables
│ ├── variables.data-00000-of-00001
│ └── variables.index
├── divide
│ └── 1
│ ├── saved_model.pb
│ └── variables
│ ├── variables.data-00000-of-00001
│ └── variables.index
├── models.config
├── multiply
│ └── 1
│ ├── saved_model.pb
│ └── variables
│ ├── variables.data-00000-of-00001
│ └── variables.index
└── subtract
└── 1
├── saved_model.pb
└── variables
├── variables.data-00000-of-00001
└── variables.index
12 directories, 13 files
Model Deployment
tensorflow/serving can deploy multiple models at the same time. The deployment command is:
docker run -t -d --rm -p 8551:8501 -v "absolute_path_to_pb_models/pb_models:/models" tensorflow/serving:1.14.0 --model_config_file=/models/models.config
where absolute_path_to_pb_models is the full path to the pb_models directory. Note the models.config file passed via --model_config_file; it is in this file that we configure the deployment of the individual models:
model_config_list {
  config {
    name: "add"
    base_path: "/models/add"
    model_platform: "tensorflow"
  },
  config {
    name: "subtract"
    base_path: "/models/subtract"
    model_platform: "tensorflow"
  },
  config {
    name: "multiply"
    base_path: "/models/multiply"
    model_platform: "tensorflow"
  },
  config {
    name: "divide"
    base_path: "/models/divide"
    model_platform: "tensorflow"
  }
}
Model Prediction
The Python prediction script (multi_tf_serving.py):
# -*- coding: utf-8 -*-
import requests

# predict via tensorflow/serving's HTTP REST API
def model_predict(model_name):
    t = 4.0
    tensor = {"instances": [{"t": t}]}
    url = "http://192.168.1.193:8551/v1/models/{}:predict".format(model_name)
    req = requests.post(url, json=tensor)
    if req.status_code == 200:
        z = req.json()['predictions'][0]
        print("model_{}:".format(model_name), z)

model_predict("add")
model_predict("subtract")
model_predict("multiply")
model_predict("divide")
The output is:
model_add: 8.0
model_subtract: 0.0
model_multiply: 16.0
model_divide: 1.0
Model Version Control and Prediction
Creating the Models
tensorflow/serving can also serve several versions of the same model. Here we create an add model with three versions: z = x * y + t, z = x * y + 2t, and z = x * y + 3t, where x and y both equal 2.0, t is the input variable, and z is the output. The full model-creation script (version_control_model.py) is as follows:
# -*- coding: utf-8 -*-
import tensorflow as tf
# the first model
g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 1.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/add/add.ckpt')

# the second model
g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, 2*t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 1.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/add/add2.ckpt')

# the third model
g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, 3*t, name="z")
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        result = sess.run(z, feed_dict={t: 1.0})
        print("result: ", result)
        # save the model
        saver = tf.train.Saver()
        saver.save(sess, save_path='./ckpt_models/add/add3.ckpt')
The output is:
result: 5.0
result: 6.0
result: 7.0
Generating the pb Files
The full script (version_control_ckpt_2_pb.py) that converts the checkpoints above into pb files is below. (Note: delete the add directory under pb_models before running it, or the export will fail because the directory already exists.)
# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python import saved_model
# change ckpt file to pb file
def model_export(ckpt_name, model_version):
    export_path = "pb_models/add/{}".format(model_version)
    graph = tf.Graph()
    saver = tf.train.import_meta_graph("./ckpt_models/add/{}.ckpt.meta".format(ckpt_name),
                                       graph=graph)
    with tf.Session(graph=graph) as sess:
        # restore the checkpoint that matches this version's graph
        saver.restore(sess, "./ckpt_models/add/{}.ckpt".format(ckpt_name))
        saved_model.simple_save(session=sess,
                                export_dir=export_path,
                                inputs={"t": graph.get_operation_by_name('t').outputs[0]},
                                outputs={"z": graph.get_operation_by_name('z').outputs[0]})

model_export("add", 1)
model_export("add2", 2)
model_export("add3", 3)
Running this script produces the following files under the pb_models directory:
pb_models/
├── add
│ ├── 1
│ │ ├── saved_model.pb
│ │ └── variables
│ │ ├── variables.data-00000-of-00001
│ │ └── variables.index
│ ├── 2
│ │ ├── saved_model.pb
│ │ └── variables
│ │ ├── variables.data-00000-of-00001
│ │ └── variables.index
│ └── 3
│ ├── saved_model.pb
│ └── variables
│ ├── variables.data-00000-of-00001
│ └── variables.index
└── models.config
7 directories, 10 files
Model Deployment
tensorflow/serving can serve multiple versions of a model simultaneously. The deployment command is:
docker run -t -d --rm -p 8551:8501 -v "absolute_path_to_pb_models/pb_models:/models" tensorflow/serving:1.14.0 --model_config_file=/models/models.config
where absolute_path_to_pb_models is the full path to the pb_models directory. Note the models.config file; it is in this file that we configure the version deployment for the add model:
model_config_list {
  config {
    name: "add"
    base_path: "/models/add"
    model_platform: "tensorflow"
    model_version_policy {
      all {
      }
    }
  }
}
Note the model_version_policy field. Here we set it to all, which serves every version of the model. model_version_policy supports three policies:
- all: serve all versions simultaneously
- latest: serve the latest n versions (this is the default policy, with n defaulting to 1)
- specific: serve only the specified versions; several may be listed at once
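For example, a sketch of a specific policy that serves only versions 1 and 3 (using the same models.config syntax as above; not used in this project) would look like:

```
model_config_list {
  config {
    name: "add"
    base_path: "/models/add"
    model_platform: "tensorflow"
    model_version_policy {
      specific {
        versions: 1
        versions: 3
      }
    }
  }
}
```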
Model Prediction
The Python prediction script (version_control_tf_serving.py):
# -*- coding: utf-8 -*-
import requests
# predict via tensorflow/serving's HTTP REST API
def model_predict(model_version):
    t = 4.0
    tensor = {"instances": [{"t": t}]}
    url = "http://192.168.1.193:8551/v1/models/add/versions/{}:predict".format(model_version)
    req = requests.post(url, json=tensor)
    if req.status_code == 200:
        z = req.json()['predictions'][0]
        print("model_version{}: ".format(model_version), z)

model_predict("1")
model_predict("2")
model_predict("3")
The output is:
model_version1: 8.0
model_version2: 12.0
model_version3: 16.0
Note that different versions of a model are requested at different URLs, following this pattern:
/v1/models/<model name>/versions/<version number>
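As a small illustration of this URL pattern (the helper predict_url below is mine, not part of the project code), the REST prediction URLs used throughout this article can be built as:

```python
def predict_url(host, model_name, version=None):
    # versioned requests insert /versions/<n> before the :predict verb
    base = "http://{}/v1/models/{}".format(host, model_name)
    if version is not None:
        base += "/versions/{}".format(version)
    return base + ":predict"

print(predict_url("192.168.1.193:8551", "add"))
# http://192.168.1.193:8551/v1/models/add:predict
print(predict_url("192.168.1.193:8551", "add", 2))
# http://192.168.1.193:8551/v1/models/add/versions/2:predict
```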
Summary
The full project has been uploaded to Github at https://github.com/percent4/tensorflow_serving_examples .
Thanks for reading. A follow-up article will show how to deploy a BERT model with tensorflow/serving.