tensorflow (6): Model Deployment and Prediction with tensorflow/serving

  In the article tensorflow (5): Converting ckpt to pb and deploying with tensorflow/serving, I used a simple example to show how to convert a ckpt checkpoint to a pb file in TensorFlow and serve it with tensorflow/serving. This post covers single-model deployment, multi-model deployment, model version control, and prediction with tensorflow/serving.
  We will run tensorflow/serving via Docker, so Docker must be installed in your environment. We use tensorflow/serving:1.14.0 throughout, so first pull the image:

docker pull tensorflow/serving:1.14.0

  The project structure used in this post is as follows:

(Figure: project structure)

Single-Model Deployment and Prediction

Creating the Model

  We first create a TensorFlow model z = x*y + t, where x and y equal 2.0, t is the input, and z is the output. The full model-creation script (single_model.py) is as follows:

# -*- coding: utf-8 -*-
import tensorflow as tf

g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 1.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/add/add.ckpt')

The output is:

result:  5.0
Generating the pb File

  The full script (single_ckpt_2_pb.py) to export the above model to a pb file is:

# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python import saved_model

export_path = "pb_models/add/1"

graph = tf.Graph()
saver = tf.train.import_meta_graph("./ckpt_models/add/add.ckpt.meta", graph=graph)
with tf.Session(graph=graph) as sess:
    saver.restore(sess, tf.train.latest_checkpoint("./ckpt_models/add"))
    saved_model.simple_save(session=sess,
                            export_dir=export_path,
                            inputs={"t": graph.get_operation_by_name('t').outputs[0]},
                            outputs={"z": graph.get_operation_by_name('z').outputs[0]})

Running the code generates the following files under pb_models:

$ tree pb_models/
pb_models/
├── add
│   └── 1
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
Model Deployment

  Next we deploy the model with Docker:

docker run -t --rm -p 8551:8501 -v "absolute_path_to_pb_models/pb_models/add:/models/add" -e MODEL_NAME=add tensorflow/serving:1.14.0

where absolute_path_to_pb_models is the full path to the pb_models directory. The output is:

2021-01-06 04:14:32.790794: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config:  model_name: add model_base_path: /models/add
2021-01-06 04:14:32.810015: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
2021-01-06 04:14:32.810054: I tensorflow_serving/model_servers/server_core.cc:561]  (Re-)adding model: add
2021-01-06 04:14:32.910293: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: add version: 1}
2021-01-06 04:14:32.910317: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: add version: 1}
2021-01-06 04:14:32.910327: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: add version: 1}
2021-01-06 04:14:32.910373: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/add/1
2021-01-06 04:14:32.910385: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/add/1
2021-01-06 04:14:32.910573: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-01-06 04:14:32.916524: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-01-06 04:14:32.956639: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2021-01-06 04:14:33.075419: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 165028 microseconds.
2021-01-06 04:14:33.075456: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:103] No warmup data file found at /models/add/1/assets.extra/tf_serving_warmup_requests
2021-01-06 04:14:33.075528: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: add version: 1}
2021-01-06 04:14:33.084877: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2021-01-06 04:14:33.117161: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 239] RAW: Entering the event loop ...

The output above means the model was deployed successfully. Inside the container, port 8500 serves gRPC and port 8501 serves HTTP; here we only use the HTTP port, which is mapped to 8551 on the host.

Model Prediction

  curl command to check the deployment status of the model:

curl http://192.168.1.193:8551/v1/models/add

The output is:

{
 "model_version_status": [
  {
   "version": "1",
   "state": "AVAILABLE",
   "status": {
    "error_code": "OK",
    "error_message": ""
   }
  }
 ]
}
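A deployment script can poll this status endpoint until the model is ready before sending traffic. A minimal sketch that decides readiness from the JSON above (the helper name is ours, not part of tensorflow/serving):

```python
# Hypothetical helper: given the parsed JSON from /v1/models/<name>,
# report whether every deployed version is ready to serve.
def all_versions_available(status_json):
    versions = status_json.get("model_version_status", [])
    # the endpoint lists one entry per deployed version
    return bool(versions) and all(v.get("state") == "AVAILABLE" for v in versions)
```

Combined with requests.get(url).json(), this can gate a script until the state flips from LOADING to AVAILABLE.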

  curl command to fetch the deployment metadata of the model:

curl http://192.168.1.193:8551/v1/models/add/metadata

The output is:

{
    "model_spec": {
        "name": "add",
        "signature_name": "",
        "version": "1"
    },
    "metadata": {
        "signature_def": {
            "signature_def": {
                "serving_default": {
                    "inputs": {
                        "t": {
                            "dtype": "DT_FLOAT",
                            "tensor_shape": {
                                "dim": [],
                                "unknown_rank": true
                            },
                            "name": "t:0"
                        }
                    },
                    "outputs": {
                        "z": {
                            "dtype": "DT_FLOAT",
                            "tensor_shape": {
                                "dim": [],
                                "unknown_rank": true
                            },
                            "name": "z:0"
                        }
                    },
                    "method_name": "tensorflow/serving/predict"
                }
            }
        }
    }
}

From this we can see the model's inputs and outputs.
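The nested metadata JSON can be awkward to navigate; here is a small hypothetical helper (ours, not part of any official client) that pulls the input and output tensor names out of a signature:

```python
# Hypothetical helper: extract input/output tensor names from the
# parsed /v1/models/<name>/metadata response.
def signature_io(metadata_json, signature="serving_default"):
    sig = metadata_json["metadata"]["signature_def"]["signature_def"][signature]
    inputs = {key: spec["name"] for key, spec in sig["inputs"].items()}
    outputs = {key: spec["name"] for key, spec in sig["outputs"].items()}
    return inputs, outputs
```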
  Model prediction with curl:

curl --location --request POST 'http://192.168.1.193:8551/v1/models/add:predict' \
--header 'Content-Type: application/json' \
--data-raw '{
    "instances": [{"t": 2.0}]
}'

The output is:

{
    "predictions": [
        6.0
    ]
}

  To run the prediction from Python instead, the code (single_tf_serving.py) is:

# -*- coding: utf-8 -*-
import requests

# predict via the tensorflow/serving HTTP API
t = 2.0
tensor = {"instances": [{"t": t}]}

url = "http://192.168.1.193:8551/v1/models/add:predict"
req = requests.post(url, json=tensor)
if req.status_code == 200:
    z = req.json()['predictions'][0]
    print("model_add:", z)

Multi-Model Deployment and Prediction

Creating the Models

  We now create several TensorFlow models: z = x*y + t, z = x*y - t, z = x*y*t, and z = x*y/t, where x and y equal 2.0, t is the input, and z is the output. The full model-creation script (multi_model.py) is as follows:

# -*- coding: utf-8 -*-
import tensorflow as tf

# add model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 3.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/add/add.ckpt')

# subtract model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.subtract(xy, t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 3.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/subtract/subtract.ckpt')

# multiply model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.multiply(xy, t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 3.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/multiply/multiply.ckpt')

# divide model
with tf.Graph().as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.divide(xy, t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 3.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/divide/divide.ckpt')

The output is:

result:  7.0
result:  1.0
result:  12.0
result:  1.3333334
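As a plain-Python sanity check, with x = y = 2.0 and t = 3.0 the four graphs reduce to simple arithmetic; the last value differs from 4/3 only by float32 rounding in TensorFlow:

```python
x, y, t = 2.0, 2.0, 3.0
xy = x * y  # 4.0
print(xy + t)  # 7.0
print(xy - t)  # 1.0
print(xy * t)  # 12.0
print(xy / t)  # 1.333... (TensorFlow prints the float32 value 1.3333334)
```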
Generating the pb Files

  The full script (multi_ckpt_2_pb.py) to export the above models to pb files is below. (Note: delete the add directory under pb_models before running it, otherwise simple_save will fail because the export directory already exists.)

# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python import saved_model


# change ckpt file to pb file
def model_export(model_name):
    export_path = "pb_models/{}/1".format(model_name)
    graph = tf.Graph()
    saver = tf.train.import_meta_graph("./ckpt_models/{}/{}.ckpt.meta".format(model_name, model_name),
                                       graph=graph)
    with tf.Session(graph=graph) as sess:
        saver.restore(sess, tf.train.latest_checkpoint("./ckpt_models/{}".format(model_name)))
        saved_model.simple_save(session=sess,
                                export_dir=export_path,
                                inputs={"t": graph.get_operation_by_name('t').outputs[0]},
                                outputs={"z": graph.get_operation_by_name('z').outputs[0]})


model_export("add")
model_export("subtract")
model_export("multiply")
model_export("divide")

Running the code generates the following files under pb_models:

$ tree pb_models/
pb_models/
├── add
│   └── 1
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
├── divide
│   └── 1
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
├── models.config
├── multiply
│   └── 1
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
└── subtract
    └── 1
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index

12 directories, 13 files
Model Deployment

  tensorflow/serving can serve several models at the same time. The deployment command is:

docker run -t -d --rm -p 8551:8501 -v "absolute_path_to_pb_models/pb_models:/models" tensorflow/serving:1.14.0 --model_config_file=/models/models.config

where absolute_path_to_pb_models is the full path to the pb_models directory. Note the --model_config_file flag: it points to models.config, the file in which we configure the deployment information for all the models. Its content is:

model_config_list {
  config {
    name: "add"
    base_path: "/models/add"
    model_platform: "tensorflow"
  },
  config {
    name: "subtract"
    base_path: "/models/subtract"
    model_platform: "tensorflow"
  },
  config {
    name: "multiply"
    base_path: "/models/multiply"
    model_platform: "tensorflow"
  },
  config {
    name: "divide"
    base_path: "/models/divide"
    model_platform: "tensorflow"
  }
}
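Since every entry in models.config follows the same pattern, a deployment script could generate the file instead of maintaining it by hand. A minimal sketch (the function name is ours, and it assumes each model lives under /models/<name> inside the container):

```python
# Hypothetical generator for a models.config with one entry per model.
def render_models_config(model_names):
    blocks = []
    for name in model_names:
        blocks.append('  config {\n'
                      '    name: "%s"\n'
                      '    base_path: "/models/%s"\n'
                      '    model_platform: "tensorflow"\n'
                      '  }' % (name, name))
    return "model_config_list {\n" + ",\n".join(blocks) + "\n}"

print(render_models_config(["add", "subtract", "multiply", "divide"]))
```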
Model Prediction

  The Python prediction script (multi_tf_serving.py) is:

# -*- coding: utf-8 -*-
import requests


# predict via the tensorflow/serving HTTP API
def model_predict(model_name):
    t = 4.0
    tensor = {"instances": [{"t": t}]}

    url = "http://192.168.1.193:8551/v1/models/{}:predict".format(model_name)
    req = requests.post(url, json=tensor)
    if req.status_code == 200:
        z = req.json()['predictions'][0]
        print("model_{}: ".format(model_name), z)


model_predict("add")
model_predict("subtract")
model_predict("multiply")
model_predict("divide")

The output is:

model_add:  8.0
model_subtract:  0.0
model_multiply:  16.0
model_divide:  1.0

Model Version Control and Prediction

Creating the Models

  tensorflow/serving can also serve several versions of the same model. Here we create an add model with three versions: z = x*y + t, z = x*y + 2t, and z = x*y + 3t, where x and y equal 2.0, t is the input, and z is the output. The full model-creation script (version_control_model.py) is as follows:

# -*- coding: utf-8 -*-
import tensorflow as tf

# first model (version 1)
g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 1.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/add/add.ckpt')


# second model (version 2)
g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, 2*t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 1.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/add/add2.ckpt')


# third model (version 3)
g = tf.Graph()
with g.as_default() as g:
    x = tf.Variable(2.0, dtype=tf.float32, name="x")
    y = tf.Variable(2.0, dtype=tf.float32, name="y")
    xy = x * y
    t = tf.placeholder(shape=None, dtype=tf.float32, name="t")
    z = tf.add(xy, 3*t, name="z")


with tf.Session(graph=g) as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(z, feed_dict={t: 1.0})
    print("result: ", result)

    # save the model
    saver = tf.train.Saver()
    saver.save(sess, save_path='./ckpt_models/add/add3.ckpt')

The output is:

result:  5.0
result:  6.0
result:  7.0
Generating the pb Files

  The full script (version_control_ckpt_2_pb.py) to export the three versions to pb files is below. (Note: delete the add directory under pb_models before running it, otherwise simple_save will fail because the export directory already exists.)

# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python import saved_model


# change ckpt file to pb file
def model_export(ckpt_name, export_version):
    export_path = "pb_models/add/{}".format(export_version)
    graph = tf.Graph()
    saver = tf.train.import_meta_graph("./ckpt_models/add/{}.ckpt.meta".format(ckpt_name),
                                       graph=graph)
    with tf.Session(graph=graph) as sess:
        # restore the checkpoint that matches this meta graph
        # (tf.train.latest_checkpoint would always load the last one saved)
        saver.restore(sess, "./ckpt_models/add/{}.ckpt".format(ckpt_name))
        saved_model.simple_save(session=sess,
                                export_dir=export_path,
                                inputs={"t": graph.get_operation_by_name('t').outputs[0]},
                                outputs={"z": graph.get_operation_by_name('z').outputs[0]})


model_export("add", 1)
model_export("add2", 2)
model_export("add3", 3)

Running the code generates the following files under pb_models:

pb_models/
├── add
│   ├── 1
│   │   ├── saved_model.pb
│   │   └── variables
│   │       ├── variables.data-00000-of-00001
│   │       └── variables.index
│   ├── 2
│   │   ├── saved_model.pb
│   │   └── variables
│   │       ├── variables.data-00000-of-00001
│   │       └── variables.index
│   └── 3
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
└── models.config

7 directories, 10 files
Model Deployment

  tensorflow/serving can serve several versions of a model at the same time. The deployment command is:

docker run -t -d --rm -p 8551:8501 -v "absolute_path_to_pb_models/pb_models:/models" tensorflow/serving:1.14.0 --model_config_file=/models/models.config

where absolute_path_to_pb_models is the full path to the pb_models directory. Note the --model_config_file flag again: models.config is where we configure the version deployment of the add model. Its content is:

model_config_list {
  config {
    name: "add"
    base_path: "/models/add"
    model_platform: "tensorflow"
    model_version_policy{
        all{
        }
    }
  }
}

Note the model_version_policy field. Here we choose all, which deploys every version of the model. Three version policies are supported:

  • all: serve all versions simultaneously
  • latest: serve only the latest N versions (the default policy; N defaults to 1)
  • specific: serve explicitly listed versions; several can be configured at once
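For example, to pin the served versions to an explicit list, the config block would use specific (a sketch based on the tensorflow/serving config format; adjust the version numbers to your deployment):

```
model_config_list {
  config {
    name: "add"
    base_path: "/models/add"
    model_platform: "tensorflow"
    model_version_policy {
      specific {
        versions: 2
        versions: 3
      }
    }
  }
}
```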
Model Prediction

  The Python prediction script (version_control_tf_serving.py) is:

# -*- coding: utf-8 -*-
import requests

# predict via the tensorflow/serving HTTP API
def model_predict(model_version):
    t = 4.0
    tensor = {"instances": [{"t": t}]}

    url = "http://192.168.1.193:8551/v1/models/add/versions/{}:predict".format(model_version)
    req = requests.post(url, json=tensor)
    if req.status_code == 200:
        z = req.json()['predictions'][0]
        print("model_version{}: ".format(model_version), z)


model_predict("1")
model_predict("2")
model_predict("3")

The output is:

model_version1:  8.0
model_version2:  12.0
model_version3:  16.0

Note that each model version has its own request URL, following this pattern:

/v1/models/<model name>/versions/<version number>
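This URL scheme is easy to build programmatically; a small hypothetical helper (ours, not part of any official client):

```python
# Hypothetical helper: build the REST predict URL for a model,
# optionally pinned to a specific version.
def predict_url(host, model_name, version=None):
    url = "http://{}/v1/models/{}".format(host, model_name)
    if version is not None:
        url += "/versions/{}".format(version)
    return url + ":predict"

print(predict_url("192.168.1.193:8551", "add"))     # .../v1/models/add:predict
print(predict_url("192.168.1.193:8551", "add", 2))  # .../v1/models/add/versions/2:predict
```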

Summary

  The full project is available on Github at https://github.com/percent4/tensorflow_serving_examples
  Thanks for reading. In a follow-up post I will show how to deploy a BERT model with tensorflow/serving.
