1. Image pull test
docker pull tensorflow/serving
git clone https://github.com/tensorflow/serving
TESTDATA="$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata"
docker run -t --rm -p 8501:8501 \
-v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" \
-e MODEL_NAME=half_plus_two \
tensorflow/serving &
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
2. Creating your own tf-serving container
Expose both the HTTP and gRPC ports at the same time: here 18501 maps to the HTTP port and 18500 to the gRPC port. TF Serving supports hot model updates: just drop a new numbered version folder containing saved_model.pb and the variables directory into the source folder and the server picks it up automatically, which is very convenient. Note that the bind-mount target must be /models/${MODEL_NAME}.
For how to generate the saved_model.pb and variables files, see my other article on generating the various TensorFlow model file formats.
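The on-disk layout that TF Serving polls can be sketched as below (the model name `porn` and the version numbers are illustrative); each numeric subfolder is one version, and the server hot-loads the highest number it finds:

```python
import tempfile
from pathlib import Path

# Build a throwaway copy of the layout TF Serving watches:
#   <model_dir>/<version>/saved_model.pb
#   <model_dir>/<version>/variables/
root = Path(tempfile.mkdtemp()) / "porn"
for version in ("1", "2"):
    (root / version / "variables").mkdir(parents=True)
    (root / version / "saved_model.pb").touch()

versions = sorted(p.name for p in root.iterdir())
print(versions)  # ['1', '2'] -- the server would serve version 2
```

Dropping a new folder such as `3/` into the mounted source directory triggers a reload without restarting the container.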
docker run -p 18501:8501 -p 18500:8500 \
--mount type=bind,\
source=/home/models/,\
target=/models/porn \
-e MODEL_NAME=porn -t tensorflow/serving &
Check the model status; this lists every available version of the model:
curl -v http://192.168.42.89:18501/v1/models/porn
View the model's metadata:
curl http://192.168.42.89:18501/v1/models/porn/metadata
3. Sending requests
HTTP request
import requests, os
import concurrent
import numpy as np
from PIL import Image
import cv2

api = "http://192.168.42.89:18501/v1/models/porn:predict"

# preprocessing: load and resize the image here
def cvload(image):
    pass

def predict(images):
    images = cvload(images)
    data = {"signature_name": "predict_images", "instances": images.tolist()}
    res = requests.post(api, json=data).json()
    return res['predictions'][0][1]

if __name__ == '__main__':
    path = '1.jpg'
    predict(path)

# to pin a specific model version, use a versioned URL instead:
# SERVER_URL = 'http://localhost:8501/v1/models/porn/versions/100001:predict'
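The body that predict() posts is plain JSON. A minimal sketch of its shape, using small hand-written lists in place of the real images.tolist() output (the signature name matches the code above):

```python
import json

# Two tiny fake "images" standing in for images.tolist(); real inputs would
# be nested lists matching the model's expected shape, e.g. (224, 224, 3).
batch = [[0.1, 0.2], [0.3, 0.4]]
data = {"signature_name": "predict_images", "instances": batch}
body = json.dumps(data)
print(body)
```

Each entry of "instances" is one example, so batching several images into a single request is just a matter of appending to the list.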
gRPC request
import grpc, time
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
import requests, os
import concurrent
import numpy as np
from PIL import Image
import cv2
import tensorflow

# string comparison like __version__ > '2.0' is fragile; check the major version
if tensorflow.__version__.startswith('2'):
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
    tf.disable_eager_execution()
else:
    import tensorflow as tf

# preprocessing: load and resize the image here
def cvload(image):
    pass

class ResNet50:
    def __init__(self):
        server = '10.192.58.168:18500'
        channel = grpc.insecure_channel(server)
        self.stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
        self.request = predict_pb2.PredictRequest()
        self.request.model_spec.name = 'porn'
        self.request.model_spec.signature_name = 'predict_images'

    def predict(self, images):
        images = cvload(images)
        self.request.inputs['input'].CopyFrom(
            tf.make_tensor_proto(images, shape=images.shape))
        # the second positional argument is the timeout in seconds
        result = self.stub.Predict(self.request, 5.0)
        res = result.outputs['conf'].float_val
        return res

if __name__ == '__main__':
    model = ResNet50()
    model.predict('1.jpg')
Personal takeaways:
gRPC is much faster than HTTP, but Python's gRPC client does not survive forking well, so it only works reliably with multithreading. HTTP works fine across multiple processes, so for high-concurrency workloads HTTP is the safer choice.
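The multithreading option mentioned above can be sketched with concurrent.futures; predict here is a dummy stand-in for either client's predict call:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for ResNet50.predict / the HTTP predict(): returns a fake score.
def predict(path):
    return len(path)

paths = ["1.jpg", "2.jpg", "3.jpg"]
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(predict, paths))
print(scores)  # [5, 5, 5]
```

Swapping the stand-in for the real predict lets several in-flight requests share one channel or HTTP session, which is where most of the throughput gain comes from.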
References:
https://github.com/tensorflow/serving