Framework | TensorRT Inference Server

明灵暗尘

Category: Framework

Article link: https://blog.csdn.net/silence_iz/article/details/104358839

Catalogue

Download
- 1.1 download the TRTIS docker image from a domestic (China) registry mirror
- 1.2 download the TRTIS docker image from NVIDIA
Usage
Example
Reference

Download

1.1 download the TRTIS docker image from a domestic (China) registry mirror

More details can be found in the blog post.¹

docker pull registry.cn-beijing.aliyuncs.com/cloudhjc/tensorrtserver:server19.08
# or
docker pull registry.cn-hangzhou.aliyuncs.com/bostenai/tensorrtserver:19.04-py3

1.2 download the TRTIS docker image from NVIDIA

More details can be found in this blog post.²

docker pull nvcr.io/nvidia/tensorrtserver:18.09-py3

Usage

1.1 quick look

nvidia-docker run -it --rm nvcr.io/nvidia/tensorrtserver:x.x-py3

1.2 deploy

nvidia-docker run --rm -p 8000:8000 -p 8001:8001 -v /path/to/examples/models:/models nvcr.io/nvidia/tensorrtserver:x.x-py3 trtserver --model-store=/models
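
Once the container is running, it can help to wait for the server before sending requests. The following is a minimal sketch, assuming the HTTP endpoint is published on port 8000 as in the deploy command and that the server answers GET /api/health/ready with HTTP 200 when it is ready. The helper names is_ready and wait_until_ready are made up for this illustration.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url, timeout=2.0):
    """Return True if GET <base_url>/api/health/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/api/health/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: not ready yet.
        return False


def wait_until_ready(probe, attempts=30, delay=1.0):
    """Call probe() up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

For example, wait_until_ready(lambda: is_ready("http://localhost:8000")) polls a local server for roughly 30 seconds. Passing the probe as a callable keeps the retry loop independent of the network call.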

1.3 inspect

Open http://<host ip>:<port>/api/status (the HTTP port is 8000 in the deploy command above) to inspect the state of each model, such as ready_state. Warning: if ready_state == MODEL_UNAVAILABLE, TRTIS has not loaded the model and will not serve it.
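
The same check can be scripted. This is a rough sketch, assuming the status endpoint returns its default protobuf text output, in which each model version carries a line such as `ready_state: MODEL_READY`. The helper names are hypothetical, and the regular expression only looks at MODEL_* values so that the server-level ready_state is ignored.

```python
import re
import urllib.request

# Matches model-level states only (e.g. MODEL_READY, MODEL_UNAVAILABLE).
READY_STATE_RE = re.compile(r"ready_state:\s*(MODEL_\w+)")


def fetch_status_text(base_url, model_name, timeout=5.0):
    """GET <base_url>/api/status/<model_name> and return the body as text."""
    url = "{}/api/status/{}".format(base_url, model_name)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def ready_states(status_text):
    """Return every model ready_state value found in the status text, in order."""
    return READY_STATE_RE.findall(status_text)


def all_versions_ready(status_text):
    """True only if at least one model state is present and all are MODEL_READY."""
    states = ready_states(status_text)
    return bool(states) and all(s == "MODEL_READY" for s in states)
```

A typical call would be all_versions_ready(fetch_status_text("http://localhost:8000", "yolov3_608_trt")), where the model name is the one used in the example section of this post.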

Example

1.1 build a yolov3 model

Warning
- if you get an error when running yolov3_to_onnx.py, try downgrading onnx to 1.4.1, like this:
```
pip uninstall onnx
pip install onnx==1.4.1
```
- there are two ways to build the YOLOv3 TensorRT engine: build it inside a docker container, or build it in a local TensorRT install under /usr/local/TensorRT-5.1.5.0/samples/python/yolov3_onnx. The latter is easier.

1.1.1 build in a docker image

Follow the guide to build a YOLOv3 TensorRT engine.
Start a TensorRT container:

# nvidia-docker exposes the GPU, which trtexec needs in order to build the engine
nvidia-docker run \
       -v $PWD/trt:/workspace/trt \
       --name trt \
       -ti nvcr.io/nvidia/tensorrt:19.10-py2 /bin/bash

Build the YOLOv3 model:

# inside the trt container
export TRT_PATH=/usr/src/tensorrt
cd $TRT_PATH/samples/python/yolov3_onnx/

pip install wget
pip install onnx==1.5.0

# automatically downloads the model and converts it to ONNX
python yolov3_to_onnx.py

# compile trtexec, then use it to build the engine
cd $TRT_PATH/samples/trtexec
make
cd ../../
./bin/trtexec --onnx=$TRT_PATH/samples/python/yolov3_onnx/yolov3.onnx --saveEngine=$TRT_PATH/model.plan
# Average over 10 runs is 30.8623 ms (host walltime is 31.4395 ms, 99% percentile time is 31.9949)

Copy the model:

# on the host
mkdir -p $model_path/yolov3_608_trt/1
docker cp trt:/usr/src/tensorrt/model.plan $model_path/yolov3_608_trt/1
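
A quick way to confirm the resulting layout is a small script like the one below. It is only a sketch: it assumes the server looks for a config.pbtxt (written in section 1.1.3) next to numeric version directories, each holding a file named model.plan, which is the default file name for the tensorrt_plan platform. The function name check_model_dir is invented for this example.

```python
import os


def check_model_dir(model_dir):
    """Return a list of problems found in one model directory (empty if none)."""
    problems = []
    if not os.path.isfile(os.path.join(model_dir, "config.pbtxt")):
        problems.append("missing config.pbtxt")
    # Version directories are named with plain integers, e.g. "1".
    versions = sorted(
        d for d in os.listdir(model_dir)
        if d.isdigit() and os.path.isdir(os.path.join(model_dir, d))
    )
    if not versions:
        problems.append("no numeric version directory")
    for version in versions:
        if not os.path.isfile(os.path.join(model_dir, version, "model.plan")):
            problems.append("version {} has no model.plan".format(version))
    return problems
```

Calling check_model_dir on $model_path/yolov3_608_trt before starting the server gives an early hint about why a model might later show up as MODEL_UNAVAILABLE.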

1.1.2 build in TensorRT 5.1.5.0 (python3)

# TRT_PATH is the TensorRT install directory, e.g. /usr/local/TensorRT-5.1.5.0
cd $TRT_PATH/samples/python/yolov3_onnx/
pip install wget
pip install onnx==1.4.1
python yolov3_to_onnx.py

# compile trtexec, then use it to build the engine
cd $TRT_PATH/samples/trtexec
make
cd ../../
./bin/trtexec --onnx=$TRT_PATH/samples/python/yolov3_onnx/yolov3.onnx --saveEngine=$TRT_PATH/model.plan
# Average over 10 runs is 30.8623 ms (host walltime is 31.4395 ms, 99% percentile time is 31.9949)

1.1.3 write a configuration file

# $model_path/yolov3_608_trt/config.pbtxt
name: "yolov3_608_trt"
platform: "tensorrt_plan"
max_batch_size: 1
dynamic_batching {
  preferred_batch_size: [1]
  max_queue_delay_microseconds: 100
}
input [
  {
    name: "000_net"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 608, 608 ]
  }
]
output [
  {
    name: "082_convolutional"
    data_type: TYPE_FP32
    dims: [ 255, 19, 19 ]
  },
  {
    name: "094_convolutional"
    data_type: TYPE_FP32
    dims: [ 255, 38, 38 ]
  },
  {
    name: "106_convolutional"
    data_type: TYPE_FP32
    dims: [ 255, 76, 76 ]
  }
]
instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]
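
The three output dims in this config can be cross-checked with a little arithmetic. The sketch below assumes the standard YOLOv3 layout trained on COCO: 80 classes, 3 anchors per scale, and detection heads at strides 32, 16 and 8. Those defaults are assumptions about the model that yolov3_to_onnx.py converts, not values read from the engine itself.

```python
def yolov3_output_dims(input_size=608, num_classes=80, anchors_per_scale=3,
                       strides=(32, 16, 8)):
    """Return [channels, height, width] for each YOLOv3 detection head."""
    # Each anchor predicts 4 box coordinates, 1 objectness score and the class scores.
    channels = anchors_per_scale * (num_classes + 5)
    return [[channels, input_size // s, input_size // s] for s in strides]


for dims in yolov3_output_dims():
    print(dims)
```

With the defaults, the channel count is 3 * (80 + 5) = 255 and the grid sizes are 608 / 32 = 19, 608 / 16 = 38 and 608 / 8 = 76, which matches the dims listed for 082_convolutional, 094_convolutional and 106_convolutional.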

1.2 build a python client

reference

# download client python library
# https://github.com/NVIDIA/tensorrt-inference-server/releases
# for example:
# wget https://github.com/NVIDIA/tensorrt-inference-server/releases/download/v1.7.0/v1.7.0_ubuntu1604.clients.tar.gz;
# tar xvzf v1.7.0_ubuntu1604.clients.tar.gz;

apt-get install curl libcurl4-openssl-dev
apt-get install python python-pip
pip install --user --upgrade python/tensorrtserver*.whl numpy pillow
python image_client.py -m yolov3_608_trt ~/mayday.jpg