Installing TensorRT and running inference on a TensorFlow model, with a workaround for Python 3.6


Environment

  • Ubuntu 18.04
  • tensorflow-gpu 1.12
  • CUDA 9.0
  • cuDNN 7.1
  • Python 3.6
  • Assorted Python packages; pip install whatever turns out to be missing

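Installing TensorRT 4.0

Download TensorRT 4.0 from the NVIDIA website (the tar archive built for CUDA 9.0 and Ubuntu 16).
Extract it under /usr/local and add the extracted directory (its lib subdirectory holds the shared libraries) to the LD_LIBRARY_PATH environment variable.
Go into the extracted TensorRT directory, cd python/ to install the Python API, and sudo pip install the package you need there.
Go into the other directories of the TensorRT tree and install the Python APIs of the remaining tools in the same way.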
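
A quick way to confirm that the library directory really ended up on LD_LIBRARY_PATH is a small check like the one below. This is only a sketch: lib_dir_on_path is a hypothetical helper name, and the path in the usage line is an assumed extraction directory that you should replace with your own.

```python
import os


def lib_dir_on_path(lib_dir, env=None):
    """Return True if lib_dir is one of the entries in LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a colon-separated list of directories on Linux.
    entries = env.get("LD_LIBRARY_PATH", "").split(":")
    target = os.path.normpath(lib_dir)
    return any(os.path.normpath(entry) == target for entry in entries if entry)


if __name__ == "__main__":
    # Assumed location; substitute the directory you actually extracted to.
    print(lib_dir_on_path("/usr/local/TensorRT-4.0.1.6/lib"))
```

If this prints False in a fresh shell, the export line probably never made it into your shell profile.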

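For the Python API, the Python 3 version NVIDIA recommends is 3.5. If, like me, you are on 3.6, a plain sudo pip install fails. (… I searched blogs inside and outside China and found no solution; renaming the wheel only treats the symptom: it installs but cannot be used.) The workaround I ended up with: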

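  1. Rename the .whl package, changing 3.5 in its file name to 3.6 (keep a backup), and install it.
  2. Install Sublime Text.
  3. Go to the directory where the library was installed, find the .so file, right-click it, open Properties, and note the file size.
  4. Open the .so file with sudo vim, search with /python3.5 and replace each hit with python3.6, using N and n to jump to the previous and next match.
  5. Once everything is replaced, open the file with the subl command, delete the trailing a0 (or 0a, I forget; either way, delete the last two characters), and save.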
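
Steps 3 to 5 amount to a same-length byte substitution: python3.5 and python3.6 have equal length, so the file size should not change, and the byte removed in step 5 is most likely the newline (0x0a) that vim appends when saving. A minimal sketch that does the same substitution without an editor follows. patch_binary is a hypothetical helper, not part of TensorRT, and whether the patched library then loads correctly is still something you have to verify yourself.

```python
import shutil


def patch_binary(path, old=b"python3.5", new=b"python3.6", backup=True):
    """Replace every occurrence of `old` with `new` in a binary file.

    Returns the number of replacements. The two byte strings must have the
    same length so that no offset inside the file shifts.
    """
    if len(old) != len(new):
        raise ValueError("old and new must have the same length")
    with open(path, "rb") as f:
        data = f.read()
    count = data.count(old)
    if count:
        if backup:
            # Keep the untouched original next to the patched file.
            shutil.copy2(path, path + ".bak")
        with open(path, "wb") as f:
            f.write(data.replace(old, new))
    return count
```

Because the file is read and written in binary mode, nothing is appended, so the size noted in step 3 should match afterwards. Run it with sudo if the .so lives under a system directory.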

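If you would rather not patch it yourself, you can try the file I already patched (download link) and simply drop it in place; the TensorRT version is 4.0.1.6.
Go to the tf_to_trt example directory under the install location (mine is /usr/local/lib/python3.6/dist-packages/tensorrt/examples/tf_to_trt).
Run tf_to_trt.py; if it finishes without errors, the installation is fine.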
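
Before running the example, it can help to check that the packages the script imports are at least visible to the interpreter. The sketch below uses only the standard library; missing_modules is a hypothetical helper name. It only confirms that a module can be located, not that its .so actually loads under Python 3.6.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The modules imported by the tf_to_trt script in this post.
    print(missing_modules(["tensorrt", "uff", "pycuda", "numpy"]))
```

An empty list means every module was found; anything listed still needs to be installed.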

tf_to_trt

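In NVIDIA's example, lenet5 has already completed training, yet the tf_to_trt module trains the model again before running inference, which is certainly not how we want to use it.
The lenet5 module takes care of training the network and saving it in UFF format, so the tf_to_trt module only needs to load the saved model and run inference.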
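
One way to express that preference in code is to retrain only when no saved model exists. The sketch below is an illustration under assumptions, not part of the NVIDIA sample: choose_engine_builder is a hypothetical helper, and the two callables stand in for a "load from UFF file" function and a "train, then build" function.

```python
import os


def choose_engine_builder(uff_path, load_from_file, train_and_build):
    """Pick the builder to call: load the saved UFF model if it exists,
    otherwise fall back to training from scratch."""
    return load_from_file if os.path.isfile(uff_path) else train_and_build


if __name__ == "__main__":
    # Stand-in callables; in the real script these would build a TensorRT engine.
    builder = choose_engine_builder(
        "trained_lenet5.uff",
        load_from_file=lambda: "engine built from file",
        train_and_build=lambda: "engine built after training",
    )
    print(builder())
```

The script in this post simply hard-codes the file-based path, which is fine once trained_lenet5.uff has been produced.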

#
# Copyright 1993-2018 NVIDIA Corporation.  All rights reserved.
#
# NOTICE TO LICENSEE:
#
# This source code and/or documentation ("Licensed Deliverables") are
# subject to NVIDIA intellectual property rights under U.S. and
# international Copyright laws.
#
# These Licensed Deliverables contained herein is PROPRIETARY and
# CONFIDENTIAL to NVIDIA and is being provided under the terms and
# conditions of a form of NVIDIA software license agreement by and
# between NVIDIA and Licensee ("License Agreement") or electronically
# accepted by Licensee.  Notwithstanding any terms or conditions to
# the contrary in the License Agreement, reproduction or disclosure
# of the Licensed Deliverables to any third party without the express
# written consent of NVIDIA is prohibited.
#
# NOTWITHSTANDING ANY TERMS OR CONDITIONS TO THE CONTRARY IN THE
# LICENSE AGREEMENT, NVIDIA MAKES NO REPRESENTATION ABOUT THE
# SUITABILITY OF THESE LICENSED DELIVERABLES FOR ANY PURPOSE.  IT IS
# PROVIDED "AS IS" WITHOUT EXPRESS OR IMPLIED WARRANTY OF ANY KIND.
# NVIDIA DISCLAIMS ALL WARRANTIES WITH REGARD TO THESE LICENSED
# DELIVERABLES, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY,
# NONINFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE.
# NOTWITHSTANDING ANY TERMS OR CONDITIONS TO THE CONTRARY IN THE
# LICENSE AGREEMENT, IN NO EVENT SHALL NVIDIA BE LIABLE FOR ANY
# SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, OR ANY
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
# WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
# ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
# OF THESE LICENSED DELIVERABLES.
#
# U.S. Government End Users.  These Licensed Deliverables are a
# "commercial item" as that term is defined at 48 C.F.R. 2.101 (OCT
# 1995), consisting of "commercial computer software" and "commercial
# computer software documentation" as such terms are used in 48
# C.F.R. 12.212 (SEPT 1995) and is provided to the U.S. Government
# only as a commercial end item.  Consistent with 48 C.F.R.12.212 and
# 48 C.F.R. 227.7202-1 through 227.7202-4 (JUNE 1995), all
# U.S. Government End Users acquire the Licensed Deliverables with
# only those rights set forth herein.
#
# Any use of the Licensed Deliverables in individual and commercial
# software must include, in the user documentation and internal
# comments to the code, the above Disclaimer and U.S. Government End
# Users Notice.
#

import os
import numpy as np

import pycuda.driver as cuda
import pycuda.autoinit

# Importing pycuda.autoinit initializes CUDA.

import uff

import tensorrt as trt
from tensorrt.parsers import uffparser

import lenet5

# TensorRT API calls need a logger to receive TensorRT's output.
G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.INFO)

# Hyperparameters: GPU memory reserved for TensorRT, the input shape,
# the output size, and the number of results printed to the screen.
MAX_WORKSPACE = 1 << 20

INPUT_W = 28
INPUT_H = 28
OUTPUT_SIZE = 10

MAX_BATCHSIZE = 1
ITERATIONS = 10

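def infer(context, input_img, batch_size):
    '''
    Run inference.
    :param context: execution context
    :param input_img: input
    :param batch_size: batch size of the input
    :return: model output
    '''

    # Get the inference engine from the execution context.
    engine = context.get_engine()

    # Binding 1 is the output; read its dimensions in CHW order.
    dims = engine.get_binding_dimensions(1).to_DimsCHW()

    # Number of output elements.
    elt_count = dims.C() * dims.H() * dims.W() * batch_size

    # Convert the input to float32.
    input_img = input_img.astype(np.float32)
    # Allocate page-locked host memory for the output.
    output = cuda.pagelocked_empty(elt_count, dtype=np.float32)

    # Allocate device memory for input and output.
    d_input = cuda.mem_alloc(batch_size * input_img.size * input_img.dtype.itemsize)
    d_output = cuda.mem_alloc(batch_size * output.size * output.dtype.itemsize)

    # Bind the input and output buffers.
    bindings = [int(d_input), int(d_output)]

    # Create the CUDA stream that queues the operations.
    stream = cuda.Stream()

    # Copy from host memory to device memory.
    cuda.memcpy_htod_async(d_input, input_img, stream)
    # Run inference.
    context.enqueue(batch_size, bindings, stream.handle, None)
    # Copy from device memory back to host memory.
    cuda.memcpy_dtoh_async(output, d_output, stream)
    # The calls above are asynchronous; wait for them before reading output.
    stream.synchronize()

    return output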

def readUffToEngine():
    # Create the model parser.
    parser = uffparser.create_uff_parser()
    # Tell the parser which nodes are the input and the output.
    parser.register_input("Placeholder", (1, 28, 28), 0)
    parser.register_output("fc2/Relu")
    # Parse the UFF model from file and let TensorRT optimize it;
    # the optimization steps are shown in the log output.
    engine = trt.utils.uff_file_to_trt_engine(
        G_LOGGER,
        'trained_lenet5.uff',
        parser,
        MAX_BATCHSIZE,
        MAX_WORKSPACE
    )
    # The parser is no longer needed.
    parser.destroy()
    return engine

def trainToEngine():
    # Train the model and receive the result.
    tf_model = lenet5.learn()

    # Same parser setup as in readUffToEngine, but on the in-memory model.
    uff_model = uff.from_tensorflow(tf_model, ["fc2/Relu"])
    parser = uffparser.create_uff_parser()
    parser.register_input("Placeholder", (1, 28, 28), 0)
    parser.register_output("fc2/Relu")

    engine = trt.utils.uff_to_trt_engine(
        G_LOGGER,
        uff_model,
        parser,
        MAX_BATCHSIZE,
        MAX_WORKSPACE
    )
    parser.destroy()
    return engine

def main():
    # Build the engine from the saved UFF file.
    engine = readUffToEngine()
    # Alternative: train the model and build the engine from it.
    # engine = trainToEngine()

    # Get the engine's execution context.
    context = engine.create_execution_context()

    print("\n| TEST CASE | PREDICTION |")
    for i in range(ITERATIONS):
        img, label = lenet5.get_testcase()
        img = img[0]
        label = label[0]
        out = infer(context, img, 1)
        print("|-----------|------------|")
        print("|     " + str(label) + "     |      " + str(np.argmax(out)) + "     |")

if __name__ == "__main__":
    main()
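
The buffer sizes passed to cuda.mem_alloc in infer follow from simple arithmetic: element count times bytes per element. The sketch below reproduces that arithmetic without a GPU. buffer_nbytes is a hypothetical helper, and the output dimensions (10, 1, 1) are an assumption about how the 10-class output appears in CHW form.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # size of one np.float32 element


def buffer_nbytes(dims, batch_size=1, itemsize=FLOAT32_BYTES):
    """Bytes needed for a buffer holding batch_size tensors of shape dims."""
    elt_count = reduce(mul, dims, 1) * batch_size
    return elt_count * itemsize


if __name__ == "__main__":
    print(buffer_nbytes((1, 28, 28)))  # input image: 784 floats -> 3136 bytes
    print(buffer_nbytes((10, 1, 1)))   # assumed output shape: 10 floats -> 40 bytes
```

Note that in infer, elt_count already includes batch_size, so the extra batch_size factor in the d_output allocation only over-allocates; with a batch size of 1 it makes no difference.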

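Output:

| TEST CASE | PREDICTION |
|-----------|------------|
|     8     |      8     |
|-----------|------------|
|     7     |      7     |
|-----------|------------|
|     0     |      0     |
|-----------|------------|
|     3     |      3     |
|-----------|------------|
|     2     |      2     |
|-----------|------------|
|     6     |      6     |
|-----------|------------|
|     2     |      2     |
|-----------|------------|
|     6     |      6     |
|-----------|------------|
|     5     |      5     |
|-----------|------------|
|     8     |      8     |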
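
To turn a run like this into a single number, the labels and predictions can be compared directly. The sketch below uses the ten pairs from the output above; accuracy is a hypothetical helper, and a different run may well produce different test cases.

```python
def accuracy(labels, predictions):
    """Fraction of positions where the prediction equals the label."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equal in length")
    correct = sum(1 for l, p in zip(labels, predictions) if l == p)
    return correct / len(labels)


# The ten test cases and predictions from the output above.
labels = [8, 7, 0, 3, 2, 6, 2, 6, 5, 8]
predictions = [8, 7, 0, 3, 2, 6, 2, 6, 5, 8]

if __name__ == "__main__":
    print(accuracy(labels, predictions))  # 1.0
```

Ten samples are enough for a smoke test of the installation, not for judging the model.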