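Introduction to the ONNX Format
ONNX (Open Neural Network Exchange) is a model interchange format shared across frameworks. It serializes models in the protobuf binary format, which keeps model files compact and efficient to transfer. In a typical task we convert a PyTorch or TensorFlow model to ONNX (the ONNX model usually serves as an intermediate step on the way to deployment), and then convert that ONNX model into whatever format the target deployment framework requires. In this sense ONNX acts as a translator between frameworks.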
A few typical pipelines:
- PyTorch -> ONNX -> TensorRT
- PyTorch -> ONNX -> TVM
- TensorFlow -> ONNX -> ncnn
- PyTorch -> ONNX -> TensorFlow
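What is Protobuf
Since ONNX is a file format, we need rules for reading and writing it. ONNX uses protobuf, a protocol for serializing structured data, to store the neural network's weights.
If you have used Caffe or Caffe2, Protobuf is probably already familiar, because Caffe models are stored using Protobuf as well.
Briefly, Protobuf is a platform-neutral, language-neutral, extensible, lightweight and efficient protocol for serializing structured data. It can be used for both network communication and data storage. We can define our own data schema with protobuf and then read or write it from a variety of languages; C++ is the one most commonly used.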
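To make "lightweight and efficient" a little more concrete, here is a minimal sketch of how protobuf's wire format encodes a single integer field. It is written in plain Python for readability and is not the official protobuf library. The helper names (`encode_varint`, `decode_varint`, `encode_varint_field`) are made up for this illustration, and field number 1 with value 150 is just an arbitrary example.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one integer field: a tag (field number + wire type 0) followed by the value."""
    wire_type_varint = 0
    tag = (field_number << 3) | wire_type_varint
    return encode_varint(tag) + encode_varint(value)


message = encode_varint_field(1, 150)
print(message.hex())  # 089601
```

The whole field takes three bytes: one for the tag and two for the value. Small numbers cost fewer bytes than large ones, which is part of why protobuf files stay compact.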
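PyTorch to ONNX
This section exports a model to the ONNX format. The exporter runs your model once to record a trace of its execution and then exports that trace. Because of this, the exporter does not yet support dynamic models (for example, RNNs).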
import torch
import torchvision
dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
model = torchvision.models.alexnet(pretrained=True).cuda()
# Providing input and output names sets the display names for values
# within the model's graph. Setting these does not change the semantics
# of the graph; it is only for readability.
#
# The inputs to the network consist of the flat list of inputs (i.e.
# the values you would pass to the forward() method) followed by the
# flat list of parameters. You can partially specify names, i.e. provide
# a list here shorter than the number of inputs to the model, and we will
# only set that subset of names, starting from the beginning.
input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
output_names = [ "output1" ]
torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True, input_names=input_names, output_names=output_names)
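To illustrate what "running the model once and recording its execution trace" implies, here is a toy tracer in plain Python. It is only a conceptual sketch and makes no claim about how PyTorch's tracer is actually implemented. The names `Traced`, `trace`, and `replay` are invented for this example.

```python
class Traced:
    """Stand-in for a tensor: every addition applied to it is appended to a shared tape."""

    def __init__(self, name, tape):
        self.name = name
        self.tape = tape

    def __add__(self, other):
        result = Traced(f"v{len(self.tape)}", self.tape)
        self.tape.append(("add", self.name, other))
        return result


def forward(x, n):
    # Ordinary Python control flow: the tracer never sees the loop itself,
    # only the additions that happen while it runs.
    for i in range(n):
        x = x + i
    return x


def trace(fn, n):
    """Run fn once with a traced input and return the recorded operations."""
    tape = []
    fn(Traced("x", tape), n)
    return tape


def replay(tape, x_value):
    """Execute the recorded operations on a concrete input value."""
    env = {"x": x_value}
    last = "x"
    for idx, (_op, src, const) in enumerate(tape):
        last = f"v{idx}"
        env[last] = env[src] + const
    return env[last]


tape = trace(forward, 5)
print(len(tape))        # 5
print(replay(tape, 1))  # 11
```

The recorded tape contains five separate additions and no loop. Replaying it always performs exactly those five steps, whatever loop count you might want later. This is the limitation a purely trace-based export has with data-dependent control flow.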
Exporting to ONNX with mixed tracing and scripting
import torch
# Trace-based only
class LoopModel(torch.nn.Module):
    def forward(self, x, y):
        for i in range(y):
            x = x + i
        return x
model = LoopModel()
dummy_input = torch.ones(2, 3, dtype=torch.long)
loop_count = torch.tensor(5, dtype=torch.long)
torch.onnx.export(model, (dummy_input, loop_count), 'loop.onnx', verbose=True)
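With tracing alone, the loop above is unrolled using the loop count seen during export. Below is a sketch of the scripting half of the "mixed" approach: the loop is moved into a function compiled with `torch.jit.script`, so its bound is read from `y` at run time. The names `loop`, `LoopModel2`, `export_loop_model`, and the file name `loop_scripted.onnx` are illustrative. The export helper assumes the TorchScript-based exporter and is defined but not called here.

```python
import torch


@torch.jit.script
def loop(x, y):
    # Compiled by TorchScript: the loop bound comes from y when the function runs,
    # instead of being fixed at the value seen during tracing.
    for i in range(int(y)):
        x = x + i
    return x


class LoopModel2(torch.nn.Module):
    def forward(self, x, y):
        # The module itself can still be traced; only the loop is scripted.
        return loop(x, y)


def export_loop_model(path: str = "loop_scripted.onnx") -> None:
    """Export LoopModel2; assumes the TorchScript-based torch.onnx.export path."""
    model = LoopModel2()
    dummy_input = torch.ones(2, 3, dtype=torch.long)
    loop_count = torch.tensor(5, dtype=torch.long)
    torch.onnx.export(model, (dummy_input, loop_count), path, verbose=True)


model = LoopModel2()
x = torch.ones(2, 3, dtype=torch.long)
print(model(x, torch.tensor(5, dtype=torch.long)))  # every element is 1 + (0+1+2+3+4) = 11
print(model(x, torch.tensor(3, dtype=torch.long)))  # every element is 1 + (0+1+2) = 4
```

Calling `export_loop_model()` would write the ONNX file. With this approach the exported graph is expected to keep a loop whose trip count depends on the second input, though this is worth confirming on your own PyTorch version.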
Running the exported model with Caffe2
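import caffe2.python.onnx.backend as backend
import numpy as np
import onnx
# Load the exported file first: prepare() expects an ONNX model,
# not the PyTorch module used during export.
model = onnx.load("alexnet.onnx")
rep = backend.prepare(model, device="CUDA:0") # or "CPU"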
# For the Caffe2 backend:
# rep.predict_net is the Caffe2 protobuf for the network
# rep.workspace is the Caffe2 workspace for the network
# (see the class caffe2.python.onnx.backend.Workspace)
outputs = rep.run(np.random.randn(10, 3, 224, 224).astype(np.float32))
# To run networks with more than one input, pass a tuple
# rather than a single numpy ndarray.
print(outputs[0])
Running the exported model with ONNX Runtime
# ...continuing from above
import onnxruntime as ort
ort_session = ort.InferenceSession('alexnet.onnx')
outputs = ort_session.run(None, {'actual_input_1': np.random.randn(10, 3, 224, 224).astype(np.float32)})
print(outputs[0])
Running the exported model with TensorFlow
import sys
import onnx
from onnx_tf.backend import prepare
# Requires tensorflow >= 2.0
# Conversion uses onnx-tensorflow: https://github.com/onnx/onnx-tensorflow
def transform_to_tensorflow(onnx_input_path, pb_output_path):
    onnx_model = onnx.load(onnx_input_path)  # load onnx model
    tf_exp = prepare(onnx_model)  # prepare tf representation
    tf_exp.export_graph(pb_output_path)  # export the model