Passing specific data types to onnxruntime, such as fp16 and int8

znsoft

Article link: https://blog.csdn.net/znsoft/article/details/114583048

typedef enum ONNXTensorElementDataType {
  ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED,
  ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,   // maps to c type float
  ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8,   // maps to c type uint8_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8,    // maps to c type int8_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16,  // maps to c type uint16_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_INT16,   // maps to c type int16_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32,   // maps to c type int32_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64,   // maps to c type int64_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_STRING,  // maps to c++ type std::string
  ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL,
  ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16,
  ONNX_TENSOR_ELEMENT_DATA_TYPE_DOUBLE,      // maps to c type double
  ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT32,      // maps to c type uint32_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT64,      // maps to c type uint64_t
  ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX64,   // complex with float32 real and imaginary components
  ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX128,  // complex with float64 real and imaginary components
  ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16     // Non-IEEE floating-point format based on IEEE754 single-precision
} ONNXTensorElementDataType;

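You have to convert the data to the target element type yourself, store it in a vector, and pass that buffer in.

Note that the non-template overload of CreateTensor is used here: it takes the buffer size in bytes, the shape length, and an explicit element type.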

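// One byte per element; every mask position is set to 1 (true).
std::vector<int8_t> mask_tensor_values(mask_tensor_size, 1);
auto mask_memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
// Non-template overload: pass a byte count, the shape length, and the element type.
// The type tag must match what the model input expects
// (for a bool input, use ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL instead).
Ort::Value mask_tensor = Ort::Value::CreateTensor(
    mask_memory_info, mask_tensor_values.data(),
    mask_tensor_values.size() * sizeof(int8_t),
    mask_node_dims.data(), mask_node_dims.size(),
    ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8);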
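
The "convert it yourself" step can be sketched in plain C++ for the int8 case. This is only an illustration under my own assumptions: `to_int8_saturated` is a hypothetical helper (not part of onnxruntime), and it simply rounds and clamps. A real quantized model will usually also need a scale and zero point that depend on the model, which this sketch does not cover.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an onnxruntime API): round each float with the
// current rounding mode (round-to-nearest by default) and clamp the result
// to the int8_t range [-128, 127] before narrowing.
std::vector<int8_t> to_int8_saturated(const std::vector<float>& src) {
    std::vector<int8_t> dst;
    dst.reserve(src.size());
    for (float v : src) {
        float r = std::nearbyint(v);
        r = std::min(127.0f, std::max(-128.0f, r));
        dst.push_back(static_cast<int8_t>(r));
    }
    return dst;
}
```

The resulting vector can then be handed to the non-template CreateTensor together with `dst.size() * sizeof(int8_t)` as the byte count and ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8 as the element type.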

fp16 needs special handling. Pay attention to the length that is passed in (it is a byte count):

    // float32_to_float16 is a user-supplied helper that returns the
    // half-precision bit pattern for a float.
    std::vector<uint16_t> inputTensorValueFp16;
    inputTensorValueFp16.reserve(inputTensorValues.size());
    for (auto fp32 : inputTensorValues)
    {
        inputTensorValueFp16.push_back(float32_to_float16(fp32));
    }

    // Note: this overload takes the buffer size in bytes, not the number of elements.
    auto inputTensor = Ort::Value::CreateTensor(
        memoryInfo, inputTensorValueFp16.data(),
        inputTensorValueFp16.size() * sizeof(uint16_t),
        inputShape.data(), inputShape.size(),
        ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);
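
The snippet above calls `float32_to_float16` without showing it. Below is one possible implementation, written as a sketch and not taken from the original code: it converts the IEEE 754 single-precision bit pattern to half precision with round-to-nearest-even, and handles infinities, NaN, overflow, and subnormals. `to_float16_buffer` is a hypothetical convenience wrapper I added for illustration.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Sketch: float -> IEEE 754 half-precision bit pattern, round-to-nearest-even.
uint16_t float32_to_float16(float value) {
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    const uint32_t sign = (bits >> 16) & 0x8000u;
    const uint32_t exp32 = (bits >> 23) & 0xFFu;
    uint32_t mant = bits & 0x007FFFFFu;

    if (exp32 == 0xFFu) {  // Inf or NaN (NaN keeps a non-zero mantissa)
        return static_cast<uint16_t>(sign | 0x7C00u | (mant ? 0x0200u : 0u));
    }
    const int32_t exp = static_cast<int32_t>(exp32) - 127 + 15;  // rebias
    if (exp >= 0x1F) {  // too large for half precision: infinity
        return static_cast<uint16_t>(sign | 0x7C00u);
    }
    if (exp <= 0) {  // result is a half-precision subnormal or zero
        if (exp < -10) return static_cast<uint16_t>(sign);  // underflows to +/-0
        mant |= 0x00800000u;  // restore the implicit leading 1
        const uint32_t shift = static_cast<uint32_t>(14 - exp);
        uint32_t half = mant >> shift;
        const uint32_t rem = mant & ((1u << shift) - 1u);
        const uint32_t halfway = 1u << (shift - 1);
        if (rem > halfway || (rem == halfway && (half & 1u))) ++half;
        return static_cast<uint16_t>(sign | half);
    }
    uint32_t half = (static_cast<uint32_t>(exp) << 10) | (mant >> 13);
    const uint32_t rem = mant & 0x1FFFu;
    // Rounding up may carry into the exponent, which is the correct behaviour.
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) ++half;
    return static_cast<uint16_t>(sign | half);
}

// Hypothetical wrapper: convert a whole float buffer to fp16 bit patterns.
std::vector<uint16_t> to_float16_buffer(const std::vector<float>& src) {
    std::vector<uint16_t> dst;
    dst.reserve(src.size());
    for (float v : src) dst.push_back(float32_to_float16(v));
    return dst;
}
```

For a buffer produced this way, the length to pass to the non-template CreateTensor is `buffer.size() * sizeof(uint16_t)`, i.e. two bytes per element.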

A few side notes:

The fp16 model was converted with winmltools; for details, see my other blog post: https://blog.csdn.net/znsoft/article/details/114538684

In your own Python environment:

pip install winmltools

So far I have only managed to install it on Ubuntu 20.04. On the other platforms I tried, such as Ubuntu 18.04 and Windows 10 with Anaconda, the installation failed.