Common issues:
1. The scale tensor for a weight is reported as empty. Cause: the tensor was never added to the graph initializers.
graph.initializer.append(x_scale)  # x_scale is the tensor referenced by the node
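2. After inserting a node, the model check reports that the node does not have a given attribute, for example:
"onnx.onnx_cpp2py_export.checker.ValidationError: Unrecognized attribute: axis for operator QuantizeLinear"
Cause: the model's ONNX opset version is too low or too high. First look up in the ONNX operator documentation which opset versions define the attribute you want to set (here, axis), convert the model to that opset version, and only then insert the attribute. Conversion method (see the reference link):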
import onnx
from onnx import version_converter, helper
# Preprocessing: load the model to be converted.
model_path = "path/to/the/model.onnx"
original_model = onnx.load(model_path)
print(f"The model before conversion:\n{original_model}")
# A full list of supported adapters can be found here:
# https://github.com/onnx/onnx/blob/main/onnx/version_converter.py#L21
# Apply the version conversion on the original model
converted_model = version_converter.convert_version(original_model, <int target_version>)
print(f"The model after conversion:\n{converted_model}")
3. Running the model fails with:
"void onnxruntime::PrepareForQDQ(const TensorShape&, const Tensor&, const Tensor*, int64_t, int64_t, int64_t&, int64_t&, int64_t&) scale.Shape().NumDimensions() == 1 && scale.Shape()[0] == broadcast_dim was false. For per axis quantization, scale must be 1D tensor with size 256"
Cause: axis defaults to 1 in Q/DQ nodes. Set axis to the channel dimension of your tensor, so that the length of the 1D scale equals the tensor's size along that axis.
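To find the right axis, compare the scale length with each dimension of the quantized tensor. A small sketch in plain Python; matching_axes is a hypothetical helper, and the weight shape and scale length are made-up values chosen to mirror the error above (256 is the size along the default axis 1).

```python
def matching_axes(tensor_shape, scale_len):
    """Axes whose size equals the length of the 1D scale tensor."""
    return [axis for axis, dim in enumerate(tensor_shape) if dim == scale_len]


# Hypothetical Conv weight: 64 output channels on axis 0, 256 input channels on axis 1.
weight_shape = (64, 256, 3, 3)
scale_len = 64  # one scale per output channel

# With the default axis=1 the runtime expects a scale of size 256, hence the error.
default_axis_ok = weight_shape[1] == scale_len
candidates = matching_axes(weight_shape, scale_len)
print(f"default axis works: {default_axis_ok}, usable axes: {candidates}")
```

If more than one axis matches (for example a square weight), the shape alone cannot decide; pick the axis the scales were actually computed along.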
4. Script for automatically inserting Q/DQ nodes
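import numpy as np
import onnx
from onnx import helper

def insert_weight_qdq(model, quant_type, graph, i, node_index, input_name, dim):
    # quant_type is the op type to insert, e.g. "QuantizeLinear" or "DequantizeLinear".
    # str() lets node_index be passed as either an int or a string.
    quant_output_name = quant_type + "_" + str(node_index) + "_weight"
    # Placeholder values only; replace with the real per-channel scales and zero points.
    tensor = np.random.rand(dim)
    x_scale = helper.make_tensor(
        name="x_scale" + quant_output_name,
        data_type=onnx.TensorProto.FLOAT,
        dims=[dim],
        vals=tensor,
    )
    x_zero_point = helper.make_tensor(
        name="x_zero_point" + quant_output_name,
        data_type=onnx.TensorProto.INT8,
        dims=[dim],
        vals=tensor.astype(np.int8),
    )
    # Register both tensors as initializers; otherwise they are reported as empty (issue 1).
    graph.initializer.append(x_scale)
    graph.initializer.append(x_zero_point)
    # Uncomment the two attr lines to set axis explicitly (see issues 2 and 3).
    #attr = onnx.helper.make_attribute("axis", 0)
    quant_input_node = helper.make_node(
        quant_type,
        inputs=[input_name, "x_scale" + quant_output_name, "x_zero_point" + quant_output_name],
        outputs=[quant_output_name],
        name=quant_output_name + "_output",
    )
    #quant_input_node.attribute.insert(0,attr)
    # Insert the new node at position i of the graph's node list.
    graph.node.insert(i, quant_input_node)
    return quant_output_name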
attention:
https://chromewebstore.google.com/detail/redium/aapiedkipcbeplicbbicchmdmpinhjdl?pli=1