Computing scale and zero point from FakeQuantWithMinMaxVars in a TensorFlow pb file


天才da熊猫


Article link: https://blog.csdn.net/azheng_wen/article/details/90294418


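While parsing a TensorFlow pb file recently, I found that the min/max values stored by FakeQuantWithMinMaxVars did not reproduce the scale and zero point found in the TFLite model. I had been computing them like this (8-bit quantization):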

scale = (max - min) / 255

zeropoint = - min / scale

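However, after converting the model to tflite with toco, the weight scale and zero point computed this way did not match the values in the converted model.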
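To make the mismatch concrete, here is a minimal sketch of the naive calculation above. The struct and function names (NaiveQuantParams, NaiveParams) are hypothetical and exist only for this illustration; they are not part of TensorFlow.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for this sketch (not a TensorFlow type).
struct NaiveQuantParams {
  double scale;
  double zero_point;  // left unrounded, exactly as the naive formula gives it
};

// Naive 8-bit mapping: assumes all 256 levels [0, 255] are used,
// so the real range is divided into 255 steps.
NaiveQuantParams NaiveParams(double min, double max) {
  assert(max > min);  // avoid dividing by a zero scale
  const double scale = (max - min) / 255.0;
  const double zero_point = -min / scale;
  return {scale, zero_point};
}
```

For min = -1 and max = 1 this gives a scale of 2/255 (about 0.007843) and a zero point of about 127.5, which is not even an integer, so some extra rule is clearly needed.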

It turned out that TensorFlow computes them as follows (tensorflow/lite/kernels/internal/quantization_util.h):

template <typename T>
QuantizationParams ChooseQuantizationParams(double rmin, double rmax,
                                            bool narrow_range) {
  const T qmin = std::numeric_limits<T>::min() + (narrow_range ? 1 : 0);
  const T qmax = std::numeric_limits<T>::max();
  const double qmin_double = qmin;
  const double qmax_double = qmax;
  // 0 should always be a representable value. Let's assume that the initial
  // min,max range contains 0.
  TFLITE_CHECK_LE(rmin, 0.);
  TFLITE_CHECK_GE(rmax, 0.);
  if (rmin == rmax) {
    // Special case where the min,max range is a point. Should be {0}.
    TFLITE_CHECK_EQ(rmin, 0.);
    TFLITE_CHECK_EQ(rmax, 0.);
    QuantizationParams quantization_params;
    quantization_params.zero_point = 0;
    quantization_params.scale = 0.;
    return quantization_params;
  }

  // General case.
  //
  // First determine the scale.
  const double scale = (rmax - rmin) / (qmax_double - qmin_double);

  // Zero-point computation.
  // First the initial floating-point computation. The zero-point can be
  // determined from solving an affine equation for any known pair
  // (real value, corresponding quantized value).
  // We know two such pairs: (rmin, qmin) and (rmax, qmax).
  // The arithmetic error on the zero point computed from either pair
  // will be roughly machine_epsilon * (sum of absolute values of terms)
  // so we want to use the variant that adds the smaller terms.
  const double zero_point_from_min = qmin_double - rmin / scale;
  const double zero_point_from_max = qmax_double - rmax / scale;
  const double zero_point_from_min_error =
      std::abs(qmin_double) + std::abs(rmin / scale);
  const double zero_point_from_max_error =
      std::abs(qmax_double) + std::abs(rmax / scale);

  const double zero_point_double =
      zero_point_from_min_error < zero_point_from_max_error
          ? zero_point_from_min
          : zero_point_from_max;

  // Now we need to nudge the zero point to be an integer
  // (our zero points are integer, and this is motivated by the requirement
  // to be able to represent the real value "0" exactly as a quantized value,
  // which is required in multiple places, for example in Im2col with SAME
  // padding).
  T nudged_zero_point = 0;
  if (zero_point_double < qmin_double) {
    nudged_zero_point = qmin;
  } else if (zero_point_double > qmax_double) {
    nudged_zero_point = qmax;
  } else {
    nudged_zero_point = static_cast<T>(round(zero_point_double));
  }
  // The zero point should always be in the range of quantized value,
  // [qmin, qmax].
  TFLITE_CHECK_GE(nudged_zero_point, qmin);
  TFLITE_CHECK_LE(nudged_zero_point, qmax);

  // Finally, store the result nudged quantization params.
  QuantizationParams quantization_params;
  quantization_params.zero_point = nudged_zero_point;
  quantization_params.scale = scale;
  return quantization_params;
}

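Note that narrow_range is true for weights. For an unsigned 8-bit type, qmin is therefore 1 instead of 0, and the scale becomes (max - min) / 254 rather than (max - min) / 255.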

Computing the values this way gives the correct result.
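For checking numbers outside of a TensorFlow build, the logic above can be sketched as a standalone function. This is a simplified port under stated assumptions: it is fixed to unsigned 8-bit values, it replaces the TFLITE_CHECK macros with assert, and the names Uint8QuantParams and ChooseUint8Params are hypothetical stand-ins rather than TensorFlow APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for TFLite's QuantizationParams (uint8 only).
struct Uint8QuantParams {
  double scale;
  std::uint8_t zero_point;
};

Uint8QuantParams ChooseUint8Params(double rmin, double rmax,
                                   bool narrow_range) {
  // The real range must contain 0 so that 0 is exactly representable.
  assert(rmin <= 0.0 && rmax >= 0.0);
  if (rmin == rmax) return {0.0, 0};  // degenerate range {0}

  // narrow_range drops the lowest quantized level: [1, 255] vs [0, 255].
  const double qmin = narrow_range ? 1.0 : 0.0;
  const double qmax = 255.0;
  const double scale = (rmax - rmin) / (qmax - qmin);

  // Two candidate zero points; pick the one with the smaller error bound.
  const double zp_from_min = qmin - rmin / scale;
  const double zp_from_max = qmax - rmax / scale;
  const double err_min = std::abs(qmin) + std::abs(rmin / scale);
  const double err_max = std::abs(qmax) + std::abs(rmax / scale);
  const double zp = err_min < err_max ? zp_from_min : zp_from_max;

  // Clamp into [qmin, qmax], then round to the nearest integer.
  const double clamped = std::min(std::max(zp, qmin), qmax);
  return {scale, static_cast<std::uint8_t>(std::round(clamped))};
}
```

With rmin = -1, rmax = 1 and narrow_range = true, the scale is 2/254 (about 0.007874) and the zero point is 128, whereas the formula with 255 steps gives a scale of about 0.007843.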

One last note: the scale and zero point of the input node produced by toco are determined by the --std_values and --mean_values flags of the toco command.
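As a rough sketch of that relationship, assuming the converter's documented input mapping real_value = (quantized_value - mean_value) / std_value, the input scale is 1 / std_value and the input zero point is mean_value. The struct and function names below are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>

// Hypothetical helper type for this sketch (not a TensorFlow type).
struct InputQuantParams {
  double scale;
  double zero_point;
};

// Assumes real = (quantized - mean_value) / std_value, which is the same
// affine form as real = scale * (quantized - zero_point).
InputQuantParams InputParamsFromMeanStd(double mean_value, double std_value) {
  assert(std_value != 0.0);  // a zero std value would give an infinite scale
  return {1.0 / std_value, mean_value};
}
```

For example, mean_value = 128 and std_value = 128 correspond to a scale of 1/128 = 0.0078125 and a zero point of 128, so quantized inputs 0..255 map to roughly [-1, 0.992].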
