TensorFlow Quantization-Aware Training: The Full Process

You can train your quantized model either by restoring a previously trained floating-point model or from scratch. In either case, you first have to create a quantization training graph.

tf.contrib.quantize.create_training_graph(quant_delay=DELAY_STEP)

DELAY_STEP is the number of training steps after which weights and activations start being quantized. Put the above call right after you create your normal training graph, excluding the optimization operations (i.e., before adding the optimizer). If you use multi-GPU training, you have to create a quantization graph on every GPU, like the following code:

tower_grads = []
with tf.variable_scope(tf.get_variable_scope()):
    for i in range(len(GPU_NUM_ID)):
        with tf.device('/gpu:%d' % GPU_NUM_ID[i]):
            with tf.name_scope('%s_%d' % ('cnn_mg', i)) as scope:
                images, labels = load_batch_images()
                logits, out_data = net.inference(images, reuse=tf.AUTO_REUSE, num_classes=LABEL_NUM)
                # Rewrite this tower's graph with fake-quantization ops.
                with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):
                    tf.contrib.quantize.create_training_graph(quant_delay=DELAY_STEP)
                loss = compute_loss(labels, logits)
                tf.get_variable_scope().reuse_variables()
                grads = optimizer.compute_gradients(loss)
                tower_grads.append(grads)

One thing I have to mention is that quantization-aware training is "fake" training. Fake training means that during the forward pass, the training graph merely simulates integer multiplication with the corresponding floating-point multiplication. "Corresponding" means the simulated floating-point weights are the de-quantized values of the corresponding fixed-point integers. So the forward output during training may differ slightly from the result actually computed with quantized arithmetic.
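To make this concrete, here is a minimal NumPy sketch of the quantize/de-quantize round-trip that fake quantization performs (the scale and zero point below are made-up illustration values, not taken from any real layer):

import numpy as np

# Fake quantization: snap a float weight onto the uint8 grid, then map it
# straight back to float, so the forward pass still runs in floating point
# but only ever sees values representable as fixed-point integers.
def fake_quant(w, scale, zero_point):
    q = np.clip(np.round(w / scale) + zero_point, 0, 255)  # simulated uint8
    return scale * (q - zero_point)                        # back to float

w = np.array([0.013, -0.402, 0.255])
print(fake_quant(w, scale=0.004, zero_point=128))  # [0.012, -0.4, 0.256]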

Save, Freeze, Convert and Test

Save

Next, you have to save your trained quantized model. To do that, first create a quantized evaluation graph with the following code:

g = tf.get_default_graph()
tf.contrib.quantize.create_eval_graph(input_graph=g)

Then write out the graph definition and save it:

with open('./your_quantized_graph.pb', 'w') as f:
    f.write(str(g.as_graph_def()))
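The freeze step below also needs a checkpoint holding the trained weights. A minimal sketch, assuming your training session sess is still open:

# Save the trained weights; freeze_graph reads them from this checkpoint.
saver = tf.train.Saver()
saver.save(sess, './model.ckpt')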

Freeze

To make your model more compact, you can freeze it. Freezing a model means converting its variables to constants, getting rid of useless operations, and fusing redundant ones. To freeze your graph, you can use the standard freeze_graph tool:

bazel build tensorflow/python/tools:freeze_graph && \
bazel-bin/tensorflow/python/tools/freeze_graph \
--input_graph=your_quantized_graph.pb \
--input_checkpoint=model.ckpt \
--output_graph=/tmp/frozen_graph.pb --output_node_names=softmax
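If you are not sure which name to pass as --output_node_names, you can list the node names of the graph you saved; a quick sketch using the GraphDef from the save step:

# Print every node name so you can pick the real output node for freezing.
for node in g.as_graph_def().node:
    print(node.name)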

Convert

The next step is to convert your frozen graph to a TFLite model for deployment.

path_to_frozen_graphdef_pb = './your_frozen_graph.pb'
input_shapes = {'validate_input/imgs': [1, 320, 320, 3]}
# For tf_version > 1.11:
converter = tf.contrib.lite.TFLiteConverter.from_frozen_graph(
    path_to_frozen_graphdef_pb, ['validate_input/imgs'], ['output_node'], input_shapes)
# For tf_version <= 1.11, use TocoConverter instead:
# converter = tf.contrib.lite.TocoConverter.from_frozen_graph(
#     path_to_frozen_graphdef_pb, ['validate_input/imgs'], ['output_node'], input_shapes)
converter.inference_type = tf.contrib.lite.constants.QUANTIZED_UINT8
converter.quantized_input_stats = {'validate_input/imgs': (0., 1.)}  # (mean, std_dev)
converter.allow_custom_ops = True
converter.default_ranges_stats = (0, 255)  # fallback (min, max) for ops without recorded ranges
converter.post_training_quantize = True
tflite_model = converter.convert()
open("sfnv2.tflite", "wb").write(tflite_model)

Test

Finally, you can test your converted TFLite model with the following code:

interpreter = tf.contrib.lite.Interpreter(model_path="your.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]['index'], batch_validate_img)
interpreter.invoke()
score = interpreter.get_tensor(output_details[0]['index'])
score = score[0][0]
# Read the output layer's quantization parameters instead of hard-coding them.
scale, zero_point = output_details[0]['quantization']
real_score = scale * (score - zero_point)

One thing to mention is that the raw score you get is a fixed-point integer value. You have to convert it to the corresponding float value. To do that, check the zero point and scale of the output layer (available in output_details[0]['quantization'], as above) and de-scale the raw output.
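As a purely illustrative example with made-up numbers: if the output layer reports scale = 1/256 and zero_point = 0, then a raw uint8 score of 128 de-scales to 128 / 256 = 0.5.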
