TensorFlow Model Optimization

I. Overview of TF Model Optimization

1. Why optimize?

  • Reducing latency and cost of inference for both cloud and edge devices (e.g., mobile, IoT).
  • Deploying models on edge devices with restrictions on processing, memory, and/or power consumption.
  • Reducing payload size for over-the-air model updates.
  • Enabling execution on hardware that is restricted to, or optimized for, fixed-point operations.
  • Optimizing models for special-purpose hardware accelerators.

2. Main techniques

1) Quantization

Quantized models are those where the model's parameters are represented with lower precision, such as 8-bit integers as opposed to 32-bit floats. Lower precision is a requirement to leverage certain hardware.

2) Sparsity and pruning

Sparse models are those where connections between operators (i.e. neural network layers) have been pruned, introducing zeros into the parameter tensors.

3) Clustering

Clustered models are those where the original model’s parameters are replaced with a smaller number of unique values.

4) Collaborative optimization

This enables you to benefit from combining several model compression techniques and simultaneously achieve improved accuracy through quantization aware training.

II. Weight Pruning

1. Trim insignificant weights

Weight pruning gradually zeroes out model weights during training to achieve model sparsity. Sparse models are easier to compress, and the zero values can be skipped during inference to improve latency.
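
As a rough illustration, here is a minimal sketch of magnitude-based pruning using the `tensorflow_model_optimization` (tfmot) Keras API. The toy model, sparsity schedule, and step counts are placeholder assumptions, not values prescribed by this post.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude

# A small placeholder model; substitute your own Keras model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10)
])

# Ramp sparsity from 0% to 80% of the weights over the first 1000 steps.
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0, final_sparsity=0.8,
        begin_step=0, end_step=1000)
}

pruned_model = prune_low_magnitude(model, **pruning_params)
pruned_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# UpdatePruningStep must be passed to fit() so the schedule advances each batch.
# pruned_model.fit(x_train, y_train, epochs=2,
#                  callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Strip the pruning wrappers before export so only the sparse weights remain.
final_model = tfmot.sparsity.keras.strip_pruning(pruned_model)
```

The stripped model still contains the zeroed weights, so the size benefit shows up once the exported file is compressed (e.g. with gzip) or consumed by a runtime that exploits sparsity.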

III. Quantization

1. Quantization aware training
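
Quantization aware training emulates inference-time quantization during training, so the model learns weights that stay accurate after conversion to 8-bit. A minimal sketch with tfmot, assuming a placeholder Keras model and standard training setup:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder model; substitute your own.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10)
])

# quantize_model wraps layers with fake-quantization ops so training
# sees quantization error and can compensate for it.
qat_model = tfmot.quantization.keras.quantize_model(model)
qat_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
# qat_model.fit(x_train, y_train, epochs=1)

# Convert to an actually quantized TFLite model.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()
```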

2. Post-training quantization
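
Post-training quantization quantizes an already-trained model at conversion time, with no retraining required. A minimal sketch of dynamic-range quantization with the TFLite converter; `model` and the output filename are assumptions:

```python
import tensorflow as tf

# 'model' is an already-trained Keras model (assumed to exist).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weights stored as 8-bit
tflite_quant_model = converter.convert()

with open('model_quant.tflite', 'wb') as f:
    f.write(tflite_quant_model)
```

Full-integer quantization additionally requires a representative dataset so the converter can calibrate activation ranges.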

IV. Weight Clustering

Clustering, or weight sharing, reduces the number of unique weight values in a model, leading to benefits for deployment. It first groups the weights of each layer into N clusters, then shares the cluster’s centroid value for all the weights belonging to the cluster.

This technique brings improvements via model compression. Future framework support can unlock memory footprint improvements that can make a crucial difference for deploying deep learning models on embedded systems with limited resources.
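
A minimal sketch of weight clustering with tfmot; the cluster count, centroid initialization, and toy model are placeholder assumptions:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

# Placeholder model; substitute your own trained model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10)
])

clustering_params = {
    'number_of_clusters': 16,  # each layer keeps only 16 unique weight values
    'cluster_centroids_init': CentroidInitialization.LINEAR,
}

clustered_model = cluster_weights(model, **clustering_params)
clustered_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
# Fine-tune briefly so accuracy recovers, then strip the clustering wrappers.
# clustered_model.fit(x_train, y_train, epochs=1)
final_model = tfmot.clustering.keras.strip_clustering(clustered_model)
```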

V. Collaborative Optimization

Collaborative optimization is an overarching process that encompasses various techniques to produce a model that, at deployment, exhibits the best balance of target characteristics such as inference speed, model size and accuracy.

The idea of collaborative optimizations is to build on individual techniques by applying them one after another to achieve the accumulated optimization effect. Various combinations of the following optimizations are possible:

  • Sparsity-preserving clustering
  • Sparsity-preserving quantization aware training (PQAT)
  • Cluster-preserving quantization aware training (CQAT)
  • Sparsity- and cluster-preserving quantization aware training (PCQAT)
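
As one concrete example of chaining techniques, the sketch below prunes during training and then applies post-training quantization to the sparse model. The model, sparsity target, and training settings are placeholder assumptions; the dedicated "preserving" schemes such as PQAT and CQAT are provided by tfmot's collaborative optimization APIs and are not shown here.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder model; substitute your own.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10)
])

# Step 1: prune to a constant 50% sparsity while (re)training.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(0.5, begin_step=0))
pruned.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
# pruned.fit(x_train, y_train, epochs=1,
#            callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
stripped = tfmot.sparsity.keras.strip_pruning(pruned)

# Step 2: quantize the sparse model while converting to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(stripped)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
sparse_quantized_model = converter.convert()
```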
