Reducing the Size of Your Core ML App



Reduce the storage used by the Core ML model inside your app bundle.

Overview

Bundling your machine learning model in your app is the easiest way to get started with Core ML. As models become more advanced, they can grow large and take up significant storage space. For a neural-network-based model, consider reducing its footprint by using a lower-precision representation for its weight parameters. If your model isn't a neural network that can use half precision, or if you need to reduce your app's size further, add functionality to download and compile your models on the user's device instead of bundling them with your app.

Convert to a Half-Precision Model

The Core ML Tools provide a conversion function to convert a neural network model’s floating point weights from full-precision into half-precision values (reducing the number of bits used in the representation from 32 down to 16). This type of conversion can significantly reduce a network’s size, most of which often comes from the connection weights within the network.

Listing 1 

Converting a model to lower precision with Core ML Tools

import coremltools

# Load a model, lower its precision, and then save the smaller model.
model_spec = coremltools.utils.load_spec('./exampleModel.mlmodel')
model_fp16_spec = coremltools.utils.convert_neural_network_spec_weights_to_fp16(model_spec)
coremltools.utils.save_spec(model_fp16_spec, 'exampleModelFP16.mlmodel')

You can convert only neural networks, or pipeline models that embed neural networks, to half precision. The conversion applies to all full-precision weight parameters in the model; you can't convert just a subset of them.

Using half-precision floating-point values reduces not only the precision of those values but also the range of values they can represent. Before deploying this option to your users, confirm that your model's behavior isn't degraded.
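Both effects can be seen numerically outside of Core ML. The following sketch (plain Python via the standard `struct` module, not Core ML Tools) round-trips values through IEEE 754 half precision to show the precision loss and the reduced range:

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# Precision loss: half precision keeps only about 3-4 significant
# decimal digits, so the stored value is no longer exactly 0.1234567.
print(to_fp16(0.1234567))

# Range loss: the largest finite half-precision value is 65504.0;
# packing anything larger raises OverflowError.
print(to_fp16(65504.0))
try:
    to_fp16(1e6)
except OverflowError:
    print("1e6 does not fit in half precision")
```

Weights in trained networks are typically small in magnitude, which is why the reduced range is often tolerable; still, verify your specific model.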

Models that are converted to half precision require these OS versions or later: iOS 11.2, macOS 10.13.2, tvOS 11.2, or watchOS 4.2.

Convert to a Lower Precision Model

Core ML Tools 2.0 introduced new utilities to reduce the precision of a model down to 8, 4, 2, or 1 bit. The tools include functions to gauge the differences in behavior between the original model and the lower precision model. For more information about using these utilities, see the Core ML Tools Neural Network Quantization documentation. 
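To build intuition for what these utilities do, here is a toy illustration of linear weight quantization in plain Python (the idea only; the actual Core ML Tools utilities operate on .mlmodel specifications, and this helper is not part of their API). Each weight is mapped to one of 2^nbits evenly spaced levels spanning the weight range, and reconstructed from that level on load:

```python
def quantize_dequantize(weights, nbits):
    """Linearly quantize weights to 2**nbits levels, then reconstruct them.

    A toy sketch of the idea behind weight quantization; not the
    Core ML Tools API.
    """
    lo, hi = min(weights), max(weights)
    levels = (1 << nbits) - 1
    scale = (hi - lo) / levels if hi != lo else 1.0
    indices = [round((w - lo) / scale) for w in weights]   # stored small ints
    return [lo + i * scale for i in indices]               # reconstructed floats

weights = [-0.75, -0.1, 0.02, 0.3, 0.9]
for nbits in (8, 4, 2, 1):
    approx = quantize_dequantize(weights, nbits)
    err = max(abs(a - w) for a, w in zip(approx, weights))
    print(f"{nbits}-bit max error: {err:.4f}")
```

As the bit count drops, the reconstruction error grows, which is why Core ML Tools includes functions to compare the original and quantized models before you ship one.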

Lower precision models require these OS versions or later: iOS 12, macOS 10.14, tvOS 12, or watchOS 5.

Download and Compile a Model

Another option to reduce the size of your app is to have the app download the model onto the user’s device and compile it in the background. For example, if users use only a subset of the models your app supports, you don’t need to bundle all the possible models with your app. Instead, the models can be downloaded later based on user behavior. See Downloading and Compiling a Model on the User's Device.
