Prune Tacotron2 and FastSpeech2 Models with a Magnitude-Based Pruning Algorithm (MBP or MP)

Model pruning is a class of AI model optimizations inspired by a natural process that happens in the human brain between early childhood and adulthood.

Pruning strives to reduce the number of parameters and operations required in the computation by excluding some components of the neural network (connections, neurons, channels, ...).

The main purpose is to reduce model size and inference time. Pruning can be done before training, during training, or after training; here we apply it to a pre-trained model. The figure below illustrates the different components of a pruning framework:

Figure: Pruning in a deep learning model

To start pruning our models, we should follow 3 main steps :

i. Choose the level of pruning: the most common approaches prune either neuron connections (called weight pruning) or nodes (called neuron pruning).

ii. Choose the pruning criterion: we will use the absolute-value criterion.

iii. Choose the pruning algorithm: the basic one is the Magnitude Pruner.

Enclosed is a Git link that contains several pruning-algorithm papers, classified by release date.

The image below illustrates the pruning level that we will work on in the next step; we will focus exclusively on pruning connections.

Figure: Pruned-level example showing removed connections and neurons

Weight Pruning

Weight pruning (pruning connections) generally means eliminating some values in the weight tensors: we set those neural-network parameter values to zero in order to remove what we consider redundant connections between the layers of the network.

Magnitude-Based Pruning

Magnitude-based pruning consists of setting "individual weights" in the weight matrix to zero, which corresponds to deleting connections between neurons. To achieve a sparsity of k%, we rank the individual weights in the weight matrix by their magnitude (absolute value) |w_i,j|, and then set the smallest k% of them to zero.

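A minimal sketch of this criterion in PyTorch (the function and variable names are illustrative and are not taken from the original Colab notebook):

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest `sparsity` fraction of |w_i,j| in a weight tensor."""
    flat = weight.abs().flatten()
    k = int(sparsity * flat.numel())  # number of weights to set to zero
    if k == 0:
        return weight
    # Threshold = magnitude of the k-th smallest |w_i,j|.
    threshold = torch.kthvalue(flat, k).values
    mask = weight.abs() > threshold
    return weight * mask

# Example: prune 8% of a random weight matrix.
w = torch.randn(256, 256)
w_pruned = magnitude_prune(w, sparsity=0.08)
print((w_pruned == 0).float().mean())  # roughly 0.08
```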

NB:

  • Individual weights: each entry |w_i,j| of the weight matrix.

  • Sparsity: the percentage of cells in a matrix or database that are not filled with data or are equal to zero. (By contrast, a dense array or database contains mostly non-zero values.)

  • The k% values we will use are in this range:

k in ['0.1%', '0.4%', '0.8%', '1%', '2%', '6%', '8%', '9%', '99%']

  • Note that as we increase the percentage of pruning (i.e., increase the sparsity k%), the model's performance will degrade.

  • We will do the pruning on a .pth file (PyTorch implementation, but it can also be done with TensorFlow); see the sketch after this list.

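A sketch of how this could be applied to a .pth checkpoint, using the magnitude_prune helper from the earlier sketch (the checkpoint file name and the rule for selecting which keys to prune are assumptions; the actual parameter names depend on the Tacotron2/FastSpeech2 implementation you use):

```python
import torch

sparsity = 0.08  # one of the k% values listed above

# Hypothetical checkpoint path; some checkpoints wrap the parameters,
# e.g. ckpt["state_dict"], so adjust the indexing if needed.
state_dict = torch.load("tacotron2_statedict.pth", map_location="cpu")

for name, tensor in state_dict.items():
    # Prune only floating-point weight matrices; leave biases untouched.
    if name.endswith("weight") and torch.is_floating_point(tensor):
        state_dict[name] = magnitude_prune(tensor, sparsity)

torch.save(state_dict, f"tacotron2_pruned_{int(sparsity * 100)}pct.pth")
```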

You can find all the related code in my Google Colab notebook:

Tacotron2 reference:

FastSpeech2 reference:

Conclusion:

After pruning the Tacotron2 and FastSpeech2 models with the magnitude-based pruning algorithm (MBP), without using any pruning function from the TensorFlow or PyTorch APIs, we load the new checkpoints in PyTorch in gzip format.

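One way that gzip step could look (a sketch assuming the pruned state_dict is compressed with Python's standard gzip module; the file names are illustrative):

```python
import gzip
import torch

pruned_state_dict = torch.load("tacotron2_pruned_8pct.pth", map_location="cpu")

# Save the pruned checkpoint compressed with gzip; the zeroed weights compress well.
with gzip.open("tacotron2_pruned_8pct.pth.gz", "wb") as f:
    torch.save(pruned_state_dict, f)

# Load it back the same way before running inference.
with gzip.open("tacotron2_pruned_8pct.pth.gz", "rb") as f:
    restored = torch.load(f, map_location="cpu")
```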

We used the PyTorch versions of the TTS models, but the approach is still valid for a TensorFlow implementation.

Afterwards, we run inference with the new checkpoints at the different k% sparsity levels. For Tacotron2 we found that the suitable checkpoint is at around 8% sparsity, which reduces the model size by about 18% (from 108 MB to 87 MB); for FastSpeech2 at 99% sparsity, the model size shrinks by about 11% (from 196 MB to 174 MB).

Note that we exclude the bias matrices from pruning in the RNN and CNN parts of Tacotron2, and we prune only the attention layers of FastSpeech2, again leaving out the bias matrices.

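A sketch of that key selection for FastSpeech2 (the "attention" substring used to match layer names is an assumption; real module names vary across FastSpeech2 implementations):

```python
import torch

state_dict = torch.load("fastspeech2.pth", map_location="cpu")

for name, tensor in state_dict.items():
    # Keep biases, and prune only weights that belong to attention layers.
    if "attention" in name and name.endswith("weight") and torch.is_floating_point(tensor):
        state_dict[name] = magnitude_prune(tensor, sparsity=0.99)

torch.save(state_dict, "fastspeech2_pruned_99pct.pth")
```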

Perspective:

Since FastSpeech2 is based on a Transformer neural network, the best solution would be to go for head pruning.

Once the models are pruned, and assuming we are working with a TensorFlow implementation, we can go further with model quantization using TFLite.

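For that TensorFlow path, a minimal post-training quantization sketch with TFLite could look like this (the SavedModel directory name is an assumption):

```python
import tensorflow as tf

# Convert an assumed TensorFlow SavedModel of the pruned TTS model.
converter = tf.lite.TFLiteConverter.from_saved_model("fastspeech2_saved_model")
# Default optimization enables dynamic-range post-training quantization.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("fastspeech2_pruned_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```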

Translated from: https://medium.com/@ahmed.barbouche1/prune-tacotron2-and-fastspeech2-models-with-magnitude-based-pruning-algorithm-mbp-or-mp-bfcde390d419
