Paper notes: A Simple and Effective Pruning Approach for Large Language Models

ICLR 2024; reviewer scores: 5, 6, 6, 8

1 Intro

  • A paper on network pruning for large language models
    • discards a subset of the network weights while trying to preserve performance
  • Existing methods
    • either require retraining
      • often impractical for billion-parameter LLMs
    • or require solving a weight-reconstruction problem that relies on second-order information
      • which can likewise be computationally expensive
  • → the paper introduces a novel, simple, and effective pruning method named Wanda (Pruning by Weights and activations)
    • on a per-output basis, it prunes the weights with the smallest magnitude once multiplied by the corresponding input activations
    • the pruned LLM can be used as-is, with no retraining or weight updates

2 Method

2.1 Motivation

  • Consider a neuron with two inputs and their corresponding weights: y = w1x1 + w2x2, where |w1| ≤ |w2|.
    • Suppose the goal is to remove one weight while changing the output as little as possible.
    • Standard magnitude pruning would always remove weight w1
      • If the input features x1 and x2 have similar magnitudes, this may well be a good strategy.
      • However, recent observations on LLMs show that two input features can differ greatly in scale; for example, it may be that |x1| ≫ |x2|, and consequently |w1x1| ≫ |w2x2|.
      • In that case we should remove weight w2 instead, since doing so clearly perturbs the neuron output y less than removing w1.

  • This motivating example, built on the simplest possible linear layer, points to a major limitation of magnitude pruning
    • it ignores the input activations, which can be just as important as the weight magnitudes in determining the neuron's output.
    • This matters especially when pruning LLMs, given the prominent large-magnitude (outlier) features found in them
    • → the paper proposes a pruning metric designed specifically for LLMs that handles this limitation while retaining the simplicity of magnitude pruning
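The two-input example can be checked numerically. A minimal sketch (the concrete values are made up for illustration, chosen so that |x1| ≫ |x2|):

```python
# Toy check of the motivating example: y = w1*x1 + w2*x2 with |w1| <= |w2|.
w1, w2 = 1.0, 2.0      # |w1| <= |w2|, so magnitude pruning would drop w1
x1, x2 = 100.0, 0.1    # hypothetical activations with |x1| >> |x2|

# Change in the output y caused by zeroing out each weight:
delta_remove_w1 = abs(w1 * x1)   # large change
delta_remove_w2 = abs(w2 * x2)   # small change

# Removing w2 (the *larger* weight) perturbs the output far less,
# because its input activation is tiny.
assert delta_remove_w2 < delta_remove_w1
```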

2.2 Pruning metric
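Based on the description in the intro (per-output pruning of the weights with the smallest magnitude once multiplied by the corresponding input activations), the metric can be sketched as below. This is my own NumPy reconstruction, not the authors' code; the function name and the exact aggregation of activations (l2 norm over calibration tokens) are assumptions:

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Sketch of a Wanda-style pruning rule (reconstruction, not official code).
    W: (out_features, in_features) weight matrix of a linear layer.
    X: (n_tokens, in_features) input activations from calibration data.
    Returns a pruned copy of W with `sparsity` fraction of each row zeroed."""
    # Score each weight by |W_ij| * ||X_j||_2, where ||X_j||_2 is the
    # l2 norm of input feature j across all calibration tokens.
    act_norm = np.linalg.norm(X, axis=0)      # (in_features,)
    score = np.abs(W) * act_norm              # (out_features, in_features)

    # Per-output comparison: within each row, zero out the `sparsity`
    # fraction of weights with the smallest scores.
    k = int(W.shape[1] * sparsity)            # weights removed per output
    idx = np.argsort(score, axis=1)[:, :k]    # lowest-score indices per row
    W_pruned = W.copy()
    np.put_along_axis(W_pruned, idx, 0.0, axis=1)
    return W_pruned
```

Note the contrast with magnitude pruning, which would use `score = np.abs(W)` alone: the activation-norm factor is what lets large-activation features keep their (possibly small) weights.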

2.3 Comparison with existing methods

3 Experiments

3.1 Effectiveness comparison

3.2 Speed comparison

3.3 Fine-tuning brings the pruned LLM close to the unpruned one

3.4 Effect of the calibration data (X)
