Attention Is All You Need But You Don’t Need All Of It For Inference of Large Language Models

This post is part of an LLM article series. It is a translation of "Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models".

Attention is all you need, but you don't need all of it for inference of large language models

Abstract

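Inference demand for LLMs has surged in recent months, and serving models with low latency remains challenging because the cost of the attention layers grows quadratically with input length. In this work, we investigate how dropping MLP and attention layers at inference time affects the performance of Llama-v2 models. We find that dropping the deeper attention layers only marginally decreases performance, yet gives the best speedups alongside dropping entire layers. For example, removing 33% of the attention layers in a 13B Llama2 model results in a 1.8% drop in average performance on the OpenLLM benchmark. We also observe that skipping layers other than the latter ones reduces performance further as more layers are skipped, except when only the attention layers are skipped.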
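The idea of skipping the deeper attention sublayers at inference time can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names `Block`, `skipped_layers`, and `forward` are hypothetical, the sublayers are stand-in callables on plain Python lists, and rounding the skipped fraction down is an assumption (the abstract does not say how 33% is rounded). The layer count of 40 used in the example is the decoder depth of Llama2 13B.

```python
from dataclasses import dataclass
from typing import Callable, Collection, List, Sequence

Vector = List[float]


@dataclass
class Block:
    """Toy stand-in for one transformer layer (hypothetical, not a real API)."""
    attention: Callable[[Vector], Vector]
    mlp: Callable[[Vector], Vector]


def skipped_layers(num_layers: int, fraction: float) -> range:
    """Indices of the deepest layers to skip.

    Assumption: the number of skipped layers is rounded down.
    """
    n_skip = int(num_layers * fraction)
    return range(num_layers - n_skip, num_layers)


def forward(
    blocks: Sequence[Block],
    x: Vector,
    skip_attention: Collection[int] = (),
    skip_mlp: Collection[int] = (),
) -> Vector:
    """Run the stack, bypassing selected sublayers.

    A skipped sublayer contributes nothing, so only the residual
    stream passes through unchanged at that point.
    """
    for i, block in enumerate(blocks):
        if i not in skip_attention:
            x = [a + b for a, b in zip(x, block.attention(x))]  # residual add
        if i not in skip_mlp:
            x = [a + b for a, b in zip(x, block.mlp(x))]  # residual add
    return x


# Toy usage: constant "sublayers" make the effect of skipping easy to see.
toy_blocks = [
    Block(attention=lambda v: [1.0] * len(v), mlp=lambda v: [10.0] * len(v))
    for _ in range(4)
]
full_output = forward(toy_blocks, [0.0])
attention_skipped = forward(toy_blocks, [0.0], skip_attention={2, 3})

# Which attention sublayers would be skipped in a 40-layer model at 33%.
deep_attention = set(skipped_layers(40, 0.33))
```

Listing a layer index in both `skip_attention` and `skip_mlp` drops that layer entirely, which corresponds to the "dropping entire layers" setting mentioned in the abstract. In a real model the saving comes from never running the skipped attention computation, so its quadratic cost in input length is avoided for those layers.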

1 Introduction

2 Method

3 Results

4 Related Work

5 Conclusion

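We investigated the effect of dropping the last layers from 7B and 13B Llama2 models. We observe that dropping attention sublayers leads to a much smaller drop in performance than dropping MLP sublayers, whether or not the last layer is included, while also yielding better inference speedups. For example, removing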
