2014: DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning

  • I found this paper at the following URL!

https://dl.acm.org/doi/10.1145/2654822.2541967

Abstract

  • ML is
  • pervasive in a broad range of domains,
  • and in a broad range of systems (from embedded devices to data centers)

  • At the same time,
    • a small set of ML algorithms (especially Convolutional and Deep Neural Networks, i.e., CNNs and DNNs)
  • are proving to be state-of-the-art across many applications.

  • As architectures evolve towards
    • heterogeneous multi-cores
    • composed of a mix of cores and accelerators,
  • an ML accelerator can achieve
    • the rare combination of efficiency (due to the small number of target algorithms)
    • and broad application scope.

Next paragraph

  • Until now, most ML accelerator designs
    • focused on
    • efficiently implementing the computational part of the algorithms.
  • However, recent state-of-the-art CNNs and DNNs
    • are characterized by their large size.

So what do we do when they are this big?

  • design an accelerator
    • for large-scale CNNs and DNNs,
  • emphasis on the impact of memory on accelerator design, performance and energy.

Third paragraph

  • possible to design an accelerator with
    • a high throughput,
    • capable of performing 452 GOP/s (key NN operations such as synaptic weight multiplications and neurons outputs additions)
    • in a small footprint of 3.02 mm^2 and 485 mW;
  • compared to a 128-bit 2GHz SIMD processor,
    • it is 117.87x faster,
    • and reduces total energy by 21.08x.
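As a quick back-of-envelope check on these figures, the reported throughput, power, and area imply the efficiency ratios below. The derived numbers are my own arithmetic, not values stated in the paper:

```python
# Back-of-envelope ratios derived from the abstract's figures
# (452 GOP/s, 485 mW, 3.02 mm^2 at 65nm layout). The derived
# efficiency numbers are my own arithmetic, not from the paper.

throughput_gops = 452.0   # giga-operations per second
power_w = 0.485           # 485 mW
area_mm2 = 3.02           # layout area at 65nm

energy_eff = throughput_gops / power_w   # GOP per joule (~932)
area_eff = throughput_gops / area_mm2    # GOP/s per mm^2 (~150)

print(f"energy efficiency: {energy_eff:.0f} GOP/J")
print(f"area efficiency:   {area_eff:.0f} GOP/s per mm^2")
```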

  • The accelerator characteristics are obtained
    • after layout at 65nm.
  • Such a high throughput in a small footprint
    • can
    • open up the usage of sota ml algorithms
    • in a broad set of systems and
    • for a broad set of applications.
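The "key NN operations" the abstract counts (synaptic weight multiplications and neuron output additions) reduce to multiply-accumulate loops. A minimal sketch in Python, with illustrative layer sizes that are not taken from the paper:

```python
# Minimal sketch of the "key NN operations" a classifier layer performs:
# synaptic weight multiplications and neuron output additions (MACs).
# Layer sizes below are illustrative, not from the paper.

def neuron_layer(inputs, weights, biases):
    """Fully connected layer: each output neuron is a weighted sum of all inputs."""
    outputs = []
    ops = 0
    for w_row, b in zip(weights, biases):
        acc = b
        for x, w in zip(inputs, w_row):
            acc += x * w   # one multiplication + one addition = 2 ops
            ops += 2
        outputs.append(acc)
    return outputs, ops

# 3 inputs, 2 output neurons
x = [1.0, 2.0, 3.0]
W = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
b = [0.0, 1.0]
y, ops = neuron_layer(x, W, b)
# ops == 2 * 3 inputs * 2 neurons = 12
```

An accelerator's GOP/s rating is essentially how many of these multiply/add operations it sustains per second across all its functional units.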

1. Introduction

  • As architectures evolve towards
    • heterogeneous multi-cores
    • composed of a mix of cores and accelerators,
  • designing accelerators
    • which realize the best possible tradeoff
    • between flexibility and efficiency
    • is becoming a prominent issue.

Second paragraph

  • The first question:
    • for which category of applications
  • should one primarily design accelerators?

  • Together with the architectural evolution towards
    • accelerators,
  • a second simultaneous and significant trend
    • in high-performance and embedded applications is developing:

  • many of the emerging high-performance
    • and embedded applications,
    • from image/video/audio recognition to automatic translation, business analytics, and all forms of robotics rely on ml techniques.

  • This trend even starts to percolate in our community
    • where it turns out that
  • half of the benchmarks of PARSEC [2],
    • a suite partly introduced to highlight the emergence of new types of applications,
    • can be implemented using machine-learning algorithms [4].

Half of the PARSEC benchmarks can be implemented with ML!

  • a third and equally remarkable trend in ML,
    • where a small number of techniques,
    • based on neural networks (especially CNNs [27] and DNNs [16]),
    • have proved in the past few years to be state-of-the-art across a broad range of applications [25].

  • a unique opportunity to
    • design accelerators
    • which can realize the best of both worlds:
  • significant application scope together with
  • high performance and efficiency
  • due to the limited number of target algorithms.