2014: DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning

  • I found this paper at the following URL!

https://dl.acm.org/doi/10.1145/2654822.2541967

Abstract

  • ML is
  • pervasive in a broad range of domains,
  • and in a broad range of systems (from embedded devices to data centers)

  • At the same time,
    • a small set of ML algorithms (especially Convolutional and Deep Neural Networks, i.e., CNNs and DNNs)
  • are proving to be state-of-the-art across many applications.

  • As architectures evolve towards
    • heterogeneous multi-cores
    • composed of a mix of cores and accelerators,
  • an ML accelerator can achieve
    • the rare combination of efficiency (due to the small number of target algorithms)
    • and broad application scope.

Next paragraph

  • Until now, most ML accelerator designs
    • focused on
    • efficiently implementing the computational part of the algorithms.
  • However, recent state-of-the-art CNNs and DNNs
    • are characterized by their large size.

So what do we do when they are this big?

  • design an accelerator
    • for large-scale CNNs and DNNs,
  • emphasis on the impact of memory on accelerator design, performance and energy.

Third paragraph

  • possible to design an accelerator with
    • a high throughput,
    • capable of performing 452 GOP/s (key NN operations such as synaptic weight multiplications and neurons outputs additions)
    • in a small footprint of 3.02 mm^2 and 485 mW;
  • compared to a 128-bit 2GHz SIMD processor,
    • it is 117.87x faster,
    • and reduces total energy by 21.08x.
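As a quick back-of-envelope check on these figures, the reported throughput, power, and area imply the efficiency ratios below. The derived numbers are my own arithmetic, not values stated in the paper:

```python
# Back-of-envelope ratios derived from the abstract's figures
# (452 GOP/s, 485 mW, 3.02 mm^2 at 65nm layout). The derived
# efficiency numbers are my own arithmetic, not from the paper.

throughput_gops = 452.0   # giga-operations per second
power_w = 0.485           # 485 mW
area_mm2 = 3.02           # layout area at 65nm

energy_eff = throughput_gops / power_w   # GOP per joule (~932)
area_eff = throughput_gops / area_mm2    # GOP/s per mm^2 (~150)

print(f"energy efficiency: {energy_eff:.0f} GOP/J")
print(f"area efficiency:   {area_eff:.0f} GOP/s per mm^2")
```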

  • The accelerator characteristics are obtained
    • after layout at 65nm.
  • Such a high throughput in a small footprint
    • can
    • open up the usage of sota ml algorithms
    • in a broad set of systems and
    • for a broad set of applications.
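The "key NN operations" the abstract counts (synaptic weight multiplications and neuron output additions) reduce to multiply-accumulate loops. A minimal sketch in Python, with illustrative layer sizes that are not taken from the paper:

```python
# Minimal sketch of the "key NN operations" a classifier layer performs:
# synaptic weight multiplications and neuron output additions (MACs).
# Layer sizes below are illustrative, not from the paper.

def neuron_layer(inputs, weights, biases):
    """Fully connected layer: each output neuron is a weighted sum of all inputs."""
    outputs = []
    ops = 0
    for w_row, b in zip(weights, biases):
        acc = b
        for x, w in zip(inputs, w_row):
            acc += x * w   # one multiplication + one addition = 2 ops
            ops += 2
        outputs.append(acc)
    return outputs, ops

# 3 inputs, 2 output neurons
x = [1.0, 2.0, 3.0]
W = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
b = [0.0, 1.0]
y, ops = neuron_layer(x, W, b)
# ops == 2 * 3 inputs * 2 neurons = 12
```

An accelerator's GOP/s rating is essentially how many of these multiply/add operations it sustains per second across all its functional units.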

1. Introduction

  • As architectures evolve towards
    • heterogeneous multi-cores
    • composed of a mix of cores and accelerators,
  • designing accelerators
    • which realize the best possible tradeoff
    • between flexibility and efficiency
    • is becoming a prominent issue.

Second paragraph

  • The first question:
    • for which category of applications
  • should one primarily design accelerators?

  • Together with the architectural evolution towards
    • accelerators,
  • a second simultaneous and significant trend
    • in high-performance and embedded applications is developing:

  • many of the emerging high-performance
    • and embedded applications,
    • from image/video/audio recognition to automatic translation, business analytics, and all forms of robotics rely on ml techniques.

  • This trend even starts to percolate in our community
    • where it turns out that
  • half of the benchmarks of PARSEC [2],
    • a suite partly introduced to highlight the emergence of new types of applications,
    • can be implemented using machine-learning algorithms [4].

Half of the PARSEC benchmarks can be implemented with ML!

  • a third and equally remarkable trend in ML,
    • where a small number of techniques,
    • based on neural networks (especially CNNs [27] and DNNs [16]),
    • have proved in the past few years to be state-of-the-art across a broad range of applications [25].

  • a unique opportunity to
    • design accelerators
    • which can realize the best of both worlds:
  • significant application scope together with
  • high performance and efficiency
  • due to the limited number of target algorithms.