1. Introduction
This paper was published at ISSCC 2017. Recently there has been increasing interest in deep learning for mobile and IoT devices to enable intelligence at the edge, which makes low power a critical design constraint. The researchers introduce a low-power, programmable deep-learning accelerator (DLA).
2. Innovation points
The top-level diagram of the proposed DLA is shown below.
2.1 Four processing elements (PEs) embedded within the weight-storage memory
The accelerator devotes almost its entire die to on-chip storage, minimizing data-movement overhead. However, I think the cost of so much on-chip memory also needs to be taken into account.
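To see why keeping weights on-chip matters so much, the sketch below compares the energy of fetching weights from off-chip DRAM versus on-chip SRAM. The per-access energies and the access count are hypothetical, chosen only to reflect the commonly cited gap of roughly two orders of magnitude; the paper does not report these specific numbers.

```python
# Hypothetical per-access energies in picojoules (illustrative only; real
# values depend on process node, bank size, and interface).
ENERGY_PJ = {"off_chip_dram": 640.0, "on_chip_sram": 5.0}

def fetch_energy_uj(num_accesses: int, source: str) -> float:
    """Total energy (microjoules) to perform num_accesses weight fetches."""
    return num_accesses * ENERGY_PJ[source] * 1e-6

# Assumed 1M weight fetches for one inference pass (hypothetical workload).
accesses = 1_000_000
dram = fetch_energy_uj(accesses, "off_chip_dram")
sram = fetch_energy_uj(accesses, "on_chip_sram")
print(f"DRAM: {dram:.0f} uJ, SRAM: {sram:.0f} uJ, ratio: {dram / sram:.0f}x")
```

Under these assumed numbers, moving all weight traffic on-chip cuts fetch energy by more than 100x, which is why the design trades die area for storage.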
2.2 Adopt a non-uniform memory hierarchy
The non-uniform memory hierarchy trades off small, low-power memory banks for frequently accessed data against larger, high-density, higher-power banks for the large volume of infrequently accessed data.
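The benefit of this trade-off depends on how skewed the access pattern is. A minimal sketch, assuming hypothetical per-access energies for the two bank types (not figures from the paper), computes the average access energy when most accesses hit the small banks:

```python
# Hypothetical bank energies in picojoules per access (illustrative only).
SMALL_BANK_PJ = 5.0    # small, low-power bank holding frequently used data
LARGE_BANK_PJ = 25.0   # large, high-density bank holding bulk data
UNIFORM_PJ = 25.0      # baseline: a uniform hierarchy of large banks only

def avg_access_energy_pj(hot_fraction: float) -> float:
    """Average energy per access when hot_fraction of accesses hit the
    small banks and the rest go to the large banks."""
    return hot_fraction * SMALL_BANK_PJ + (1 - hot_fraction) * LARGE_BANK_PJ

# If 90% of accesses are to frequently used data in the small banks:
nonuniform = avg_access_energy_pj(0.9)   # 0.9*5 + 0.1*25 = 7.0 pJ
savings = 1 - nonuniform / UNIFORM_PJ
print(f"{nonuniform:.1f} pJ per access, {savings:.0%} savings vs uniform")
```

With these assumed numbers, a 90/10 access split yields a 72% reduction in average access energy over a uniform hierarchy, illustrating why the designers accept higher energy on the rarely touched large banks.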
3. Summary
The performance of the chip and the die photo are shown below.