Table of Contents
- Title: UNPU: A 50.6TOPS/W Unified Deep Neural Network Accelerator with 1b-to-16b Fully-Variable Weight Bit-Precision
- Year: 2018
- Venue: ISSCC (paper 13.3)
- Institution: KAIST (Korea Advanced Institute of Science and Technology)
1 Abbreviations
- CL: convolutional layer
- FCL: fully-connected layer
- RL: recurrent layer
- PE: processing element
- UNPU: unified neural processing unit
- IF: input feature
- LBPE: lookup-table-based bit-serial PE
- AFL: aligned feature loader
- OF: output feature
2 Overall architecture
In this paper, we present a unified neural processing unit (UNPU) supporting CLs, RLs, and FCLs with fully-variable weight bit-precision from 1b to 16b.
- reuse of input features across layer types
- the lookup-table-based bit-serial PE (LBPE) is implemented for energy-optimal DNN operations with variable weight bit-precisions from 1b to 16b
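The bit-serial idea behind the LBPE can be illustrated in software: a weight of any precision n is processed one bit at a time, and each set bit contributes a shifted copy of the input feature to the accumulator (the MSB contributes negatively under two's complement). This is a minimal sketch of the bit-serial decomposition only; the function name is mine, and the paper's actual LBPE additionally uses lookup tables of precomputed input-feature partial sums, which this sketch does not model.

```python
def bit_serial_mac(inputs, weights, n_bits):
    """Dot product computed bit-serially over signed n_bits two's-complement weights."""
    acc = 0
    for x, w in zip(inputs, weights):
        # View w as an unsigned n_bits pattern (two's complement).
        w_u = w & ((1 << n_bits) - 1)
        partial = 0
        for b in range(n_bits):          # one weight bit per "cycle"
            if (w_u >> b) & 1:
                if b == n_bits - 1:      # MSB has negative place value
                    partial -= x << b
                else:
                    partial += x << b
        acc += partial
    return acc
```

For example, `bit_serial_mac([3, -2], [5, 7], 4)` decomposes the 4b weights 5 (0101) and 7 (0111) into shift-adds of the inputs, yielding the same result as the direct dot product 3*5 + (-2)*7 = 1. Lowering `n_bits` trades accuracy for fewer cycles, which mirrors how the accelerator saves energy at low precision.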