Literature Notes (7) (ISSCC 2017 14.3)

tiaozhanzhe1900

Title: A 28nm SoC with a 1.2GHz 568nJ/Prediction Sparse Deep-Neural-Network Engine with >0.1 Timing Error Rate Tolerance for IoT Applications
Year: 2017
Conference: ISSCC (14.3)
Institution: Harvard University
Link: https://blog.csdn.net/tiaozhanzhe1900/article/details/83275688
Reference link 1: https://blog.csdn.net/xbinworld/article/details/55118537
Reference link 2: https://mp.weixin.qq.com/s?__biz=MzI3MDQ2MjA3OA==&mid=2247483726&idx=1&sn=46da35379241adb013498a67d15faab4&chksm=ead1fc5fdda67549b2b4d9dd09e1cd33061ece3143f57627203420d143b8ac0cd937e193c142&scene=21#wechat_redirect

1 Abbreviations

PVTA: process/voltage/temperature/aging
FC: fully-connected
DNN: deep-neural-network
IPBUF: input buffer
ReLU: rectified linear unit
MxV: matrix-vector
TC: two's complement
SM: sign-magnitude
RZFF: Razor flip-flops

2 overall architecture

This paper presents a 28nm SoC with a programmable FC-DNN accelerator. Key design points:

elide unnecessary computation to exploit data sparsity
use sign-magnitude number format for weights and datapath computation
improve circuit-level tolerance of timing violations in datapath logic via time borrowing
use Razor timing-violation detection to reduce energy

3 five-stage DNN accelerator

The DNN Engine is a 5-stage SIMD-style programmable sparse matrix-vector (MxV) machine for processing arbitrary DNNs.
SBUF: double-buffered to allow simultaneous reads from the previous layer and writes to the current layer
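The double-buffering idea can be illustrated with a small ping-pong buffer. This is only a software sketch, not the chip's buffer organization: the class name `PingPongBuffer`, its methods, and the toy "layer" that doubles each input are all hypothetical.

```python
class PingPongBuffer:
    """Two banks: one holds the previous layer's outputs (read side),
    the other collects the current layer's outputs (write side)."""

    def __init__(self, size):
        self.banks = [[0.0] * size, [0.0] * size]
        self.read_bank = 0  # index of the bank currently being read

    def read(self, i):
        return self.banks[self.read_bank][i]

    def write(self, i, value):
        # Writes always go to the bank that is NOT being read.
        self.banks[1 - self.read_bank][i] = value

    def swap(self):
        # At a layer boundary the freshly written bank becomes the read bank.
        self.read_bank = 1 - self.read_bank


buf = PingPongBuffer(3)
for i, v in enumerate([1.0, 2.0, 3.0]):   # load the first layer's inputs
    buf.write(i, v)
buf.swap()

# Toy layer (assumption): y[i] = 2 * x[i]. Reads and writes interleave
# without the writes overwriting inputs that are still needed.
for i in range(3):
    buf.write(i, 2.0 * buf.read(i))
inputs_after = [buf.read(i) for i in range(3)]

buf.swap()
outputs = [buf.read(i) for i in range(3)]
```

Because the write side never touches the read bank, the inputs stay intact until the swap at the layer boundary.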

4 zero operand:

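Prior work: clock-gates the functional units to save power, but this leaves bubbles in the pipeline.
This design: zero operands are eliminated dynamically when results are written back to XBUF.
It even skips some small non-zero values.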
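A minimal software sketch of this idea follows, assuming activations are filtered at write-back and stored together with their indices. The function names `write_back` and `sparse_mxv`, the threshold value, and the toy data are illustrative assumptions, not details taken from the paper.

```python
def write_back(activations, threshold=0.0):
    """Model of the write-back step: keep only (index, value) pairs whose
    magnitude exceeds `threshold`, so dropped operands never enter the
    MAC pipeline (no bubbles to gate)."""
    return [(i, a) for i, a in enumerate(activations) if abs(a) > threshold]


def sparse_mxv(weights, packed):
    """Matrix-vector product over the packed operands only.
    Returns the result vector and the number of MACs performed."""
    out = [0.0] * len(weights)
    macs = 0
    for col, a in packed:
        for row in range(len(weights)):
            out[row] += weights[row][col] * a
            macs += 1
    return out, macs


weights = [[1.0, 2.0, 3.0, 4.0],
           [0.5, -1.0, 0.0, 2.0]]
activations = [0.0, 3.0, 0.01, 0.0]   # post-ReLU values: mostly zero

# threshold = 0 drops exact zeros only; a small positive threshold
# also drops the near-zero operand at the cost of a small error.
exact, macs_exact = sparse_mxv(weights, write_back(activations))
approx, macs_approx = sparse_mxv(weights, write_back(activations, threshold=0.05))
```

In this toy case a dense product would need 8 MACs; skipping exact zeros needs 4, and also skipping the small value 0.01 needs 2, with a slightly different result.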

5 error-tolerant operation

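To enable error-tolerant operation, the design adds Razor flip-flops (RZFFs) at the endpoints of the two timing-critical paths: the W-MEM load and the MAC unit. A MUX in the dual-mode RZFF selects between a normal datapath flip-flop function and a latch with time borrowing.
time borrowing:
Flip-flop F2 is replaced with latch L2. Because L2 is transparent while the clock is high, the path leading into the latch can borrow time from the path that follows it, so the data does not have to be ready before the rising clock edge. This is called time borrowing.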
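The timing argument can be checked with a small idealized model. It assumes two consecutive stages with combinational delays `d1` and `d2`, a clock period `period`, and a latch that is transparent for `high_time` after the rising edge; setup time, hold time, and clock-to-Q delay are ignored. All names and numbers (in picoseconds) are hypothetical.

```python
def meets_timing_ff(d1, d2, period):
    """With a flip-flop between the stages, each stage must fit in one period."""
    return d1 <= period and d2 <= period


def meets_timing_latch(d1, d2, period, high_time):
    """With a latch between the stages, stage 1 may finish late by up to
    `high_time`; the borrowed time is taken away from stage 2.
    Returns (timing_met, borrowed_time)."""
    borrowed = max(0, d1 - period)
    if borrowed > high_time:
        return False, borrowed          # data arrives after the latch closes
    return borrowed + d2 <= period, borrowed


PERIOD, HIGH = 1000, 500                # ps, 50% duty cycle (assumption)

ff_ok = meets_timing_ff(1200, 700, PERIOD)
latch_ok, borrowed = meets_timing_latch(1200, 700, PERIOD, HIGH)
latch_fail, _ = meets_timing_latch(1200, 900, PERIOD, HIGH)
```

With a 1200 ps first stage the flip-flop version fails, while the latch version borrows 200 ps and still passes because the second stage has 300 ps of slack. If the second stage needs 900 ps, there is no slack left to lend and timing fails.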

sign-magnitude: compared with two's complement, it reduces switching activity in the MSBs and therefore the number of bit flips
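A quick way to see the effect is to count bit toggles between consecutive words for a sequence of small values that alternate in sign. The 8-bit width and the sample sequence are assumptions chosen for illustration; they are not data from the paper.

```python
def twos_complement(value, bits):
    """Encode a signed integer as a two's-complement bit pattern."""
    return value & ((1 << bits) - 1)


def sign_magnitude(value, bits):
    """Encode a signed integer as sign bit + magnitude.
    Assumes abs(value) < 2 ** (bits - 1)."""
    sign = 1 << (bits - 1) if value < 0 else 0
    return sign | abs(value)


def toggles(words):
    """Total number of bit flips between consecutive words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


BITS = 8
samples = [1, -1, 2, -2, 1, -1]         # small values crossing zero

tc_flips = toggles([twos_complement(v, BITS) for v in samples])
sm_flips = toggles([sign_magnitude(v, BITS) for v in samples])
```

For this sequence the two's-complement encoding flips 35 bits in total, while sign-magnitude flips 9: a sign change in two's complement toggles all the sign-extension bits in the MSBs, whereas sign-magnitude toggles only the sign bit plus the few magnitude bits that differ.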