Literature notes (3) (ISSCC 2018, 13.4)


Source paper: A 9.02mW CNN-Stereo-Based Real-Time 3D Hand-Gesture Recognition Processor for Smart Mobile Devices

1 Abbreviations

HGR: hand-gesture recognition
HMD: head-mounted display
ToF: time-of-flight
PE: processing element
NNS: nearest-neighbor searching
PIM: processing-in-memory
CSE: CNN-stereo engine
ICP-PSO: iterative-closest-point/particle-swarm-optimization
IPE: ICP-PSO engine
FWD: forwarding
BWD: backwarding

2 overall architecture

In this paper, we describe an accurate, low power (<10mW), and real-time 3D HGR processor for smart mobile devices.

Key features:

  • a pipelined CNN processing element with a shift MAC operation
  • triple ping-pong buffers with workload balancing
  • nearest-neighbor searching (NNS) processing-in-memory (PIM) for high energy efficiency

CNN-stereo engine (CSE)

  • two line-streaming CNN cores
  • 4 locally distributed memories
  • 1 matching core

the CNN core

  • 1 pipelined CNN PE
  • a local DMA
  • a forwarding/backwarding unit

ICP-PSO engine (IPE)

  • a NNS unit with 16-way parallel NNS PIMs
  • a hand-tracking unit


3 pipelined CNN PE architecture

The shift MAC operation with a 3×3 filter consists of three stages:

  • shifting feature maps and filters
  • element-wise multiplication
  • partial-sum accumulation

The line-streaming CNN operation is accelerated by the 7-stage pipelined CNN PE, which processes 48 MACs per cycle with 96% core utilization.
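
To make the three stages concrete, here is a minimal software sketch of a shift-based 3×3 MAC that produces one output line from three input lines. It is only an analogy, not the paper's 7-stage hardware pipeline: the function name `shift_mac_3x3` is hypothetical, "valid" (unpadded) borders are assumed, and the filter weight is selected by index rather than physically shifted.

```python
def shift_mac_3x3(rows, kernel):
    """One output line of an unpadded 3x3 convolution (cross-correlation form).

    rows:   three input lines of equal width
    kernel: 3x3 list of filter weights
    """
    assert len(rows) == 3 and len(kernel) == 3
    out_width = len(rows[0]) - 2
    partial = [0] * out_width
    for ky in range(3):
        for kx in range(3):
            # Stage 1: shift the feature-map line by kx positions.
            shifted = rows[ky][kx:kx + out_width]
            # Stage 2: element-wise multiplication with one filter weight.
            products = [kernel[ky][kx] * value for value in shifted]
            # Stage 3: accumulate into the running partial sums.
            partial = [p + q for p, q in zip(partial, products)]
    return partial


lines = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(shift_mac_3x3(lines, ones))  # [54, 63]
```

Each of the nine (shift, multiply, accumulate) passes touches a whole line at once, which is the property that lets hardware perform many MACs per cycle.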

4 triple ping-pong memories

The hardware utilizes triple ping-pong memories to store feature maps, where each memory is accessed simultaneously to feed pipeline inputs, write back pipeline outputs, and to access an external interface, respectively.
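
The role assignment can be sketched as a rotating schedule in which every phase gives each of the three memories a different job. The rotation order below (external interface, then pipeline input, then pipeline output) is an assumption made for illustration, as are the names `ROLES` and `role_schedule`; the notes do not state how the paper actually sequences the memories.

```python
# Hypothetical role names; the order defines the assumed rotation.
ROLES = ("external_io", "pipeline_input", "pipeline_output")


def role_schedule(num_phases):
    """Return, for each phase, a dict mapping role -> memory index (0..2)."""
    schedule = []
    for phase in range(num_phases):
        assignment = {}
        for memory in range(3):
            # Each memory advances one role per phase, so all three roles
            # are always covered by distinct memories.
            assignment[ROLES[(phase + memory) % 3]] = memory
        schedule.append(assignment)
    return schedule


for phase, assignment in enumerate(role_schedule(3)):
    print(phase, assignment)
```

Under this assumed order, a memory is filled over the external interface, then feeds the pipeline, then receives pipeline outputs, so no two roles ever contend for the same memory within a phase.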

Why three?
Instead of storing the entire feature maps on chip, line-streaming processing keeps only 3 to 5 lines of each feature map, which reduces the data that must be transferred to and from off-chip memory by 90.1%.
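
The idea of keeping only a few lines resident can be sketched with a rolling three-line window. This is an illustrative model only: `stream_conv3x3` is a hypothetical name, a 3×3 filter with unpadded borders is assumed, and the sketch does not reproduce the 90.1% figure, which depends on the network and memory sizes used in the paper.

```python
from collections import deque


def stream_conv3x3(line_source, kernel):
    """Yield output lines of an unpadded 3x3 convolution as input lines arrive.

    Only the three most recent input lines are held at any time.
    """
    window = deque(maxlen=3)  # the oldest line is dropped automatically
    for line in line_source:
        window.append(line)
        if len(window) == 3:
            out_width = len(line) - 2
            yield [
                sum(kernel[ky][kx] * window[ky][x + kx]
                    for ky in range(3) for kx in range(3))
                for x in range(out_width)
            ]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(list(stream_conv3x3(iter(image), ones)))  # [[54, 63], [90, 99]]
```

Because the input is consumed as an iterator, the full feature map never has to exist in the buffer, only the current window of lines.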

How is the workload balanced?
The FWD/BWD units keep CNN core workloads identical throughout CNN processing and exchange feature-map boundary data with one another when local feature maps are fetched.
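
A rough software picture of this boundary exchange: split a feature map between two cores, let each core receive one boundary line from its neighbour, and check that both cores end up with the same amount of work. The split along rows, the direction naming (`fwd_line`, `bwd_line`), and the function names are assumptions for illustration; the notes do not specify how the paper partitions the map.

```python
def conv3x3_valid(rows, kernel):
    """Reference unpadded 3x3 convolution over a list of lines."""
    height, width = len(rows), len(rows[0])
    return [
        [sum(kernel[ky][kx] * rows[y + ky][x + kx]
             for ky in range(3) for kx in range(3))
         for x in range(width - 2)]
        for y in range(height - 2)
    ]


def two_core_conv(rows, kernel):
    """Split the lines between two cores and exchange one boundary line each."""
    assert len(rows) % 2 == 0 and len(rows) >= 4
    half = len(rows) // 2
    local0, local1 = rows[:half], rows[half:]
    fwd_line = local0[-1]  # assumed: core 0 forwards its last line to core 1
    bwd_line = local1[0]   # assumed: core 1 sends its first line back to core 0
    out0 = conv3x3_valid(local0 + [bwd_line], kernel)
    out1 = conv3x3_valid([fwd_line] + local1, kernel)
    return out0, out1


feature_map = [[(y * 5 + x) % 7 for x in range(5)] for y in range(6)]
sobel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
out0, out1 = two_core_conv(feature_map, sobel)
print(len(out0), len(out1))  # 2 2
print(out0 + out1 == conv3x3_valid(feature_map, sobel))  # True
```

With one exchanged line per core, each core produces exactly half of the output lines, and the combined result matches the single-core computation.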

5 nearest-neighbor searching (NNS) processing-in-memory (PIM)
