Introduction to Processing in Memory (PIM)

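Processing in memory embeds compute capability in storage resources, reducing the latency and energy cost of data movement and addressing the computing challenges of the post-Moore era. SRAM and RRAM are the main research focus for future PIM media; RRAM in particular has advantages for neural network computation, although the technology still needs to mature. PIM is mainly applied to AI, the metaverse, and low-power scenarios. The industry is at an early stage, and companies are exploring solutions for different levels of compute demand.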


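I. Processing in Memory

Concept
Put simply, processing in memory embeds compute capability in storage resources and uses a new computing architecture to perform two-dimensional and three-dimensional matrix multiplication and addition. This cuts the latency and energy overhead caused by frequent data movement.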

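Background
Multi-core (for example, CPU) and many-core (for example, GPU) parallel acceleration can also raise compute throughput, but in the post-Moore era memory bandwidth limits the effective bandwidth of the whole computing system, and chip compute growth has become very hard to sustain.
In particular, the biggest challenge in accelerating deep learning is the frequent movement of data between compute units and storage units.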
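
The bandwidth limit described above is often illustrated with a roofline-style estimate: attainable throughput is the smaller of peak compute and memory bandwidth multiplied by arithmetic intensity (operations per byte moved). The sketch below is only an illustration of that bound; the function name and every hardware figure in it are hypothetical placeholders, not measurements of any real chip.

```python
def attainable_tops(peak_tops, bandwidth_gb_s, ops_per_byte):
    """Roofline-style bound: min(peak compute, bandwidth * arithmetic intensity)."""
    # bandwidth (GB/s) * intensity (ops/byte) gives Gops/s; divide by 1000 for Tops/s.
    memory_bound_tops = bandwidth_gb_s * ops_per_byte / 1000.0
    return min(peak_tops, memory_bound_tops)


# Hypothetical accelerator: 100 TOPS peak, 500 GB/s memory bandwidth.
PEAK_TOPS = 100.0
BANDWIDTH_GB_S = 500.0

for intensity in (1, 10, 100, 1000):
    tops = attainable_tops(PEAK_TOPS, BANDWIDTH_GB_S, intensity)
    bound = "memory-bound" if tops < PEAK_TOPS else "compute-bound"
    print(f"{intensity:>5} ops/byte -> {tops:6.1f} TOPS ({bound})")
```

At low arithmetic intensity the result stays far below the peak figure, which is the situation the paragraph above describes: adding compute cores does not help while the workload is memory-bound.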

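Solution:
In-memory computing and in-memory logic, that is, PIM technology, use the memory itself to process data or perform computation. Data storage and computation are merged into the same region of the same chip, which removes the data transfer between memory and processor that causes the von Neumann bottleneck.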

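Advantages:

  1. Less unnecessary data movement. (Energy consumption drops to 1/10 to 1/100 of the original.)
  2. Storage cells take part in logic computation, raising compute throughput. (Equivalent to scaling up the number of compute cores without increasing area.)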
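
The first advantage can be made concrete with a simple energy model: total energy is compute energy plus data-movement energy, so when movement dominates, keeping weights in the memory array shrinks the total. The sketch below is a worked arithmetic example only. The function name and all numbers (energy per operation, energy per byte moved, the fraction of bytes that still move) are hypothetical values chosen for illustration, not measured data.

```python
def inference_energy_pj(num_macs, bytes_moved, e_mac_pj, e_move_pj_per_byte):
    """Total energy = compute energy + data-movement energy (picojoules)."""
    return num_macs * e_mac_pj + bytes_moved * e_move_pj_per_byte


# All values below are hypothetical placeholders for illustration only.
NUM_MACS = 1_000_000          # multiply-accumulate operations in one layer
E_MAC_PJ = 1.0                # energy per MAC
E_MOVE_PJ_PER_BYTE = 100.0    # energy to move one byte between memory and compute

# Conventional: assume every MAC fetches one weight byte from separate memory.
conventional = inference_energy_pj(NUM_MACS, NUM_MACS, E_MAC_PJ, E_MOVE_PJ_PER_BYTE)

# Processing in memory: assume weights stay in the array, so only 1% of the
# bytes (inputs and outputs) still have to move.
pim = inference_energy_pj(NUM_MACS, NUM_MACS // 100, E_MAC_PJ, E_MOVE_PJ_PER_BYTE)

print(f"conventional: {conventional / 1e6:.1f} uJ")
print(f"PIM:          {pim / 1e6:.1f} uJ")
print(f"ratio:        {pim / conventional:.3f}")
```

The saving in this model depends entirely on the assumed ratio between movement energy and compute energy, so real figures would have to come from the target process and memory technology.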

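Market demand
The commercial drivers for PIM are mainly the compute demands of AI and the metaverse, together with the widespread use of parallel computing in deep learning. Seen from the application side, market pull for PIM is very strong.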

II. Comparison of Storage Media for Processing in Memory

Memories currently usable for PIM include volatile types such as SRAM and DRAM and non-volatile types (NVRAM) such as NOR Flash, RRAM, and MRAM.

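Flash: Flash, used by early startups, is a non-volatile storage medium with low cost and high reliability, but it faces a clear bottleneck in process scaling.

SRAM: SRAM has advantages in speed and energy efficiency. Since the development of in-memory logic techniques in particular, it has shown markedly high energy efficiency and high precision.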
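
One commonly described form of in-memory logic on SRAM activates two word lines at once, so that the shared bit lines sense a bitwise function of both rows instead of a single stored word. The sketch below is a simplified behavioral model of that idea, not a circuit simulation. The class name `SramArrayModel` and its methods are hypothetical, and the model assumes ideal sensing: the bit line (BL) stays high only if both cells hold 1, and the complementary bit line (BLB) stays high only if both cells hold 0.

```python
class SramArrayModel:
    """Behavioral model of an SRAM array with dual word-line activation (hypothetical)."""

    def __init__(self, rows):
        # Each row is a list of bits (0 or 1); all rows share the same bit lines.
        self.rows = [list(r) for r in rows]

    def read(self, row):
        """Normal read: one word line active."""
        return list(self.rows[row])

    def compute(self, row_a, row_b):
        """Activate two word lines at once and sense both bit lines.

        BL stays high only if both cells hold 1  -> bitwise AND.
        BLB stays high only if both cells hold 0 -> bitwise NOR.
        """
        a, b = self.rows[row_a], self.rows[row_b]
        bl = [x & y for x, y in zip(a, b)]
        blb = [1 - (x | y) for x, y in zip(a, b)]
        return bl, blb


array = SramArrayModel([[1, 1, 0, 0], [1, 0, 1, 0]])
and_bits, nor_bits = array.compute(0, 1)
# XOR can be derived from the two sensed values: neither AND nor NOR.
xor_bits = [1 - (p | q) for p, q in zip(and_bits, nor_bits)]
print(and_bits, nor_bits, xor_bits)
```

Because the operands never leave the array, the logic result is produced where the data is stored, which is the property the SRAM entry above refers to.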

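DRAM: DRAM is cheap and offers large capacity, but it is slow and must be refreshed continuously, which consumes power.

Others: Emerging memories suited to PIM include PCM, MRAM, RRAM, and FRAM. Among them, memristor-based RRAM has particular advantages for neural network computation and, alongside SRAM-based PIM, is the mainstream research direction for next-generation PIM media. RRAM is still 2 to 5 years from process maturity and its materials are not yet stable, but it is fast and structurally simple, and it may become the fastest-growing emerging memory.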
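
The usual explanation for RRAM's fit with neural networks is that a crossbar array performs matrix-vector multiplication in the analog domain: weights are stored as cell conductances, inputs are applied as row voltages, and each column wire sums the cell currents, giving I_j = Σ_i V_i · G_ij (Ohm's law per cell, Kirchhoff's current law per column). The sketch below models only this ideal behavior. It ignores wire resistance, device variation, and converter quantization, and the function name and all values are hypothetical.

```python
def crossbar_mvm(voltages, conductances):
    """Ideal crossbar: column current I_j = sum_i V_i * G_ij."""
    num_cols = len(conductances[0])
    currents = [0.0] * num_cols
    for v, row in zip(voltages, conductances):
        for j, g in enumerate(row):
            currents[j] += v * g   # Ohm's law per cell, summed on the column wire
    return currents


# Hypothetical 3x2 array: conductances in siemens, input voltages in volts.
G = [
    [1e-6, 2e-6],
    [3e-6, 4e-6],
    [5e-6, 6e-6],
]
V = [0.1, 0.2, 0.3]

I = crossbar_mvm(V, G)
print([f"{i:.2e} A" for i in I])
```

In hardware all of the multiply-accumulate steps in the loops happen at once when the voltages are applied, which is why one read of the array corresponds to a full matrix-vector product.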

Judging by R&D trends in both academia and industry, SRAM and RRAM are the mainstream PIM media of the future.

Comparison of several storage media:

[Figure: comparison of storage media; image not available]

III. Summary

Reference: 陈巍, "存算一体技术是什么?" (Chen Wei, "What Is Processing-in-Memory Technology?")

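Well-known research institutions and industry players have identified PIM as one of the next-generation technology trends.

PIM companies in China and abroad are all just getting started, and the gap between them is still small. PIM chips are novel at the design level, with no mature methodology to borrow from.

The industry currently follows two main paths. One starts from low compute, around 1 TOPS, and works upward. It targets audio, health, and low-power vision applications on edge devices, and addresses the chip performance and power problems that hold back AI deployment.

The other targets high-compute scenarios above 100 TOPS. It addresses the demand for large-scale compute, offering high-performance, cost-effective products for autonomous vehicles, general-purpose robotics, intelligent driving, and cloud computing.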
