The Origins of Processing in Memory (PIM)

Preface

  Today I was browsing the papers accepted to HPCA 2024 with no particular goal in mind. The first one to catch my eye was, naturally, the best paper, which is about PIM performance optimization. As it happens I don't know PIM very well, so I've picked PIM as today's topic!

I. An Introduction to PIM

1. What is PIM

  PIM is an architecture that embeds processing units into memory to form a compute-in-memory chip. It allows computation to be performed directly inside the memory of a computer, server, or similar device.

2. The Background Behind PIM

The von Neumann Bottleneck

  The von Neumann bottleneck is the background against which PIM emerged. Under the von Neumann architecture, a computer consists mainly of a central processing unit (CPU), a memory unit, and input/output devices, where the CPU itself contains a control unit, an arithmetic unit, and registers. The CPU and memory are thus independent components that exchange data over the system bus, which introduces a certain amount of transfer latency. Later, as ever more diverse applications and workloads appeared, the first small bottleneck of the von Neumann architecture surfaced: execution was not efficient enough. The response was to raise CPU processing speeds to execute faster, and to increase memory density to enlarge memory capacity, so that more data could be kept in memory and the latency of reading data from disk was reduced. This greatly improved the efficiency of I/O requests, but the CPU still ends up stalling, because some data-transfer latency between memory and the CPU is unavoidable: whenever the CPU needs certain data, it must wait until that data has been transferred in full from memory over the data bus. This is today's von Neumann bottleneck, also known as the memory wall.
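  To make the memory wall concrete, here is a minimal C sketch (my own illustration, not from any paper): a streaming sum performs only one addition per 8-byte load and never reuses data, so the CPU spends most of its time waiting on the bus rather than computing.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Streaming sum: one addition per 8-byte load and no reuse, so the
 * loop is bound by memory bandwidth rather than by the ALU. */
double streaming_sum(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];               /* 1 flop per 8 bytes moved over the bus */
    return s;
}

int main(void) {
    size_t n = (size_t)1 << 26;  /* 512 MiB of doubles, far beyond any cache */
    double *a = malloc(n * sizeof *a);
    if (!a) return 1;
    for (size_t i = 0; i < n; i++) a[i] = 1.0;

    clock_t t0 = clock();
    double s = streaming_sum(a, n);
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* Effective bandwidth = bytes moved / time; on typical machines this
     * approaches DRAM bandwidth while the ALUs sit mostly idle. */
    printf("sum=%.0f  %.2f GB/s\n", s, (double)(n * sizeof *a) / secs / 1e9);

    free(a);
    return 0;
}
```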

Cache

  In response to this problem, caches appeared: the L1, L2, and L3 caches. Caches exist to exploit the principle of locality (temporal and spatial locality of data) to improve data-access performance. To spell it out: temporal locality means that if a piece of data was accessed recently, it is likely to be accessed again in the near future; spatial locality means that if a piece of data is accessed, the data near it is also likely to be accessed. The CPU therefore keeps such data in the cache to speed up access to it.
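  A minimal C sketch of spatial locality (again my own example): the same 2D array is summed twice, once row by row and once column by column. Both loops do identical arithmetic, but the row-major walk touches consecutive addresses and reuses every cache line it fetches, while the column-major walk jumps a whole row ahead on each access and typically runs several times slower.

```c
#include <stdio.h>
#include <time.h>

#define N 2048
static double m[N][N];           /* C stores 2D arrays row-major */

int main(void) {
    double s = 0.0;
    clock_t t0, t1;

    /* Row-major walk: consecutive addresses, so one fetched 64-byte
     * cache line serves 8 doubles in a row (spatial locality). */
    t0 = clock();
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += m[i][j];
    t1 = clock();
    printf("row-major:    %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

    /* Column-major walk: a stride of N*8 bytes between accesses, so
     * nearly every access misses the cache and fetches a new line. */
    t0 = clock();
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += m[i][j];
    t1 = clock();
    printf("column-major: %.3f s  (sum=%f)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC, s);
    return 0;
}
```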
  These three caches sit between the CPU and main memory, shortening the distance data has to travel and thereby reducing transfer latency. They differ mainly in capacity and speed, which correlate with their distance from the CPU (as shown in the figure below): L1 is the smallest and fastest, L2 is somewhat slower but larger, and L3 is the slowest and largest. All three are built from SRAM (static random-access memory), in which each storage cell takes six transistors. SRAM's density is therefore low, although it needs no periodic refresh to retain data; and because the CPU die cannot be made too large, the capacity of these caches is quite limited.
  L1 and L2 are embedded inside the CPU. L1 is typically around 64 KB, each CPU core has its own L1 and L2, and all cores share a single L3, so a quad-core CPU has four L1 caches, four L2 caches, and one L3 cache. Why is L1 so small rather than being made larger? It is a deliberate tradeoff to keep L1 lookups fast: think about it, is it quicker to find something in a small box or in a big one? (The sketch after the figure shows one way to observe these capacity boundaries directly.)
[Figure: the CPU cache hierarchy (L1/L2/L3) between the cores and main memory]
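  One way to see those boundaries for yourself (a rough sketch of a classic measurement, not from the original post) is to chase pointers through a randomly shuffled buffer while growing the working set: the average access latency steps up each time the buffer outgrows L1, then L2, then L3.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static volatile size_t sink;   /* keeps the compiler from deleting the loop */

/* Chase a single random cycle of n indices; every load depends on the
 * previous one, so the time per step approximates raw access latency. */
static double chase_ns(size_t n, size_t steps) {
    size_t *next = malloc(n * sizeof *next);
    for (size_t i = 0; i < n; i++) next[i] = i;
    /* Sattolo's algorithm: a permutation with exactly one cycle,
     * so the walk really visits the whole working set. */
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t t = next[i]; next[i] = next[j]; next[j] = t;
    }
    size_t p = 0;
    clock_t t0 = clock();
    for (size_t s = 0; s < steps; s++) p = next[p];
    double ns = (double)(clock() - t0) / CLOCKS_PER_SEC * 1e9 / (double)steps;
    sink = p;
    free(next);
    return ns;
}

int main(void) {
    /* Sweep the working set from 16 KiB (fits in L1) to 64 MiB (DRAM);
     * latency jumps each time a cache level is exceeded. */
    for (size_t kb = 16; kb <= 64 * 1024; kb *= 2)
        printf("%6zu KiB: %5.1f ns/access\n", kb,
               chase_ns(kb * 1024 / sizeof(size_t), 10000000));
    return 0;
}
```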

NDP (Near-Data Processing) and PIM

  The principle of NDP is to place processing units right next to the data and compute on it directly, breaking with the traditional von Neumann approach of first transferring data to the processor before it can be processed; this attacks the processing latency caused by data transfer at its root. PIM is one instance of this principle: by embedding compute units in memory, it takes over some of the computation that would otherwise fall to the CPU.
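  To show what this buys you, here is a conceptual C sketch, entirely my own illustration: `pim_reduce_add` is a hypothetical API invented for this example, not a real library call (real PIM stacks such as the UPMEM SDK expose their own, different interfaces). A conventional reduction drags every byte of the array across the bus to the CPU, whereas a PIM-style offload sends one command to the memory-side compute units and moves only the 8-byte result back.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Conventional path: the CPU pulls all n*8 bytes over the memory bus
 * just to produce a single 8-byte answer. */
int64_t host_reduce_add(const int64_t *data, size_t n) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];          /* bytes moved: n * sizeof(int64_t) */
    return sum;
}

/* Stub standing in for memory-side execution; on real PIM hardware
 * this loop would run inside the memory device itself, and only the
 * scalar result would cross the bus. Hypothetical API for illustration. */
int64_t pim_reduce_add(const int64_t *data, size_t n) {
    return host_reduce_add(data, n);
}

int64_t sum_array(const int64_t *data, size_t n, int have_pim) {
    /* Bus traffic: ~8 bytes with PIM vs. 8*n bytes without. */
    return have_pim ? pim_reduce_add(data, n)
                    : host_reduce_add(data, n);
}

int main(void) {
    int64_t a[1000];
    for (size_t i = 0; i < 1000; i++) a[i] = (int64_t)i;
    printf("%lld\n", (long long)sum_array(a, 1000, 1));
    return 0;
}
```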

Personal Summary

  I have other things to get to today, so that's it for now. Today's introduction may have been a bit basic, but I'll keep digging deeper in future posts! Have a great weekend, everyone~
