What is "cache-friendly" code?

This article is translated from: What is a "cache-friendly" code?

What is the difference between "cache unfriendly" code and "cache friendly" code?

How can I make sure I write cache-efficient code?


#1

Reference: https://stackoom.com/question/184Eh/什么是-缓存友好-代码


#2

Preliminaries

On modern computers, only the lowest-level memory structures (the registers) can move data around in single clock cycles. However, registers are very expensive, and most computer cores have fewer than a few dozen of them (a few hundred to maybe a thousand bytes in total). At the other end of the memory spectrum (DRAM), memory is very cheap (literally millions of times cheaper) but takes hundreds of cycles after a request to deliver the data. Bridging this gap between super fast and expensive and super slow and cheap are the cache memories, named L1, L2, L3 in order of decreasing speed and cost. The idea is that most executing code hits a small set of variables often, and the rest (a much larger set of variables) infrequently. If the processor can't find the data in the L1 cache, it looks in the L2 cache; if not there, the L3 cache; and if not there, main memory. Each of these "misses" is expensive in time.
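To make this concrete, here is a minimal C++ sketch (my own illustration, not part of the original answer) contrasting a stride-1 traversal of a matrix with a large-stride traversal of the same data. The function names and the row-major layout are assumptions chosen for the example.

```cpp
#include <vector>
#include <cstddef>

// Sum a matrix stored in row-major order (how a flat std::vector lays it out).
// Walking it row by row touches consecutive addresses, so each cache line
// fetched from memory is fully used before the next one is needed.
double sum_row_major(const std::vector<double>& m, std::size_t rows, std::size_t cols) {
    double sum = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            sum += m[i * cols + j];   // stride 1: cache-friendly
    return sum;
}

// The same sum, but walking column by column. Consecutive accesses are
// cols * sizeof(double) bytes apart, so for large matrices almost every
// access lands on a different cache line and far more of them miss.
double sum_col_major(const std::vector<double>& m, std::size_t rows, std::size_t cols) {
    double sum = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            sum += m[i * cols + j];   // stride = cols: cache-unfriendly
    return sum;
}
```

Both functions do the same arithmetic; only the access order differs, which is exactly the kind of difference the answer is describing.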

(The analogy: cache memory is to system memory as system memory is to hard disk storage. Hard disk storage is super cheap but very slow.)

Caching is one of the main methods of reducing the impact of latency. To paraphrase Herb Sutter (cf. the links below): increasing bandwidth is easy, but we can't buy our way out of latency.

Data is always retrieved through the memory hierarchy (smallest == fastest to slowest). A cache hit/miss usually refers to a hit/miss in the highest level of cache in the CPU -- by highest level I mean the largest == slowest. The cache hit rate is crucial for performance, since every cache miss results in fetching data from RAM (or worse ...), which takes a lot of time (hundreds of cycles for RAM, tens of millions of cycles for HDD). In comparison, reading data from the (highest-level) cache typically takes only a handful of cycles.
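As a further illustration (again my own sketch, not from the original answer), the hit rate for the same amount of work depends heavily on data layout: scanning a contiguous std::vector streams through whole cache lines, while scanning a pointer-chasing std::list jumps to scattered heap addresses. The container choice and function names are assumptions for the example.

```cpp
#include <list>
#include <vector>
#include <numeric>

// A std::vector keeps its elements in one contiguous block, so a linear scan
// uses every byte of each cache line it fetches and the hardware prefetcher
// can stay ahead of the loop.
long long sum_vector(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// A std::list scatters its nodes across the heap; each step follows a pointer
// to a potentially cold address, so a large fraction of accesses miss the
// cache and each miss pays the RAM latency described above.
long long sum_list(const std::list<int>& l) {
    return std::accumulate(l.begin(), l.end(), 0LL);
}
```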

In modern computer architectures, the performance bottleneck is leaving the CPU die (e.g. accessing RAM or beyond). This will only get worse over time. Increasing the processor frequency is currently no longer relevant for increasing performance. The problem is memory access. Hardware design efforts in CPUs therefore currently focus heavily on optimizing caches, prefetching, pipelines and concurrency. For instance, modern CPUs spend around 85% of the die on caches and up to 99% on storing and moving data!

There is quite a lot to be said on the subject. Here are a few great references about caches, memory hierarchies and proper programming:
