What is "cache-friendly" code?

This article is translated from: What is a "cache-friendly" code?

What is the difference between "cache unfriendly" code and "cache friendly" code?

How can I make sure I write cache-efficient code?


#1

Reference: https://stackoom.com/question/184Eh/什么是-缓存友好-代码


#2

Preliminaries

On modern computers, only the lowest-level memory structures (the registers) can move data around in single clock cycles. However, registers are very expensive, and most computer cores have fewer than a few dozen of them (a few hundred to maybe a thousand bytes in total). At the other end of the memory spectrum (DRAM), memory is very cheap (literally millions of times cheaper) but takes hundreds of cycles after a request to deliver the data. Bridging this gap between super fast and expensive and super slow and cheap are the cache memories, named L1, L2, L3 in order of decreasing speed and cost. The idea is that most executing code hits a small set of variables often, and the rest (a much larger set of variables) infrequently. If the processor can't find the data in the L1 cache, it looks in the L2 cache; if not there, the L3 cache; and if not there, main memory. Each of these "misses" is expensive in time.
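To make this concrete, here is a minimal C++ sketch (my own illustration, not part of the original answer) contrasting a stride-1 traversal of a matrix with a large-stride traversal of the same data. The function names and the row-major layout are assumptions chosen for the example.

```cpp
#include <vector>
#include <cstddef>

// Sum a matrix stored in row-major order (how a flat std::vector lays it out).
// Walking it row by row touches consecutive addresses, so each cache line
// fetched from memory is fully used before the next one is needed.
double sum_row_major(const std::vector<double>& m, std::size_t rows, std::size_t cols) {
    double sum = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            sum += m[i * cols + j];   // stride 1: cache-friendly
    return sum;
}

// The same sum, but walking column by column. Consecutive accesses are
// cols * sizeof(double) bytes apart, so for large matrices almost every
// access lands on a different cache line and far more of them miss.
double sum_col_major(const std::vector<double>& m, std::size_t rows, std::size_t cols) {
    double sum = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            sum += m[i * cols + j];   // stride = cols: cache-unfriendly
    return sum;
}
```

Both functions do the same arithmetic; only the access order differs, which is exactly the kind of difference the answer is describing.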

(The analogy: cache memory is to system memory as system memory is to hard disk storage. Hard disk storage is super cheap but very slow.)

Caching is one of the main methods of reducing the impact of latency. To paraphrase Herb Sutter (cf. the links below): increasing bandwidth is easy, but we can't buy our way out of latency.

Data is always retrieved through the memory hierarchy (smallest == fastest to slowest). A cache hit/miss usually refers to a hit/miss in the highest level of cache in the CPU -- by highest level I mean the largest == slowest. The cache hit rate is crucial for performance, since every cache miss results in fetching data from RAM (or worse ...), which takes a lot of time (hundreds of cycles for RAM, tens of millions of cycles for HDD). In comparison, reading data from the (highest-level) cache typically takes only a handful of cycles.
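As a further illustration (again my own sketch, not from the original answer), the hit rate for the same amount of work depends heavily on data layout: scanning a contiguous std::vector streams through whole cache lines, while scanning a pointer-chasing std::list jumps to scattered heap addresses. The container choice and function names are assumptions for the example.

```cpp
#include <list>
#include <vector>
#include <numeric>

// A std::vector keeps its elements in one contiguous block, so a linear scan
// uses every byte of each cache line it fetches and the hardware prefetcher
// can stay ahead of the loop.
long long sum_vector(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// A std::list scatters its nodes across the heap; each step follows a pointer
// to a potentially cold address, so a large fraction of accesses miss the
// cache and each miss pays the RAM latency described above.
long long sum_list(const std::list<int>& l) {
    return std::accumulate(l.begin(), l.end(), 0LL);
}
```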

In modern computer architectures, the performance bottleneck is leaving the CPU die (e.g. accessing RAM or beyond). This will only get worse over time. Increasing the processor frequency is currently no longer relevant for increasing performance. The problem is memory access. Hardware design efforts in CPUs therefore currently focus heavily on optimizing caches, prefetching, pipelines and concurrency. For instance, modern CPUs spend around 85% of the die on caches and up to 99% on storing and moving data!

There is quite a lot to be said on the subject. Here are a few great references about caches, memory hierarchies and proper programming:
