CPU Cache Flushing Fallacy

Even from highly experienced technologists I often hear talk about how certain operations cause a CPU cache to “flush”.  This seems to be illustrating a very common fallacy about how CPU caches work, and how the cache sub-system interacts with the execution cores.  In this article I will attempt to explain the function CPU caches fulfil, and how the cores, which execute our programs of instructions, interact with them.  For a concrete example I will dive into one of the latest Intel x86 server CPUs.  Other CPUs use similar techniques to achieve the same ends.

Most modern systems that execute our programs are shared-memory multi-processor systems in design.  A shared-memory system has a single memory resource that is accessed by 2 or more independent CPU cores.  Latency to main memory is highly variable from 10s to 100s of nanoseconds.  Within 100ns it is possible for a 3.0GHz CPU to process up to 1200 instructions.  Each Sandy Bridge core is capable of retiring up to 4 instructions-per-cycle (IPC) in parallel.  CPUs employ cache sub-systems to hide this latency and allow them to exercise their huge capacity to process instructions.  Some of these caches are small, very fast, and local to each core; others are slower, larger, and shared across cores.  Together with registers and main-memory, these caches make up our non-persistent memory hierarchy.

Next time you are developing an important algorithm, try pondering that a cache-miss is a lost opportunity to have executed ~500 CPU instructions!  This is for a single-socket system, on a multi-socket system you can effectively double the lost opportunity as memory requests cross socket interconnects.


文章转自 并发编程网-ifeve.com

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值