The Three Levels of CPU Cache

JebLin02

Last modified 2023-12-02 16:56:47

Tags: cache, jvm, java

First published 2023-07-03 10:24:28

Article link: https://blog.csdn.net/Sword52888/article/details/125874443

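The CPU cache is usually split into three levels of different sizes: L1 Cache, L2 Cache, and L3 Cache. L3 is shared by multiple cores.

When a program runs, data in memory is first loaded into the shared L3 Cache, then into the L2 Cache that is private to each core, and finally into the fastest L1 Cache, after which the CPU reads it. The hierarchy between them is as follows.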

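The L1 and L2 caches are private to each core.

The L3 cache is shared by multiple cores.

The closer a cache is to the CPU, the faster it can be accessed.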

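One clock cycle = the reciprocal of the CPU clock frequency. For example, on a CPU running at 2 GHz, one clock cycle = 0.5 ns.

Take the 12th-generation Core i9-12900K processor listed on JD.com: if its clock reaches 5 GHz, one clock cycle = 0.2 ns.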
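The arithmetic above can be sketched in a few lines of Java. The class and method names (`CycleTime`, `nanosPerCycle`) are made up for illustration, and the two frequencies are simply the ones used in the text.

```java
// Sketch: convert a clock frequency into the duration of one clock cycle.
class CycleTime {
    /** Returns the length of one clock cycle in nanoseconds for a frequency given in GHz. */
    static double nanosPerCycle(double frequencyGHz) {
        // 1 GHz = 1e9 cycles per second and 1 s = 1e9 ns, so the two factors cancel.
        return 1.0 / frequencyGHz;
    }

    public static void main(String[] args) {
        System.out.println("2 GHz: " + nanosPerCycle(2.0) + " ns per cycle");
        System.out.println("5 GHz: " + nanosPerCycle(5.0) + " ns per cycle");
    }
}
```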

How the CPU cache reads data

The CPU cache is filled from memory one small block at a time, not one array element at a time. Each such block held in the CPU cache is called a Cache Line.

[root@iZ2zej4i2jdf3gutsuefm1Z]$ ll  /sys/devices/system/cpu/cpu0/cache/
total 0
drwxr-xr-x 2 root root 0 Jul 19 17:02 index0 // L1 data cache
drwxr-xr-x 2 root root 0 Oct 28  2019 index1 // L1 instruction cache
drwxr-xr-x 2 root root 0 Oct 28  2019 index2 // L2 cache
drwxr-xr-x 2 root root 0 Oct 28  2019 index3 // L3 cache

# Check the cache line size; it is typically 64 bytes
[root@iZ2zej4i2jdf3gutsuefm1Z index0]$ cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
64

# Check the size of each cache level
// L1 data cache: 32K
[root@iZ2zej4i2jdf3gutsuefm1]$ cat /sys/devices/system/cpu/cpu0/cache/index0/size
32K 

// L1 instruction cache: 32K
[root@iZ2zej4i2jdf3gutsuefm1]$ cat /sys/devices/system/cpu/cpu0/cache/index1/size
32K

// L2 cache: 256K
[root@iZ2zej4i2jdf3gutsuefm1]$ cat /sys/devices/system/cpu/cpu0/cache/index2/size
256K

// L3 cache: 40M
[root@iZ2zej4i2jdf3gutsuefm1]$ cat /sys/devices/system/cpu/cpu0/cache/index3/size
40960K
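The same sysfs files can also be read from a program. The following is a minimal sketch, assuming a Linux machine that exposes the directory shown above. The `level` and `type` attribute files are not shown in the listings above, but Linux normally provides them next to `size` and `coherency_line_size`. The class and helper names (`CacheInfo`, `parseSizeToBytes`, `readAttr`) are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: print the cache attributes of cpu0 from sysfs (Linux only).
class CacheInfo {
    /** Parses a sysfs size string such as "32K" or "40960K" into bytes. */
    static long parseSizeToBytes(String text) {
        String s = text.trim().toUpperCase();
        long multiplier = 1;
        if (s.endsWith("K")) {
            multiplier = 1024L;
            s = s.substring(0, s.length() - 1);
        } else if (s.endsWith("M")) {
            multiplier = 1024L * 1024L;
            s = s.substring(0, s.length() - 1);
        }
        return Long.parseLong(s) * multiplier;
    }

    /** Reads one attribute file, or returns "n/a" if it is missing or unreadable. */
    static String readAttr(Path dir, String name) {
        try {
            byte[] raw = Files.readAllBytes(dir.resolve(name));
            return new String(raw, StandardCharsets.UTF_8).trim();
        } catch (IOException | RuntimeException e) {
            return "n/a";
        }
    }

    public static void main(String[] args) {
        Path base = Paths.get("/sys/devices/system/cpu/cpu0/cache");
        for (int i = 0; i < 4; i++) {
            Path dir = base.resolve("index" + i);
            if (!Files.isDirectory(dir)) {
                continue; // not Linux, or this cache index does not exist
            }
            String size = readAttr(dir, "size");
            String bytes = size.equals("n/a") ? "n/a" : String.valueOf(parseSizeToBytes(size));
            System.out.println("index" + i
                    + " level=" + readAttr(dir, "level")
                    + " type=" + readAttr(dir, "type")
                    + " size=" + size + " (" + bytes + " bytes)"
                    + " line=" + readAttr(dir, "coherency_line_size"));
        }
    }
}
```

On a system without this sysfs directory the program prints nothing, because every index is skipped.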

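For example, take an array declared as int array[100]. Each element occupies only 4 bytes in memory, far less than 64 bytes, so it takes 16 elements to fill one 64-byte cache line. When array[0] is loaded, the CPU therefore loads the elements sequentially up to array[15] (assuming array[0] sits at the start of a cache line), which means array[0] to array[15] are all cached in the CPU Cache. The next access to any of these elements is served directly from the CPU Cache instead of memory, which greatly improves the performance of CPU data reads.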
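The mapping from element index to cache line can be worked out with integer division. This sketch assumes a 64-byte cache line and an array whose first element is aligned to the start of a line. Neither is guaranteed on a real machine, and the names `CacheLineIndex` and `lineOf` are illustrative.

```java
// Sketch: which cache line (counted from the start of the array) holds a given element?
class CacheLineIndex {
    static final int LINE_SIZE_BYTES = 64; // assumed cache line size

    /** Returns the zero-based cache line that contains the element, assuming element 0 is line-aligned. */
    static int lineOf(int elementIndex, int elementSizeBytes) {
        return (elementIndex * elementSizeBytes) / LINE_SIZE_BYTES;
    }

    public static void main(String[] args) {
        int intSize = Integer.BYTES; // 4 bytes per int
        System.out.println(lineOf(0, intSize));  // 0
        System.out.println(lineOf(15, intSize)); // 0, still the first line
        System.out.println(lineOf(16, intSize)); // 1, the second line starts here
        System.out.println(lineOf(99, intSize)); // 6
    }
}
```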

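Characteristics:

The larger the cache line, the better it exploits spatial locality, but the longer each line takes to load.

The smaller the cache line, the less it exploits spatial locality, but the faster each line loads.

64 bytes is the trade-off that most mainstream CPUs currently settle on.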
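The spatial locality behind this trade-off can be observed by summing the same array twice: once sequentially, and once in steps of 16 ints (64 bytes), so that consecutive accesses land on different cache lines. This is a rough sketch rather than a rigorous benchmark. It assumes a 64-byte line, the timings depend on the JIT, the hardware prefetcher, and the machine, and the names `StrideTraversal` and `sumWithStride` are made up for illustration.

```java
import java.util.Arrays;

// Sketch: the same work with a cache-friendly and a cache-unfriendly access order.
class StrideTraversal {
    /** Sums every element exactly once, visiting the array in steps of the given stride. */
    static long sumWithStride(int[] data, int stride) {
        long sum = 0;
        for (int offset = 0; offset < stride; offset++) {
            for (int i = offset; i < data.length; i += stride) {
                sum += data[i];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1 << 24]; // 16M ints = 64 MB
        Arrays.fill(data, 1);
        for (int stride : new int[] {1, 16}) {
            long start = System.nanoTime();
            long sum = sumWithStride(data, stride);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("stride " + stride + ": sum=" + sum + ", " + elapsedMs + " ms");
        }
    }
}
```

Both passes compute the same sum. Only the order of memory accesses differs, so any gap in the timings comes from how well each order reuses loaded cache lines.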

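Cache coherence: the demo below times two threads writing to array elements that either share a cache line or sit on different lines.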

public class CAcheLineMain {


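    // Each cache line occupies 64 bytes.
    // Note: volatile applies to the array reference, not to the individual elements.
    public static volatile long[] arr1 = new long[2]; // each long occupies 8 bytes
    public static volatile long[] arr2 = new long[16]; // each long occupies 8 bytes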

    public static void main(String[] args) throws InterruptedException {
//        testCacheLine1();
        testCacheLine2();
    }


    public static void testCacheLine1() throws InterruptedException {
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 2_000_000_000L; i++) {
                arr1[0] = i;
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 2_000_000_000L; i++) {
                arr1[1] = i; // note the difference: index 1 is only 8 bytes after index 0, so both usually share one cache line
            }
        });

        final long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println((System.nanoTime() - start) / 1_000_000 + " ms");
    }

    public static void testCacheLine2() throws InterruptedException {
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 2_000_000_000L; i++) {
                arr2[0] = i;
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 2_000_000_000L; i++) {
                arr2[8] = i; // note the difference: each long is 8 bytes, so index 8 is the 9th element, 64 bytes past index 0, and not on the same cache line
            }
        });

        final long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println((System.nanoTime() - start) / 1_000_000 + " ms");
    }


}
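In Java, declaring an array field `volatile` makes only the array reference volatile, not its elements. As a hedged variant of the demo above, the sketch below uses `java.util.concurrent.atomic.AtomicLongArray`, whose `set` gives each element write volatile semantics, and runs both layouts back to back. It assumes a 64-byte cache line. The class name `FalseSharingSketch`, the helper `timeWrites`, and the iteration count are illustrative choices, and the timings will vary by machine.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Sketch: two writer threads on adjacent slots versus slots 64 bytes apart.
class FalseSharingSketch {
    /** Runs two threads that write to slots[indexA] and slots[indexB]; returns elapsed milliseconds. */
    static long timeWrites(AtomicLongArray slots, int indexA, int indexB, int iterations) {
        Thread a = new Thread(() -> {
            for (int i = 1; i <= iterations; i++) {
                slots.set(indexA, i); // volatile write to a single element
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 1; i <= iterations; i++) {
                slots.set(indexB, i);
            }
        });
        long start = System.nanoTime();
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int iterations = 100_000_000;
        AtomicLongArray slots = new AtomicLongArray(16);
        // Indices 0 and 1 are 8 bytes apart and usually share a cache line.
        System.out.println("adjacent (0, 1): " + timeWrites(slots, 0, 1, iterations) + " ms");
        // Indices 0 and 8 are 64 bytes apart, so they fall on different 64-byte lines.
        System.out.println("spaced   (0, 8): " + timeWrites(slots, 0, 8, iterations) + " ms");
    }
}
```

If the spaced run is clearly faster, the gap is the cost of the two cores repeatedly invalidating each other's copy of the shared line, which is commonly called false sharing.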