Concurrency: LongAdder Source Code Analysis

Latest recommended article published on 2022-08-16 17:25:08

m0_58568357

Category: java. Tags: java, multithreading

Article link: https://blog.csdn.net/m0_58568357/article/details/119719026

LongAdder Source Code Analysis

1. Performance comparison of AtomicLong and LongAdder
2. Analyzing the cause: AtomicLong
3. LongAdder analysis
- 3.1 Class hierarchy
4. Summary

1. Performance comparison of AtomicLong and LongAdder

Let's go straight to the code.
Start 16 threads, each of which performs 10 million increments.

AtomicLong

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class Main1 {

    private static final AtomicLong count = new AtomicLong(0);
    private static final int THREAD_COUNTS = 16;
    private static final int INCREMENT_COUNT = 10000000;

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        List<Thread> list = new ArrayList<>();
        for (int i = 0; i < THREAD_COUNTS; i++) {
            // Every thread increments the same AtomicLong.
            list.add(new Thread(() -> {
                for (int i1 = 0; i1 < INCREMENT_COUNT; i1++) {
                    count.incrementAndGet();
                }
            }));
        }

        for (int i = 0; i < list.size(); i++) {
            list.get(i).start();
        }
        // Wait for all threads to finish before reading the clock.
        for (int i = 0; i < list.size(); i++) {
            list.get(i).join();
        }
        long end = System.currentTimeMillis();
        System.out.println("Elapsed time (ms): " + (end - start));
    }
}

Result:
[Image: console output showing the elapsed time]

LongAdder

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

public class Main2 {

    private static final LongAdder count = new LongAdder();
    private static final int THREAD_COUNTS = 16;
    private static final int INCREMENT_COUNT = 10000000;

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        List<Thread> list = new ArrayList<>();
        for (int i = 0; i < THREAD_COUNTS; i++) {
            // Every thread increments the same LongAdder.
            list.add(new Thread(() -> {
                for (int i1 = 0; i1 < INCREMENT_COUNT; i1++) {
                    count.increment();
                }
            }));
        }

        for (int i = 0; i < list.size(); i++) {
            list.get(i).start();
        }
        // Wait for all threads to finish before reading the clock.
        for (int i = 0; i < list.size(); i++) {
            list.get(i).join();
        }
        long end = System.currentTimeMillis();
        System.out.println("Elapsed time (ms): " + (end - start));
    }
}

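Result:
[Image: console output showing the elapsed time]
The results show clearly that LongAdder is much faster than AtomicLong for contended atomic increments. Under high concurrency it is therefore the recommended choice for counters that are updated far more often than they are read.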

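2. Analyzing the cause: AtomicLong

First, look at the source code.

The increment delegates to the unsafe object to perform the atomic update. As the comments in the source indicate, the atomic operation works on a single value field.

It first reads the object's current value, then tries to CAS it to value + 1.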
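The read-then-CAS cycle described above can be sketched with the public AtomicLong API. This is only an illustration: the class and method names below are made up for this example, and the real JDK code goes through Unsafe and may be compiled down to a single hardware instruction on some platforms.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasIncrementSketch {

    // Hypothetical helper: incrementAndGet() written as an explicit CAS retry loop.
    static long incrementAndGet(AtomicLong counter) {
        for (;;) {
            long current = counter.get();   // read the current value
            long next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;                // nobody changed the value in between
            }
            // CAS failed because another thread won the race; read again and retry.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    incrementAndGet(counter);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.get()); // 400000
    }
}
```

Every failed compareAndSet is a wasted round trip, and the more threads hit the same variable, the more often that happens.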

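Summary:
The source shows that atomic classes such as AtomicLong update a single internal variable. That is fine with one thread, but when many threads compete to update the variable at the same time, most CAS attempts fail and each thread keeps retrying until its update succeeds, which wastes CPU time. LongAdder was introduced to address this.

To make the rest easier to follow, here is the conclusion up front:
LongAdder is faster because it reduces CAS contention. The single hot variable is split into several, and each thread is hashed to a Cell object that it increments, so threads rarely collide and throughput goes up.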
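A minimal sketch of the striping idea, assuming a fixed number of stripes and a simple hash of the current thread. StripedCounterSketch is a hypothetical class written for this article, not the JDK implementation: it has no base field, no lazy Cell creation, no resizing and no padding.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounterSketch {
    private final AtomicLongArray stripes;
    private final int mask;

    StripedCounterSketch(int stripeCount) {
        if (Integer.bitCount(stripeCount) != 1) {
            throw new IllegalArgumentException("stripeCount must be a power of two");
        }
        stripes = new AtomicLongArray(stripeCount);
        mask = stripeCount - 1;
    }

    void increment() {
        // Spread the thread's identity hash, then mask it down to a stripe index.
        int h = Thread.currentThread().hashCode();
        h ^= h >>> 16;
        stripes.incrementAndGet(h & mask);
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < stripes.length(); i++) {
            total += stripes.get(i);
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        StripedCounterSketch counter = new StripedCounterSketch(8);
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    counter.increment();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.sum()); // 400000
    }
}
```

Threads that land on different stripes never fail each other's CAS. Neighbouring stripes of an AtomicLongArray can still share a cache line, which is the problem that padding solves in the real class.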

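3. LongAdder analysis

Start with the LongAdder class itself.

3.1 Class hierarchy

[Image: LongAdder class diagram]
LongAdder extends Striped64 and implements Serializable. The class to focus on is Striped64.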

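3.11 The Striped64 class

The Cell inner class

[Image: source code of the Cell class]
The class holds a single value field. As with AtomicLong, an update to a LongAdder ends up as an update to the value field of a Cell (or to base when there is no contention).

The @sun.misc.Contended annotation
This annotation exists to prevent false sharing.

Before explaining false sharing, let's look at how CPU caches work.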

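3.12 CPU caches and false sharing

Caches exist to bridge the gap between the speed of main memory and the speed of the processor. Modern computers are generally multi-core, and their cache hierarchy typically looks like this:

[Image: multi-core cache hierarchy]
Each core has its own L1 and L2 caches, and the cores share an L3 cache. Access gets slower at each level from top to bottom. Because each core's cache holds copies of data from memory, the hardware has to keep those copies consistent: when one core modifies data in its own cache, the whole cache line containing that data must be invalidated in the other cores' caches. Once a cache line has been invalidated, the next access to it misses and the line has to be fetched again from a slower level (the shared cache, another core's cache, or main memory).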

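Cache lines
A cache line on modern processors is typically 64 bytes, while a Cell object takes about 24 bytes: a 16-byte object header plus the 8-byte value field (the exact layout depends on JVM settings). A single cache line can therefore hold two Cell objects.

If those two Cells are being updated by different threads, then every time one thread increments its Cell, the cache line is invalidated in the other thread's core. That slows down the other thread's increment even though the two threads never touch the same variable. This is false sharing, and the @sun.misc.Contended annotation is used to prevent it.

The annotation tells the JVM to pad the object (128 bytes of padding by default), so that each Cell in the cells array ends up on cache lines of its own and updates to different Cells no longer interfere with each other.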
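A rough way to see false sharing on your own machine, assuming 64-byte cache lines: two threads each increment their own slot of an AtomicLongArray, first with neighbouring slots and then with slots far apart. FalseSharingSketch is a hypothetical demo class, not a rigorous benchmark, and the timings depend on the hardware and the JIT.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class FalseSharingSketch {
    static final int ITERATIONS = 2_000_000;

    // Two threads each increment their own slot; returns the elapsed time in nanoseconds.
    static long run(AtomicLongArray slots, int indexA, int indexB) {
        Thread a = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) {
                slots.incrementAndGet(indexA);
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < ITERATIONS; i++) {
                slots.incrementAndGet(indexB);
            }
        });
        long start = System.nanoTime();
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Slots 0 and 1 are 8 bytes apart, so they almost certainly share a cache line.
        long adjacent = run(new AtomicLongArray(64), 0, 1);
        // Slots 16 and 48 are 256 bytes apart, so they sit on different 64-byte lines.
        long spaced = run(new AtomicLongArray(64), 16, 48);
        System.out.println("adjacent slots: " + adjacent / 1_000_000 + " ms");
        System.out.println("spaced slots:   " + spaced / 1_000_000 + " ms");
    }
}
```

On typical multi-core hardware the adjacent case tends to be noticeably slower, although neither thread ever writes the other's slot.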

3.13 Fields

    /**
     * Table of cells. When non-null, size is a power of 2.
     */
    transient volatile Cell[] cells;

    /**
     * Base value, used mainly when there is no contention, but also as
     * a fallback during table initialization races. Updated via CAS.
     */
    transient volatile long base;

    /**
     * Spinlock (locked via CAS) used when resizing and/or creating Cells.
     */
    transient volatile int cellsBusy;

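cells: the array holding the Cell objects that values are accumulated into
base: when there is no contention, updates are applied directly to base
cellsBusy: a lock implemented with CAS. 0 means unlocked, 1 means locked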
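The cellsBusy flag can be pictured as the following try-lock, a minimal sketch that uses AtomicInteger in place of the Unsafe-based CAS in Striped64. The class and method names are assumptions made for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CellsBusyLockSketch {
    // 0 = unlocked, 1 = locked; plays the role of the cellsBusy field.
    private final AtomicInteger busy = new AtomicInteger(0);

    // Mirrors the pattern "cellsBusy == 0 && casCellsBusy()": a cheap read first, then one CAS.
    boolean tryLock() {
        return busy.get() == 0 && busy.compareAndSet(0, 1);
    }

    // Only the holder calls this, so a plain volatile write is enough.
    void unlock() {
        busy.set(0);
    }

    public static void main(String[] args) {
        CellsBusyLockSketch lock = new CellsBusyLockSketch();
        System.out.println(lock.tryLock()); // true
        System.out.println(lock.tryLock()); // false, already held
        lock.unlock();
        System.out.println(lock.tryLock()); // true
    }
}
```

There is no blocking and no queue: a thread that fails to take the lock simply does something else on its next loop iteration.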

Next, look at the increment method. increment() simply calls add(1L), so the logic lives in add():

3.13 increment()

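    public void add(long x) {
        Cell[] as; long b, v; int m; Cell a;
        // If cells is non-null, go straight into the block below.
        // If cells is null, no contention has been seen yet, so try to
        // CAS the update directly onto base.
        if ((as = cells) != null || !casBase(b = base, b + x)) {
            // We get here in one of two cases:
            //   1. cells is non-null: contention has already occurred;
            //   2. the CAS on base failed: another thread is competing right now.
            boolean uncontended = true;
            // Call longAccumulate when:
            //   1. cells is null (or empty): the table has to be created;
            //   2. cells exists but this thread's slot holds no Cell yet:
            //      the Cell has to be created;
            //   3. the slot holds a Cell but the CAS on it failed.
            // Otherwise the CAS on the Cell succeeded and we are done.
            if (as == null || (m = as.length - 1) < 0 ||
                (a = as[getProbe() & m]) == null ||
                !(uncontended = a.cas(v = a.value, v + x)))
                longAccumulate(x, null, uncontended);
        }
    }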
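The expression as[getProbe() & m] picks a slot by masking the thread's probe value with length - 1. Because the table length is always a power of two, the mask gives the same result as a non-negative modulo. A small sketch (slotFor is a hypothetical helper, not a JDK method):

```java
class ProbeMaskSketch {

    // Hypothetical helper mirroring the index computation in add().
    static int slotFor(int probe, int tableLength) {
        return probe & (tableLength - 1); // tableLength must be a power of two
    }

    public static void main(String[] args) {
        int[] probes = {13, 42, -1, 0x9E3779B9};
        for (int probe : probes) {
            int byMask = slotFor(probe, 8);
            int byMod = Math.floorMod(probe, 8);
            System.out.println(probe + " -> slot " + byMask + " (floorMod gives " + byMod + ")");
        }
    }
}
```

The mask also works for negative probe values, which a plain % would turn into a negative index.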

longAccumulate()

    final void longAccumulate(long x, LongBinaryOperator fn,
                              boolean wasUncontended) {
        int h;
        if ((h = getProbe()) == 0) {
            ThreadLocalRandom.current(); // force initialization
            h = getProbe();
            wasUncontended = true;
        }
        boolean collide = false;                // True if last slot nonempty
       
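        // The core of the method is an infinite loop:
        //  1. cells is non-null and non-empty: look up this thread's slot.
        //       If the slot holds a Cell, try to CAS the update onto it;
        //       return on success, otherwise keep looping.
        //       If the slot is empty, create a new Cell. This is lazy
        //       initialization: a Cell is not created until it is needed.
        //       Creation is guarded by the CAS lock whose flag is cellsBusy.
        //  2. The CAS on the Cell failed: fall through and check whether the
        //       table may still grow (n < NCPU and not stale). If the next
        //       attempt collides again, the table is doubled in size.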
       for (;;) {
            Cell[] as; Cell a; int n; long v;
            // Case 1: the cells table exists
            if ((as = cells) != null && (n = as.length) > 0) {
                if ((a = as[(n - 1) & h]) == null) {
                    if (cellsBusy == 0) {       // Try to attach new Cell
                        Cell r = new Cell(x);   // Optimistically create
                        if (cellsBusy == 0 && casCellsBusy()) {
                            boolean created = false;
                            try {               // Recheck under lock
                                Cell[] rs; int m, j;
                                if ((rs = cells) != null &&
                                    (m = rs.length) > 0 &&
                                    rs[j = (m - 1) & h] == null) {
                                    rs[j] = r;
                                    created = true;
                                }
                            } finally {
                                cellsBusy = 0;
                            }
                            if (created)
                                break;
                            continue;           // Slot is now non-empty
                        }
                    }
                    collide = false;
                }
                else if (!wasUncontended)       // CAS already known to fail
                    wasUncontended = true;      // Continue after rehash
                else if (a.cas(v = a.value, ((fn == null) ? v + x :
                                             fn.applyAsLong(v, x))))
                    break;
                // Case 2: the CAS on the Cell failed; decide whether to expand the table
                else if (n >= NCPU || cells != as)
                    collide = false;            // At max size or stale
                else if (!collide)
                    collide = true;
                else if (cellsBusy == 0 && casCellsBusy()) {
                    try {
                        if (cells == as) {      // Expand table unless stale
                            Cell[] rs = new Cell[n << 1];
                            for (int i = 0; i < n; ++i)
                                rs[i] = as[i];
                            cells = rs;
                        }
                    } finally {
                        cellsBusy = 0;
                    }
                    collide = false;
                    continue;                   // Retry with expanded table
                }
                // Rehash this thread's probe so that the next iteration
                // maps to a different slot and threads spread out more evenly
                h = advanceProbe(h);
            }

            // Case 3: cells is null.
            //   1. Create a new cells array of length 2.
            //   2. Create a Cell holding x and store it in this thread's slot.
            else if (cellsBusy == 0 && cells == as && casCellsBusy()) {
                boolean init = false;
                try {                           // Initialize table
                    if (cells == as) {
                        Cell[] rs = new Cell[2];
                        rs[h & 1] = new Cell(x);
                        cells = rs;
                        init = true;
                    }
                } finally {
                    cellsBusy = 0;
                }
                if (init)
                    break;
            }
            else if (casBase(v = base, ((fn == null) ? v + x :
                                        fn.applyAsLong(v, x))))
                break;                          // Fall back on using base
        }
    }
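advanceProbe(h) rehashes the probe with a xorshift step, so a thread that keeps colliding moves to another slot. The sketch below reproduces only the arithmetic. The real method also writes the new probe back into the current Thread object, which is omitted here.

```java
class ProbeRehashSketch {

    // The xorshift steps (13, 17, 5), without storing the result in the Thread.
    static int advanceProbe(int probe) {
        probe ^= probe << 13;
        probe ^= probe >>> 17;
        probe ^= probe << 5;
        return probe;
    }

    public static void main(String[] args) {
        int h = 1;
        int tableLength = 8;
        for (int i = 0; i < 5; i++) {
            System.out.println("probe " + h + " -> slot " + (h & (tableLength - 1)));
            h = advanceProbe(h);
        }
    }
}
```

A non-zero probe never becomes zero under these steps, which is why longAccumulate can use 0 to mean "probe not initialized yet".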

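As the code shows, when several threads collide the cells array is expanded. The array stops growing once its length reaches the number of processors on the machine, though, so with more threads than processors some threads still share a Cell, contention remains, and increments slow down. Because each padded Cell takes well over 100 bytes, which is large, Cells are created lazily instead of all at once.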

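Timing test with many competing threads:
* 16 threads
[Image: elapsed time with 16 threads]

* 64 threads

[Image: elapsed time with 64 threads]

* 128 threads

[Image: elapsed time with 128 threads]

Execution time clearly rises as the number of threads increases (the total number of increments grows with the thread count as well).

AtomicLong, 128 threads

In this test, AtomicLong with 128 threads was more than 10 times slower than LongAdder, so for heavily contended counters LongAdder is the first choice.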

3.14 sum()

    public long sum() {
        Cell[] as = cells; Cell a;
        long sum = base;
        if (as != null) {
            for (int i = 0; i < as.length; ++i) {
                if ((a = as[i]) != null)
                    sum += a.value;
            }
        }
        return sum;
    }

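sum() is simple: it walks the cells array and adds the value of every Cell to base. The result is not an atomic snapshot, so updates made while sum() is running may or may not be included.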
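A short usage sketch of the public LongAdder API, covering add, increment, sum and sumThenReset. The helper countWithThreads is a hypothetical name introduced for this example. It reads the total only after all writer threads have been joined, which is the situation in which sum() is exact.

```java
import java.util.concurrent.atomic.LongAdder;

class LongAdderUsageSketch {

    // Hypothetical helper: several threads increment one LongAdder, then the total is read once.
    static long countWithThreads(int threadCount, int incrementsPerThread) {
        LongAdder adder = new LongAdder();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    adder.increment();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return adder.sum(); // exact here, because all writers have finished
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000)); // 4000

        LongAdder adder = new LongAdder();
        adder.add(5);
        adder.increment();
        System.out.println(adder.sum());          // 6
        System.out.println(adder.sumThenReset()); // 6
        System.out.println(adder.sum());          // 0
    }
}
```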

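4. Summary

What LongAdder teaches us:

1. Creating objects lazily keeps memory usage down and speeds up startup.
2. To stop a write to one variable from invalidating other variables on the same cache line, @sun.misc.Contended pads each object so that different objects land on different cache lines.
3. A lock built on CAS. Why not use another lock here, such as ReentrantLock? One plausible reason is that the critical sections are tiny and a thread that fails to take the lock never waits for it. It just retries or falls back to base, so a blocking lock would only add overhead.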
