Atomic accumulators: LongAdder & AtomicLong

Java都不学

Last modified: 2022-09-13 23:22:48

First published: 2022-09-13 00:41:51

Article link: https://blog.csdn.net/weixin_52383177/article/details/126824687

Reference: https://www.bilibili.com/video/BV16J411h7Rd

Accumulator performance comparison

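import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class AdderBenchmark {
    private static <T> void demo(Supplier<T> adderSupplier, Consumer<T> action) {
        T adder = adderSupplier.get();
        long start = System.nanoTime();
        List<Thread> ts = new ArrayList<>();
        // 40 threads, each incrementing 500,000 times
        for (int i = 0; i < 40; i++) {
            ts.add(new Thread(() -> {
                for (int j = 0; j < 500000; j++) {
                    action.accept(adder);
                }
            }));
        }
        ts.forEach(Thread::start);
        ts.forEach(t -> {
            try {
                t.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        });
        long end = System.nanoTime();
        // Print the final value and the elapsed time in milliseconds
        System.out.println(adder + " cost:" + (end - start) / 1000_000);
    }

    public static void main(String[] args) {
        // Compare AtomicLong with LongAdder
        for (int i = 0; i < 5; i++) {
            demo(() -> new LongAdder(), adder -> adder.increment());
        }
        for (int i = 0; i < 5; i++) {
            demo(() -> new AtomicLong(), adder -> adder.getAndIncrement());
        }
    }
}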

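LongAdder is faster than AtomicLong. The gain comes from how it handles contention: it sets up multiple accumulation cells (Cell), so Thread-0 adds to cells[0] while Thread-1 adds to cells[1], and so on, and the results are summed at the end. Because the threads operate on different Cell variables, fewer CAS attempts fail and have to be retried, which improves performance.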
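The striping idea can be sketched in a few lines. This is a simplified illustration, not the JDK implementation: StripedCounter and its run helper are hypothetical names, the slot is picked from the thread id rather than a per-thread probe, the number of stripes is fixed, and the slots of an AtomicLongArray are not padded against false sharing.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical, simplified illustration of striping; not how LongAdder is written.
final class StripedCounter {
    private final AtomicLongArray cells; // one slot per stripe
    private final int mask;              // length - 1; length must be a power of two

    StripedCounter(int stripes) {
        if (stripes <= 0 || Integer.bitCount(stripes) != 1)
            throw new IllegalArgumentException("stripes must be a power of two");
        this.cells = new AtomicLongArray(stripes);
        this.mask = stripes - 1;
    }

    void increment() {
        // Assumption: the thread id stands in for LongAdder's per-thread probe value.
        int idx = (int) Thread.currentThread().getId() & mask;
        cells.incrementAndGet(idx);
    }

    long sum() {
        long s = 0;
        for (int i = 0; i < cells.length(); i++)
            s += cells.get(i);
        return s;
    }

    // Starts `threads` threads that each increment `perThread` times, then returns the total.
    static long run(int threads, int perThread) {
        StripedCounter c = new StripedCounter(8);
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++)
                    c.increment();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.sum();
    }
}
```

Once all threads have been joined, sum() returns exactly threads * perThread, because every increment landed in some slot and nothing is still in flight.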

LongAdder source code

Fields

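// Extends Striped64 (a package-local class holding the common representation and
// mechanics for classes that support dynamic striping on 64-bit values; it extends
// Number so that concrete subclasses must publicly do so) and implements Serializable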
public class LongAdder extends Striped64 implements Serializable{}

abstract class Striped64 extends Number {
    /** Number of CPUS, to place bound on table size */
    static final int NCPU = Runtime.getRuntime().availableProcessors();

    /**
     * Table of cells. When non-null, size is a power of 2.
     * Lazily initialized.
     */
    transient volatile Cell[] cells;

    /**
     * Base value, used mainly when there is no contention, but also as
     * a fallback during table initialization races. Updated via CAS.
     */
    transient volatile long base;

    /**
     * Spinlock (locked via CAS) used when resizing and/or creating Cells.
     * 1 means locked, 0 means unlocked.
     */
    transient volatile int cellsBusy;

}

How it works: false sharing

Cell is the accumulation unit:
// Prevents false sharing of cache lines
@sun.misc.Contended
static final class Cell {
    volatile long value;
    Cell(long x) { value = x; }

    // The key method: adds via CAS; prev is the old value, next is the new value
    final boolean cas(long prev, long next) {
        return UNSAFE.compareAndSwapLong(this, valueOffset, prev, next);
    }
    // Less important code omitted
}

Cache vs. memory speed

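Because CPU and memory speeds differ so much, data is prefetched into caches to improve efficiency. A cache works in units of cache lines; each cache line corresponds to a block of memory, typically 64 bytes (8 longs). Caching also creates copies of data: the same data can sit in the cache lines of several cores. The CPU has to keep those copies consistent, so when one core modifies the data, the entire corresponding cache line in the other cores must be invalidated.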

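Cells are held in an array, and the Cell objects tend to sit next to each other in memory. One Cell takes 24 bytes (a 16-byte object header plus the 8-byte value), so a single cache line can hold 2 Cell objects. This is where the problem appears: Core-0 wants to modify Cell[0] and Core-1 wants to modify Cell[1], and whichever succeeds invalidates the other core's cache line. For example, if Core-0 holds Cell[0]=6000, Cell[1]=8000 and increments to Cell[0]=6001, Cell[1]=8000, Core-1's copy of that cache line is invalidated even though Cell[1] did not change.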

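@sun.misc.Contended addresses this problem. It works by adding 128 bytes of padding before and after the annotated object or field, so that when the CPU loads the objects into cache they occupy different cache lines, and updating one no longer invalidates the other core's line.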
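For comparison, the same effect was traditionally approximated with manual padding fields. The sketch below is an assumption-laden illustration, not what the JDK does: PaddedLong and PaddingDemo are hypothetical names, and because field layout is ultimately decided by the JVM, the padding is best effort only. It shows two threads each updating their own counter, which is the access pattern where false sharing would otherwise hurt.

```java
// Hypothetical sketch: manual padding, the idiom used before @Contended existed.
// Field layout is up to the JVM, so treat the padding as best effort.
class PaddedLong {
    long p1, p2, p3, p4, p5, p6, p7; // 56 bytes intended to sit before value
    volatile long value;
    long q1, q2, q3, q4, q5, q6, q7; // 56 bytes intended to sit after value
}

final class PaddingDemo {
    // Two threads each increment their own counter n times; returns both final values.
    static long[] run(int n) {
        PaddedLong a = new PaddedLong();
        PaddedLong b = new PaddedLong();
        // Each counter has exactly one writer, so value++ is safe here.
        Thread t1 = new Thread(() -> { for (int i = 0; i < n; i++) a.value++; });
        Thread t2 = new Thread(() -> { for (int i = 0; i < n; i++) b.value++; });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.value, b.value };
    }
}
```

Timing this against an unpadded variant is one way to observe the effect on a given machine; the numbers depend heavily on hardware and JVM settings, so none are claimed here.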

The add method

    /**
     * Adds the given value.
     *
     * @param x the value to add
     */
    public void add(long x) {
        // as is the array of accumulation cells
        // b is the base value
        // x is the value to add
        Cell[] as; long b, v; int m; Cell a;
        // Two conditions lead into the if:
        // 1. as is non-null, meaning contention has already occurred
        // 2. the CAS on base failed, meaning base is contended
        if ((as = cells) != null || !casBase(b = base, b + x)) {
            // uncontended means the cell has seen no contention
            boolean uncontended = true;
            if (
                // as has not been created yet
                as == null || (m = as.length - 1) < 0 ||
                // the current thread's cell does not exist yet
                (a = as[getProbe() & m]) == null ||
                // the CAS on the current thread's cell failed,
                // so uncontended = false (a is the current thread's cell)
                !(uncontended = a.cas(v = a.value, v + x)))
            {
                // Enter the flow that creates the cells array and individual cells
                longAccumulate(x, null, uncontended);
            }
        }
    }
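The expression getProbe() & m in add picks a slot from the thread's probe value. Because the table length is always a power of two, masking with length - 1 gives the same result as a non-negative modulo, only cheaper. A minimal sketch, where SlotIndex and slot are hypothetical names used for illustration:

```java
// Hypothetical helper showing the masking trick used to pick a cell.
final class SlotIndex {
    // Maps a probe (hash) value to a slot of a table whose length is a power of two.
    static int slot(int probe, int tableLength) {
        return probe & (tableLength - 1);
    }

    public static void main(String[] args) {
        System.out.println(slot(13, 8));          // 13 = 0b1101, masked with 0b0111 gives 5
        System.out.println(slot(0x9E3779B9, 8));  // negative probes still map to a valid slot
    }
}
```

The mask only works for power-of-two lengths, which is one reason the table is created with length 2 and then doubled on each expansion.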

Creating the cells array and individual cells: the longAccumulate method

    /**
     * Handles cases of updates involving initialization, resizing,
     * creating new Cells, and/or contention. See above for
     * explanation. This method suffers the usual non-modularity
     * problems of optimistic retry code, relying on rechecked sets of
     * reads.
     *
     * @param x the value
     * @param fn the update function, or null for add (this convention
     * avoids the need for an extra field or function in LongAdder).
     * @param wasUncontended false if CAS failed before call
     */
    final void longAccumulate(long x, LongBinaryOperator fn,
                              boolean wasUncontended) {
        int h;
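        // The current thread has no cell yet; a random h is generated to bind the thread to a cell
        if ((h = getProbe()) == 0) {
            // Initialize the probe
            ThreadLocalRandom.current(); // force initialization
            // h now holds the new probe value, used to select a cell
            h = getProbe();
            wasUncontended = true;
        }
        // collide == true means the table should be expanded
        boolean collide = false;                // True if last slot nonempty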
        for (;;) {
            Cell[] as; Cell a; int n; long v;
            // cells exists
            if ((as = cells) != null && (n = as.length) > 0) {
                // the cell for this thread does not exist
                if ((a = as[(n - 1) & h]) == null) {
                    // Lock cellsBusy and create the cell with x as its initial value;
                    // break on success, otherwise continue the loop
                    if (cellsBusy == 0) {       // Try to attach new Cell
                        Cell r = new Cell(x);   // Optimistically create
                        if (cellsBusy == 0 && casCellsBusy()) { // casCellsBusy(): CAS cellsBusy from 0 to 1 to acquire the lock
                            boolean created = false;
                            try {               // Recheck under lock
                                Cell[] rs; int m, j;
                                if ((rs = cells) != null &&
                                    (m = rs.length) > 0 &&
                                    rs[j = (m - 1) & h] == null) {
                                    rs[j] = r;
                                    created = true;
                                }
                            } finally {
                                cellsBusy = 0;
                            }
                            if (created)
                                break;
                            continue;           // Slot is now non-empty
                        }
                    }
                    collide = false;
                }
                // Contended: move the thread to a different cell and retry the CAS
                else if (!wasUncontended)       // CAS already known to fail
                    wasUncontended = true;      // Continue after rehash
                // Try to add via CAS; fn is non-null for LongAccumulator and null for LongAdder
                else if (a.cas(v = a.value, ((fn == null) ? v + x :
                                             fn.applyAsLong(v, x))))
                    break;
                // If the table has reached the CPU count or has already been replaced,
                // do not expand; move the thread to a different cell and retry the CAS
                else if (n >= NCPU || cells != as)
                    collide = false;            // At max size or stale
                // If collide is still false, set it and retry once more
                // before falling through to the expansion branch below
                else if (!collide)
                    collide = true;
                // Acquire the lock
                else if (cellsBusy == 0 && casCellsBusy()) {
                    // Lock acquired: expand the table
                    try {
                        if (cells == as) {      // Expand table unless stale
                            Cell[] rs = new Cell[n << 1];
                            for (int i = 0; i < n; ++i)
                                rs[i] = as[i];
                            cells = rs;
                        }
                    } finally {
                        cellsBusy = 0;
                    }
                    collide = false;
                    continue;                   // Retry with expanded table
                }
                // Move the thread to a different cell
                h = advanceProbe(h);
            }
            // No cells yet: try to lock cellsBusy
            else if (cellsBusy == 0 && cells == as && casCellsBusy()) {
                // Lock acquired: initialize cells with an initial length of 2 and fill in one cell;
                // break on success
                boolean init = false;
                try {                           // Initialize table
                    if (cells == as) {
                        Cell[] rs = new Cell[2];
                        rs[h & 1] = new Cell(x);
                        cells = rs;
                        init = true;
                    }
                } finally {
                    cellsBusy = 0;
                }
                if (init)
                    break;
            }
            // Both cases above failed: try to add to base
            else if (casBase(v = base, ((fn == null) ? v + x :
                                        fn.applyAsLong(v, x))))
                break;                          // Fall back on using base
        }
    }
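The fn parameter is what separates the two callers: LongAdder passes null and gets plain addition, while LongAccumulator passes its own function. A small usage sketch with the standard java.util.concurrent.atomic.LongAccumulator follows; MaxDemo and maxOf are hypothetical names for illustration.

```java
import java.util.concurrent.atomic.LongAccumulator;

// Hypothetical demo class: tracks a running maximum with LongAccumulator.
final class MaxDemo {
    static long maxOf(long... values) {
        // Long::max is the update function; Long.MIN_VALUE is the identity (initial) value.
        LongAccumulator max = new LongAccumulator(Long::max, Long.MIN_VALUE);
        for (long v : values) {
            max.accumulate(v);
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println(maxOf(3, 9, 4)); // prints 9
    }
}
```

The function should be side-effect free, since a failed CAS means it may be applied again on retry.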

When a thread first enters longAccumulate, it tries to map itself to a Cell object (that is, to find a slot).
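The cellsBusy handling in longAccumulate follows a recurring pattern: check the flag, CAS it from 0 to 1, recheck the shared state under the lock, and release in finally. A minimal sketch of that pattern, assuming an AtomicInteger in place of the Unsafe-based field; BusyFlagTable and tryExpand are hypothetical names, and the table holds plain longs instead of Cell objects.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the cellsBusy pattern: a CAS-guarded flag protecting a table resize.
final class BusyFlagTable {
    private final AtomicInteger busy = new AtomicInteger(0); // 0 = free, 1 = locked
    private volatile long[] table = new long[2];

    long[] table() {
        return table;
    }

    // Tries once to double the table. Returns false if another thread holds the flag
    // or if the table has changed since `seen` was read.
    boolean tryExpand(long[] seen) {
        if (busy.get() == 0 && busy.compareAndSet(0, 1)) {
            try {
                if (table != seen)
                    return false;                      // stale: someone already resized
                long[] bigger = new long[seen.length << 1];
                System.arraycopy(seen, 0, bigger, 0, seen.length);
                table = bigger;
                return true;
            } finally {
                busy.set(0);                           // always release the flag
            }
        }
        return false;
    }
}
```

As in the JDK code, a failed attempt does not block: the caller simply loops and tries something else, such as rehashing to another slot or falling back to base.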

Getting the final result with the sum method

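Returns the current sum. The returned value is not an atomic snapshot:
calling it when there are no concurrent updates returns an accurate result,
but concurrent updates that occur while the sum is being calculated might not be incorporated.
public long sum() {
    Cell[] as = cells; Cell a;
    long sum = base;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    return sum;
}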
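In practice this means sum is best read once the writers have finished. The sketch below uses only the standard LongAdder API; SumDemo and countWith are hypothetical names. It joins every thread before calling sum, so the result is exact.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical demo class: reads LongAdder.sum() only after all writers are done.
final class SumDemo {
    static long countWith(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++)
                    adder.increment();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // No updates are in flight any more, so this is an exact total.
        return adder.sum();
    }
}
```

If a value is needed while writers are still running, sum still works but should be treated as an approximate reading, which is usually acceptable for metrics and statistics counters.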