Sentinel Resource Call Chain: Data Statistics in StatisticSlot

麻袋海鸥

Tags: java, spring

Article link: https://blog.csdn.net/qq_38952877/article/details/105695704

Preface

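Sentinel's processing flow is built on a slot chain (ProcessorSlotChain), which implements features such as flow control and circuit breaking. One of the most important slots is StatisticSlot, which collects the runtime statistics; flow control and circuit breaking base their decisions on the data it gathers. All of StatisticSlot's statistics are built on a sliding window, so this article walks through the source code step by step to analyze how that sliding window is implemented.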

I. Source analysis of the entry method that collects StatisticSlot data

public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count,
                  boolean prioritized, Object... args) throws Throwable {
    try {
        // Do some checking.
        // Call the entry method of the next slot in the chain.
        fireEntry(context, resourceWrapper, node, count, prioritized, args);
        // If the entry methods of the following slots return normally,
        // the request was neither flow-controlled nor degraded.
        // Request passed, add thread count and pass count.
        // Increment the current thread count.
        node.increaseThreadNum();
        // Add count to the number of passed requests.
        node.addPassRequest(count); //@1
        // Origin node: increment the thread counter (LongAdder curThreadNum) and add count to the pass count.
        if (context.getCurEntry().getOriginNode() != null) {
            // Add count for origin node.
            context.getCurEntry().getOriginNode().increaseThreadNum();
            context.getCurEntry().getOriginNode().addPassRequest(count);
        }
        // Global entry node: increment the thread counter (LongAdder curThreadNum) and add count to the pass count.
        if (resourceWrapper.getEntryType() == EntryType.IN) {
            // Add count for global inbound entry node for global statistics.
            Constants.ENTRY_NODE.increaseThreadNum();
            Constants.ENTRY_NODE.addPassRequest(count);
        }
        // Handle pass event with registered entry callback handlers (statistics for registered extension points).
        for (ProcessorSlotEntryCallback<DefaultNode> handler : StatisticSlotCallbackRegistry.getEntryCallbacks()) {
            handler.onPass(context, resourceWrapper, node, count, args);
        }
    } catch (PriorityWaitException ex) {
        node.increaseThreadNum();
        if (context.getCurEntry().getOriginNode() != null) {
            // Add count for origin node.
            context.getCurEntry().getOriginNode().increaseThreadNum();
        }

        if (resourceWrapper.getEntryType() == EntryType.IN) {
            // Add count for global inbound entry node for global statistics.
            Constants.ENTRY_NODE.increaseThreadNum();
        }
        // Handle pass event with registered entry callback handlers.
        for (ProcessorSlotEntryCallback<DefaultNode> handler : StatisticSlotCallbackRegistry.getEntryCallbacks()) {
            handler.onPass(context, resourceWrapper, node, count, args);
        }
    } catch (BlockException e) {
        // Blocked, set block exception to current entry.
        context.getCurEntry().setError(e);
        // Add block count.
        node.increaseBlockQps(count); //@2
        if (context.getCurEntry().getOriginNode() != null) {
            context.getCurEntry().getOriginNode().increaseBlockQps(count);
        }
        if (resourceWrapper.getEntryType() == EntryType.IN) {
            // Add count for global inbound entry node for global statistics.
            Constants.ENTRY_NODE.increaseBlockQps(count);
        }
        // Handle block event with registered entry callback handlers.
        for (ProcessorSlotEntryCallback<DefaultNode> handler : StatisticSlotCallbackRegistry.getEntryCallbacks()) {
            handler.onBlocked(e, context, resourceWrapper, node, count, args);
        }
        throw e;
    } catch (Throwable e) {
        // Unexpected error, set error to current entry.
        context.getCurEntry().setError(e);
        // This should not happen.
        node.increaseExceptionQps(count); //@3
        if (context.getCurEntry().getOriginNode() != null) {
            context.getCurEntry().getOriginNode().increaseExceptionQps(count);
        }
        if (resourceWrapper.getEntryType() == EntryType.IN) {
            Constants.ENTRY_NODE.increaseExceptionQps(count);
        }
        throw e;
    }
}

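As the previous articles showed, the chain of responsibility is driven through entry: a slot normally runs its own logic inside entry and then calls fireEntry, which invokes entry on the next slot. StatisticSlot reverses that order and calls fireEntry first. The rules are therefore checked before anything is counted: if a rule rejects the request, control lands in the matching catch block, which records the block or exception statistics; if every rule passes, the pass statistics are recorded.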
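To make the "call the next slot first, count afterwards" order concrete, here is a minimal sketch of the same pattern outside Sentinel. All names in it (Slot, LimitSlot, CountingSlot, BlockedException) are hypothetical stand-ins invented for illustration, and the exception is unchecked only to keep the sketch short; it is not Sentinel's actual class hierarchy.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a block exception: thrown when a rule rejects a request. */
class BlockedException extends RuntimeException {
    BlockedException(String message) { super(message); }
}

/** Hypothetical, minimal slot interface: one entry method per slot. */
interface Slot {
    void entry(int count);
}

/** A rule-checking slot that rejects requests once its budget is used up. */
final class LimitSlot implements Slot {
    private int remaining;

    LimitSlot(int limit) { this.remaining = limit; }

    @Override
    public void entry(int count) {
        if (count > remaining) {
            throw new BlockedException("limit exceeded");
        }
        remaining -= count;
    }
}

/** Calls the next slot first and records the outcome afterwards. */
final class CountingSlot implements Slot {
    final LongAdder pass = new LongAdder();
    final LongAdder block = new LongAdder();
    private final Slot next;

    CountingSlot(Slot next) { this.next = next; }

    @Override
    public void entry(int count) {
        try {
            next.entry(count);   // the rule check runs first (the fireEntry step)
            pass.add(count);     // reached only if no rule rejected the request
        } catch (BlockedException e) {
            block.add(count);    // rejected: record it, then rethrow to the caller
            throw e;
        }
    }
}
```

With a limit of 2, two single-count calls to a CountingSlot are counted as passes, and a third call is counted as blocked and rethrown to the caller.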
In the entry method of StatisticSlot, the calls marked @1, @2 and @3 are all backed by two key fields of StatisticNode:

/**
 * Holds statistics of the recent {@code INTERVAL} milliseconds. The {@code INTERVAL} is divided into time spans
 * by given {@code sampleCount}. By default the sample count is 2 and the interval is 1000 ms,
 * which gives two windows: 0~499 and 500~999.
 */
private transient volatile Metric rollingCounterInSecond = new ArrayMetric(SampleCountProperty.SAMPLE_COUNT,
    IntervalProperty.INTERVAL);

/**
 * The sample count is 60 and the interval is 60 * 1000 ms (0~999, 1000~1999, ..., 60 windows in total).
 * Holds statistics of the recent 60 seconds. The windowLengthInMs is deliberately set to 1000 milliseconds,
 * meaning each bucket per second, in this way we can get accurate statistics of each second.
 */
private transient Metric rollingCounterInMinute = new ArrayMetric(60, 60 * 1000, false);

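We can think of rollingCounterInSecond and rollingCounterInMinute as the second-level and minute-level rolling counters, respectively.

II. Analyzing the rolling counter

Take the second-level rolling counter as the example: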

private transient volatile Metric rollingCounterInSecond = new ArrayMetric(SampleCountProperty.SAMPLE_COUNT,
    IntervalProperty.INTERVAL);

As the code shows, rollingCounterInSecond is an instance of ArrayMetric.

private final LeapArray<MetricBucket> data;

public ArrayMetric(int sampleCount, int intervalInMs) {
    this.data = new OccupiableBucketLeapArray(sampleCount, intervalInMs);
}

The statistics container is a LeapArray, and the data itself is carried by MetricBucket instances.
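The article does not show MetricBucket itself. As a rough mental model only, a bucket can be pictured as a small set of concurrent counters, one per event type. The class below is a sketch under that assumption; SimpleBucket and its Event enum are hypothetical names, and the real class tracks more than these three events.

```java
import java.util.concurrent.atomic.LongAdder;

/** Simplified sketch of a metric bucket: one LongAdder per event type. */
final class SimpleBucket {
    /** Hypothetical event set, chosen to match the counts discussed in the article. */
    enum Event { PASS, BLOCK, EXCEPTION }

    private final LongAdder[] counters = new LongAdder[Event.values().length];

    SimpleBucket() {
        for (int i = 0; i < counters.length; i++) {
            counters[i] = new LongAdder();
        }
    }

    /** Adds n to the counter of the given event. */
    void add(Event event, long n) {
        counters[event.ordinal()].add(n);
    }

    /** Returns the current total of the given event. */
    long get(Event event) {
        return counters[event.ordinal()].sum();
    }

    /** Zeroes all counters so the bucket can be reused for a new window. */
    void reset() {
        for (LongAdder counter : counters) {
            counter.reset();
        }
    }
}
```

LongAdder is used here because many threads update the same counter and only the sum is read, which is the access pattern a statistics bucket has.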

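1. LeapArray fields

protected int windowLengthInMs; // length of a single window, in ms
protected int sampleCount; // number of samples (windows)
protected int intervalInMs; // total interval covered by all windows, in ms

// Array of sampled time windows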
protected final AtomicReferenceArray<WindowWrap<T>> array;

2. LeapArray constructor

```java
public LeapArray(int sampleCount, int intervalInMs) {
    AssertUtil.isTrue(sampleCount > 0, "bucket count is invalid: " + sampleCount);
    AssertUtil.isTrue(intervalInMs > 0, "total time interval of the sliding window should be positive");
    AssertUtil.isTrue(intervalInMs % sampleCount == 0, "time span needs to be evenly divided");

    this.windowLengthInMs = intervalInMs / sampleCount;
    this.intervalInMs = intervalInMs;
    this.sampleCount = sampleCount;

    this.array = new AtomicReferenceArray<>(sampleCount);
}
```

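Let us summarize what the code has shown so far.
The second-level rolling counter rollingCounterInSecond is in fact an AtomicReferenceArray of size sampleCount that stores WindowWrap objects and records the statistics in them (WindowWrap is a wrapper around MetricBucket).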
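The arithmetic in the constructor is easy to check by hand. The sketch below recomputes the window length for the two parameter pairs quoted earlier (2 samples over 1000 ms, 60 samples over 60 * 1000 ms); WindowLayout is a hypothetical helper written for this illustration, not part of Sentinel.

```java
/** Recomputes the window length that the constructor derives from its arguments. */
final class WindowLayout {
    /** Length of one window in ms, i.e. intervalInMs / sampleCount. */
    static int windowLengthInMs(int sampleCount, int intervalInMs) {
        if (sampleCount <= 0 || intervalInMs <= 0 || intervalInMs % sampleCount != 0) {
            throw new IllegalArgumentException("interval must be a positive multiple of sampleCount");
        }
        return intervalInMs / sampleCount;
    }

    public static void main(String[] args) {
        System.out.println(windowLengthInMs(2, 1000));       // second-level counter: 500
        System.out.println(windowLengthInMs(60, 60 * 1000)); // minute-level counter: 1000
    }
}
```

So the second-level counter keeps two 500 ms windows, while the minute-level counter keeps sixty 1000 ms windows.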

Next, a quick look at the WindowWrap wrapper class:

/**
 * Time length of a single window bucket in milliseconds.
 */
private final long windowLengthInMs;

/**
 * Start timestamp of the window in milliseconds.
 */
private long windowStart;

/**
 * Statistic data; a MetricBucket by default.
 */
private T value;

/**
 * @param windowLengthInMs a single window bucket's time length in milliseconds.
 * @param windowStart      the start timestamp of the window
 * @param value            statistic data
 */
public WindowWrap(long windowLengthInMs, long windowStart, T value) {
    this.windowLengthInMs = windowLengthInMs;
    this.windowStart = windowStart;
    this.value = value;
}

This is clearly a wrapper class, and T value can be read as a MetricBucket here.
Combined with the key fields of LeapArray, this tells us that rollingCounterInSecond counts with a sliding window.
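The wrapper can be read as "a half-open time range plus the value counted in it". The sketch below restates that idea in a standalone class; Window and its contains method are hypothetical names used for illustration, and the range check is an assumption about how such a wrapper would typically be queried.

```java
/** Simplified sketch of a window wrapper: a time range plus the value counted in it. */
final class Window<T> {
    private final long lengthInMs;
    private long start;
    private final T value;

    Window(long lengthInMs, long start, T value) {
        this.lengthInMs = lengthInMs;
        this.start = start;
        this.value = value;
    }

    long start() { return start; }

    T value() { return value; }

    /** Moves the window to a new start time; resetting the value is left to the caller. */
    void resetTo(long newStart) {
        this.start = newStart;
    }

    /** Hypothetical helper: true if timeMillis lies in [start, start + lengthInMs). */
    boolean contains(long timeMillis) {
        return start <= timeMillis && timeMillis < start + lengthInMs;
    }
}
```

Only the start time changes when a window is reused, which is why it is the one non-final time field.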

III. How sliding-window counting works

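Taking addPass(count) as the example, the call relationships for recording statistics are:
[Figure: call graph for addPass(count)]
As analyzed above, rollingCounterInSecond records its statistics through its LeapArray field, data.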

WindowWrap<MetricBucket> wrap = data.currentWindow(); //@1
public WindowWrap<T> currentWindow() {
    // Get (or create) the window for the current time in the window array.
    return currentWindow(TimeUtil.currentTimeMillis()); //@2
}
public WindowWrap<T> currentWindow(long timeMillis) {
    if (timeMillis < 0) {
        return null;
    }
    // Determine which window the current time belongs to.
    int idx = calculateTimeIdx(timeMillis);  //@3
    // Calculate current bucket start time.
    long windowStart = calculateWindowStart(timeMillis); //@4

    /*
     * Get bucket item at given time from the array.
     *
     * (1) Bucket is absent, then just create a new bucket and CAS update to circular array.
     * (2) Bucket is up-to-date, then just return the bucket.
     * (3) Bucket is deprecated, then reset current bucket and clean all deprecated buckets.
     */
    while (true) {
        // Fetch the existing (old) window at this index.
        WindowWrap<T> old = array.get(idx);// @5
        if (old == null) {
            /*
             *     B0       B1      B2    NULL      B4
             * ||_______|_______|_______|_______|_______||___
             * 200     400     600     800     1000    1200  timestamp
             *                             ^
             *                          time=888
             *            bucket is empty, so create new and update
             *
             * If the old bucket is absent, then we create a new bucket at {@code windowStart},
             * then try to update circular array via a CAS operation. Only one thread can
             * succeed to update, while other threads yield its time slice.
             */
            WindowWrap<T> window = new WindowWrap<T>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis));
            // Install the new window via CAS; only one thread can win.
            if (array.compareAndSet(idx, null, window)) {
                // Successfully updated, return the created bucket.
                return window;
            } else {
                // Contention failed, the thread will yield its time slice to wait for bucket available.
                Thread.yield();
            }
            // If the start time of the stored window equals the computed start time,
            // it is exactly the window we are looking for, so return it directly.
        } else if (windowStart == old.windowStart()) {
            /*
             *     B0       B1      B2     B3      B4
             * ||_______|_______|_______|_______|_______||___
             * 200     400     600     800     1000    1200  timestamp
             *                             ^
             *                          time=888
             *            startTime of Bucket 3: 800, so it's up-to-date
             *
             * If current {@code windowStart} is equal to the start timestamp of old bucket,
             * that means the time is within the bucket, so directly return the bucket.
             */
            return old;
        } else if (windowStart > old.windowStart()) {
            /*
             *   (old)
             *             B0       B1      B2    NULL      B4
             * |_______||_______|_______|_______|_______|_______||___
             * ...    1200     1400    1600    1800    2000    2200  timestamp
             *                              ^
             *                           time=1676
             *          startTime of Bucket 2: 400, deprecated, should be reset
             *
             * If the start timestamp of old bucket is behind provided time, that means
             * the bucket is deprecated. We have to reset the bucket to current {@code windowStart}.
             * Note that the reset and clean-up operations are hard to be atomic,
             * so we need a update lock to guarantee the correctness of bucket update.
             *
             * The update lock is conditional (tiny scope) and will take effect only when
             * bucket is deprecated, so in most cases it won't lead to performance loss.
             */
            if (updateLock.tryLock()) {
                try {
                    // The computed start time is later than the old one, so move the window to the new start time.
                    // Successfully get the update lock, now we reset the bucket.
                    return resetWindowTo(old, windowStart);//@6
                } finally {
                    updateLock.unlock();
                }
            } else {
                // Contention failed, the thread will yield its time slice to wait for bucket available.
                Thread.yield();
            }
        } else if (windowStart < old.windowStart()) {
            // Should not go through here, as the provided time is already behind.
            return new WindowWrap<T>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis));
        }
    }
}

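@1 Fetch the current window from the data held in the LeapArray.
@2 Get the window for the current time.
@3 Compute the index of the window that the current time falls into.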

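/**
 * calculateTimeIdx divides the current timestamp by the length of one window
 * (windowLengthInMs = intervalInMs / sampleCount)
 * and then takes the result modulo the array length.
 * The array length equals sampleCount: 2 for the default second-level counter,
 * and 60 for the minute-level counter, whose 60 seconds are split into 60 one-second windows.
 * @param timeMillis
 * @return
 */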
private int calculateTimeIdx(/*@Valid*/ long timeMillis) {
    long timeId = timeMillis / windowLengthInMs;
    // Calculate current index so we can map the timestamp to the leap array.
    return (int)(timeId % array.length());
}

@4 Compute the actual start time of the window.

/**
 * The current time minus (the current time modulo the window length).
 * @param timeMillis
 * @return
 */
protected long calculateWindowStart(/*@Valid*/ long timeMillis) {
    return timeMillis - timeMillis % windowLengthInMs;
}
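A few concrete timestamps make these two formulas easier to follow. The sketch below applies them with the second-level defaults quoted earlier (2 windows of 500 ms each). WindowMath is a hypothetical helper that takes the window length and array length as explicit parameters instead of reading them from fields.

```java
/** Reproduces the index and window-start calculations with explicit parameters. */
final class WindowMath {
    /** Index of the array slot that timeMillis maps to. */
    static int timeIdx(long timeMillis, int windowLengthInMs, int arrayLength) {
        long timeId = timeMillis / windowLengthInMs;
        return (int) (timeId % arrayLength);
    }

    /** Start timestamp of the window that contains timeMillis. */
    static long windowStart(long timeMillis, int windowLengthInMs) {
        return timeMillis - timeMillis % windowLengthInMs;
    }

    public static void main(String[] args) {
        // Second-level defaults: 2 windows of 500 ms each.
        for (long t : new long[] {888, 1200, 1676}) {
            System.out.println("t=" + t + " idx=" + timeIdx(t, 500, 2)
                + " windowStart=" + windowStart(t, 500));
        }
    }
}
```

Timestamps 888 and 1676 both map to index 1 but have different window starts (500 and 1500). That is the situation in which currentWindow finds an old window in the slot and has to reset it.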

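@5 Fetch the window stored at that index.
If no window exists at that index, create one, install it into the array via CAS, and return the new window.
@6 The computed window start is later than the old window's start, so the window slides forward: the old bucket is reset and reused.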

@Override
protected WindowWrap<MetricBucket> resetWindowTo(WindowWrap<MetricBucket> w, long startTime) {
    // Update the start time and reset value.
    w.resetTo(startTime);
    w.value().reset();
    return w;
}
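Putting the pieces together, the following is a compact, self-contained sketch of a sliding-window pass counter modeled on the cases described above. It is a simplification written for this article and not Sentinel's implementation: MiniLeapArray and Bucket are hypothetical names, only a pass count is tracked, timestamps are passed in explicitly so the behaviour is deterministic, the stale check is repeated under the lock (a small departure from the source), and the rule used when summing (a bucket counts only if it started no more than intervalInMs before the query time) is an assumption.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReentrantLock;

/** Simplified sliding-window pass counter modeled on the cases above. */
final class MiniLeapArray {
    /** One slot of the circular array: window start time plus its counter. */
    static final class Bucket {
        volatile long start;
        final LongAdder pass = new LongAdder();

        Bucket(long start) { this.start = start; }
    }

    private final int windowLengthInMs;
    private final int intervalInMs;
    private final AtomicReferenceArray<Bucket> array;
    private final ReentrantLock updateLock = new ReentrantLock();

    MiniLeapArray(int sampleCount, int intervalInMs) {
        this.windowLengthInMs = intervalInMs / sampleCount;
        this.intervalInMs = intervalInMs;
        this.array = new AtomicReferenceArray<>(sampleCount);
    }

    /** Returns the bucket for timeMillis, creating or resetting it if needed. */
    Bucket currentWindow(long timeMillis) {
        int idx = (int) ((timeMillis / windowLengthInMs) % array.length());
        long windowStart = timeMillis - timeMillis % windowLengthInMs;
        while (true) {
            Bucket old = array.get(idx);
            if (old == null) {                        // case 1: slot is empty
                Bucket created = new Bucket(windowStart);
                if (array.compareAndSet(idx, null, created)) {
                    return created;
                }
                Thread.yield();                       // lost the race, retry
            } else if (windowStart == old.start) {    // case 2: slot is up to date
                return old;
            } else if (windowStart > old.start) {     // case 3: slot is stale
                if (updateLock.tryLock()) {
                    try {
                        if (old.start < windowStart) { // re-check under the lock
                            old.pass.reset();
                            old.start = windowStart;
                        }
                    } finally {
                        updateLock.unlock();
                    }
                } else {
                    Thread.yield();
                }
                // Loop again; the next iteration normally returns via case 2.
            } else {                                  // provided time is behind the slot
                return new Bucket(windowStart);
            }
        }
    }

    /** Adds count to the bucket that timeMillis falls into. */
    void addPass(long timeMillis, int count) {
        currentWindow(timeMillis).pass.add(count);
    }

    /** Sums the pass counts of buckets still inside the interval at timeMillis. */
    long pass(long timeMillis) {
        currentWindow(timeMillis); // slide first so a stale slot is reset
        long sum = 0;
        for (int i = 0; i < array.length(); i++) {
            Bucket b = array.get(i);
            if (b != null && timeMillis - b.start <= intervalInMs) {
                sum += b.pass.sum();
            }
        }
        return sum;
    }
}
```

With two 500 ms windows, adding at 888 and 1200 fills both slots; adding at 1676 lands on the slot used for 888, so that slot is reset before the new count is recorded.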
