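Rate limiting under high concurrency: counter, leaky bucket, and token bucket (controlling consumption with a token bucket in a distributed setting)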

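import java.util.concurrent.atomic.AtomicLong;

public class SpeedCounter {  // fixed-window counter

    // Start of the current window (volatile so the unsynchronized read in isAccess sees resets)
    private static volatile long startTime = System.currentTimeMillis();
    // Window length in ms
    private static long interval = 10;
    // Maximum number of requests allowed per window
    private static long maxCount = 1000;
    // Number of requests counted in the current window
    private static AtomicLong nowCount = new AtomicLong();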

    // Count one request. Returns the new count (> 0) if the request is allowed, the negated
    // count if the window is over its limit, or 0 if this call only reset the window
    // (the caller treats 0 as rejected).
    private static long isAccess(int taskId, int nth) {
        long nowTime = System.currentTimeMillis();
        if (nowTime < startTime + interval) {
            // Atomic increment; equivalent to a hand-written compareAndSet loop
            long newValue = nowCount.incrementAndGet();
            if (newValue <= maxCount) {
                return newValue;
            } else {
                return -newValue;
            }
        } else {
            synchronized (SpeedCounter.class) {
                System.out.println("waiting in ........................" + taskId + ", nth: " + nth);
                if (nowTime >= startTime + interval) { // double check so that only one thread resets the window
                    System.out.println("================init start .================= " + taskId + ", nth: " + nth);
                    nowCount.set(0);
                    startTime = nowTime;
                }
            }
            return 0;
        }
    }
    public static void main(String[] args) {
        for (int i = 1; i < 10; i++) {
            task(i);
        }
    }

    public static void task(final int taskId) {
        new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; i <= 100; i++) {
                        long cnt;
                        if ((cnt = isAccess(taskId, i)) > 0) {
                            System.out.println("Task " + taskId + " executed successfully " + cnt);
                        } else {
                            System.out.println("Task " + taskId + " dropped " + cnt);
                        }
                    }
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }

}

👉 Leaky bucket demo:

public class LeakyBucket { // leaky bucket
    // Time of the last leak calculation
    private static long startTime = System.currentTimeMillis();
    // Outflow rate, in requests per ms
    private static long speed = 20;
    // Bucket capacity
    private static long maxCount = 100;
    // Number of requests currently held in the bucket
    private static long nowCount = 0;

    // synchronized so that concurrent callers cannot corrupt startTime and nowCount
    public static synchronized boolean isAccess() {
        long nowTime = System.currentTimeMillis();
        long outCount = (nowTime - startTime) * speed; // requests leaked since the last call
        startTime = nowTime; // advance the reference time

        nowCount = Math.max(0, nowCount - outCount);
        if (nowCount < maxCount) {
            nowCount++;
            return true;
        } else {
            return false;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 500; i++) {
            if (isAccess()) {
                System.out.println("Request executed successfully...");
            } else {
                System.out.println("Request dropped---");
            }
        }
    }
}

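The main difference between the leaky bucket and the token bucket is the processing rate: a leaky bucket handles requests at a constant rate, whereas a token bucket handles a request whenever a token is available, so the rate at which tokens are produced determines the rate at which requests are processed.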
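
To go with the counter and leaky bucket demos, here is a minimal token bucket sketch. It is not taken from any library: the class name TokenBucketSketch, its fields, and the choice to start with a full bucket are all assumptions made for illustration, and the current time is passed in explicitly so that the behavior is deterministic.

```java
// Minimal token bucket sketch (hypothetical names, not a library API).
final class TokenBucketSketch {
    private final long capacity;       // maximum number of tokens the bucket can hold
    private final double tokensPerMs;  // refill rate
    private double tokens;             // tokens currently available
    private long lastRefillMs;         // time of the last refill

    TokenBucketSketch(long capacity, double tokensPerMs, long nowMs) {
        this.capacity = capacity;
        this.tokensPerMs = tokensPerMs;
        this.tokens = capacity;        // assumption: the bucket starts full
        this.lastRefillMs = nowMs;
    }

    // Consumes one token and returns true if a token is available at time nowMs.
    synchronized boolean tryAcquire(long nowMs) {
        // Refill according to the elapsed time, capped at the capacity
        tokens = Math.min(capacity, tokens + (nowMs - lastRefillMs) * tokensPerMs);
        lastRefillMs = nowMs;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket = new TokenBucketSketch(5, 0.5, 0); // 5 tokens, 1 token per 2 ms
        int accepted = 0;
        for (int i = 0; i < 8; i++) {          // burst of 8 requests at t = 0
            if (bucket.tryAcquire(0)) accepted++;
        }
        System.out.println("accepted in burst: " + accepted);       // 5
        System.out.println("t = 1 ms: " + bucket.tryAcquire(1));    // false, only half a token refilled
        System.out.println("t = 2 ms: " + bucket.tryAcquire(2));    // true, one full token refilled
    }
}
```

The burst at t = 0 is served from the stored tokens all at once, which a leaky bucket with the same long-run rate would not allow.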

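👉 Token-bucket rate limiting is very common in production, so next we will analyze some open-source implementations~

II. Open-source token bucket implementations

For single-machine rate limiting, Guava's RateLimiter is the first choice, so the analysis below is based on it...

Internally RateLimiter works in microseconds: it converts the number of tokens per second into the number of microseconds needed to produce one token. For example, if 1000 tokens can be produced per second, one token is produced every 1 ms (1000 microseconds).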
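
The conversion can be sketched in a few lines. The class and method names below are hypothetical; only the arithmetic (one second in microseconds divided by the rate) follows the description above.

```java
import java.util.concurrent.TimeUnit;

// Sketch of the rate-to-interval conversion (hypothetical helper, not Guava code).
final class StableIntervalSketch {
    // Microseconds between two tokens for a given rate in permits per second.
    static double stableIntervalMicros(double permitsPerSecond) {
        return TimeUnit.SECONDS.toMicros(1L) / permitsPerSecond;
    }

    public static void main(String[] args) {
        System.out.println(stableIntervalMicros(1000)); // 1000.0 microseconds, i.e. 1 ms
        System.out.println(stableIntervalMicros(10));   // 100000.0 microseconds, i.e. 100 ms
    }
}
```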

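Suppose 1000 threads request a token within the first 1 ms: only one of them gets a token, and the rest have to queue until more tokens are produced. If 1000 tokens had already been produced at the start, all 1000 threads could get a token at once. This is a key characteristic of the token bucket: consumption does not happen at a constant rate.

1. Try it out?

We start with how to use RateLimiter and then dig into the source code.

👉 Example 1: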

    private final static ExecutorService executor = Executors.newFixedThreadPool(10);

    @Test
    public void testAcquire() {
        RateLimiter rateLimiter = RateLimiter.create(10);
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        for (int i = 0; i < 50; i++) {
            doProcess(() -> {
                double acquire = rateLimiter.acquire();
                System.out.printf("Acquire success and wait: %s(s)\n", acquire);
            });
        }
    }
    
    private static void doProcess(Runnable task) {
        Future<?> future = executor.submit(task);
        try {
            if (null != future.get()) {
                System.out.println(future.get());
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
    }

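This creates a RateLimiter that handles 10 requests per second. The thread sleeps for 100 ms so that there is some time for tokens to be created. In the output you can see that at the beginning tokens are available and are obtained immediately, while later calls have to block for a while before they get one.

RateLimiter has two main kinds of methods for obtaining tokens:

The first, acquire(), blocks until the token is obtained.
The second, tryAcquire(), accepts a timeout; if the token cannot be obtained within that time, it returns false and the request is dropped.

Next let's look at tryAcquire() ...

👉 Example 2: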

    @Test
    public void testTryAcquirePermits() {
        RateLimiter rateLimiter = RateLimiter.create(10);
        for (int i = 0; i < 50; i++) {
            doProcess(() -> {
                if (rateLimiter.tryAcquire(90, TimeUnit.MILLISECONDS)) {
                    System.out.println("Try acquire success");
                } else {
                    System.out.println("Try acquire failure and discard request");
                }
            });
        }
    }

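The timeout is set to 90 ms; a request that cannot get a token within that time is dropped immediately.

Both methods also accept a parameter for taking n tokens in a single call.

👉 Example 3: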

    @Test
    public void testAcquirePermits() {
        RateLimiter rateLimiter = RateLimiter.create(10);
        for (int i = 0; i < 50; i++) {
            doProcess(() -> {
                double acquire = rateLimiter.acquire(12);
                System.out.printf("Acquire success and wait: %s(s)\n", acquire);
            });
        }
    }

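The rate is set to 10 tokens per second and each call takes 12 tokens. In the output the first call gets its 12 tokens immediately without waiting. Why?

This is the pay-later behavior described later: a request consumes tokens in advance and lets the following requests "pay the bill" (that is, spend the corresponding waiting time). This is also what lets RateLimiter handle sudden bursts of requests (taking one token at a time cannot really handle them).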
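
The pay-later behavior can be reproduced without Guava in a small simulation. This is a simplified sketch under stated assumptions: PayLaterSketch and its members are hypothetical names, the limiter starts with no stored tokens, and time is passed in as a parameter instead of being read from a clock.

```java
// Simplified simulation of "the next request pays" (hypothetical class, not Guava code).
final class PayLaterSketch {
    private final double stableIntervalMicros; // time needed to produce one token
    private final double maxPermits;           // cap on stored tokens
    private double storedPermits = 0;          // assumption: nothing stored at the start
    private long nextFreeTicketMicros = 0;     // earliest time the next request is served

    PayLaterSketch(double permitsPerSecond, double maxBurstSeconds) {
        this.stableIntervalMicros = 1_000_000 / permitsPerSecond;
        this.maxPermits = permitsPerSecond * maxBurstSeconds;
    }

    // Reserves permits at time nowMicros and returns how long this caller has to wait.
    long reserve(int permits, long nowMicros) {
        if (nowMicros > nextFreeTicketMicros) { // idle time turns into stored tokens
            double newPermits = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
            storedPermits = Math.min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
        long momentAvailable = nextFreeTicketMicros;  // the value BEFORE this request's cost is added
        double fromStore = Math.min(permits, storedPermits);
        double fresh = permits - fromStore;
        nextFreeTicketMicros += (long) (fresh * stableIntervalMicros); // the cost is charged to the next caller
        storedPermits -= fromStore;
        return Math.max(momentAvailable - nowMicros, 0);
    }

    public static void main(String[] args) {
        PayLaterSketch limiter = new PayLaterSketch(10, 1.0);
        System.out.println(limiter.reserve(12, 0)); // 0: the first request is served immediately
        System.out.println(limiter.reserve(12, 0)); // 1200000: the second request waits 1.2 s
    }
}
```

The first request for 12 tokens returns a wait of 0 even though nothing is stored; its cost of 12 × 100 ms shows up as the wait of the second request.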

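RateLimiter has two implementation subclasses. The instances created above are SmoothBursty by default, which produces tokens at a constant rate.

Next is SmoothWarmingUp, which takes a "warm-up period": during the warm-up period tokens are produced more slowly, and once it has stabilized the production rate is constant.

This is comparable to an application that has just gone live: it responds slowly at first and behaves normally once it has been running for a while.

👉 Example 4: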

    @Test
    public void testAcquireWithSmoothWarmingUp() {
        RateLimiter rateLimiter = RateLimiter.create(10, 5, TimeUnit.SECONDS);
        for (int i = 0; i < 50; i++) {
            doProcess(() -> {
                double acquire = rateLimiter.acquire(10);
                System.out.printf("Acquire success and wait: %s(s), current: %s\n",
                        acquire, System.currentTimeMillis() / 1000);
            });
        }
    }

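The rate is 10 tokens per second with a 5-second warm-up, and each call takes 10 tokens. In the output the wait times during the 5-second warm-up are all greater than 1 second and gradually decrease, until they stabilize after the warm-up period.

That covers the basic usage of RateLimiter.

2. How it works:

1. SmoothBursty:

👉 First, the blocking method acquire():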

  @CanIgnoreReturnValue
  public double acquire() {
    return acquire(1);
  }

  @CanIgnoreReturnValue
  public double acquire(int permits) {
    long microsToWait = reserve(permits);
    stopwatch.sleepMicrosUninterruptibly(microsToWait);
    return 1.0 * microsToWait / SECONDS.toMicros(1L);
  }

  final long reserve(int permits) {
    checkPermits(permits);
    synchronized (mutex()) {
      return reserveAndGetWaitLength(permits, stopwatch.readMicros());
    }
  }

  final long reserveAndGetWaitLength(int permits, long nowMicros) {
    long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
    return max(momentAvailable - nowMicros, 0);
  }

/**
   * Reserves the requested number of permits and returns the time that those permits can be used
   * (with one caveat).
   *
   * @return the time that the permits may be used, or, if the permits may be used immediately, an
   *     arbitrary past or present time
   */
  abstract long reserveEarliestAvailable(int permits, long nowMicros);

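The key method is reserve(permits), which returns the number of microseconds to wait. reserve locks on mutex(); as the name suggests, this is a mutual-exclusion object, i.e. an object lock, which ensures thread safety.

reserve obtains the wait time through reserveAndGetWaitLength, and the method that actually computes it is reserveEarliestAvailable, implemented by the subclass SmoothRateLimiter: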

  @Override
  final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
    resync(nowMicros);
    long returnValue = nextFreeTicketMicros;
    double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
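    // freshPermits is the number of new tokens needed when the stored tokens are not enough
    double freshPermits = requiredPermits - storedPermitsToSpend;
    // Wait time. For SmoothBursty, storedPermitsToWaitTime always returns 0, so the wait time is determined by freshPermits alone. This is also where bursts are handled: as long as enough tokens are stored, they are taken immediately with no wait.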
    long waitMicros =
        storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend)
            + (long) (freshPermits * stableIntervalMicros);

    this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
    this.storedPermits -= storedPermitsToSpend;
    return returnValue;
  }

    /** Updates {@code storedPermits} and {@code nextFreeTicketMicros} based on the current time. */
  void resync(long nowMicros) {
    // if nextFreeTicket is in the past, resync to now
    // nextFreeTicketMicros is the point in time at which the next token can be obtained
    if (nowMicros > nextFreeTicketMicros) {
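      // Part of the current second has already elapsed and the window has moved forward, so the tokens produced in the meantime have to be added
      // SmoothBursty produces at a constant rate, so dividing the elapsed time by the interval per token gives the number of new tokens
      // SmoothWarmingUp is covered later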
      double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
      storedPermits = min(maxPermits, storedPermits + newPermits);
      nextFreeTicketMicros = nowMicros;
    }
  }

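Now for how the wait time is actually computed. The first call is resync(nowMicros), a key method. As the name suggests, it re-synchronizes: it updates the state of the whole time window (the stored tokens and nextFreeTicketMicros).

Think of it as one long time axis with nextFreeTicketMicros as a marker on it. Each time a request arrives, the current time is compared with the marker: if the current time is already past the marker, the window state is stale and has to be updated; otherwise nothing needs to change.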
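
The effect of resync on the time axis can be traced with concrete numbers. The sketch below copies only the resync logic into a hypothetical ResyncSketch class with package-visible fields; the cap of one second of tokens is an assumption of this sketch.

```java
// Standalone trace of the resync step (hypothetical class for illustration).
final class ResyncSketch {
    double storedPermits = 0;
    long nextFreeTicketMicros = 0;
    final double maxPermits;
    final double stableIntervalMicros;

    ResyncSketch(double permitsPerSecond) {
        this.stableIntervalMicros = 1_000_000 / permitsPerSecond;
        this.maxPermits = permitsPerSecond; // assumption: store at most one second of tokens
    }

    void resync(long nowMicros) {
        if (nowMicros > nextFreeTicketMicros) { // now is past the marker: the state is stale
            double newPermits = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
            storedPermits = Math.min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
    }

    public static void main(String[] args) {
        ResyncSketch r = new ResyncSketch(10);
        r.resync(500_000);                    // 0.5 s idle at 10 tokens/s
        System.out.println(r.storedPermits);  // 5.0
        r.nextFreeTicketMicros = 2_000_000;   // pretend a request pushed the marker into the future
        r.resync(1_000_000);                  // now is before the marker: nothing changes
        System.out.println(r.storedPermits);  // 5.0
        r.resync(9_000_000);                  // 7 s past the marker: refill is capped
        System.out.println(r.storedPermits);  // 10.0
    }
}
```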

During the update, coolDownIntervalMicros is implemented by the concrete subclass. We are currently discussing SmoothBursty, so here is its implementation:

    @Override
    double coolDownIntervalMicros() {
      // The stable interval in microseconds between two tokens, i.e. the production rate
      return stableIntervalMicros;
    }

  // This runs when the rate is set in create(); stableIntervalMicros is computed here
  @Override
  final void doSetRate(double permitsPerSecond, long nowMicros) {
    resync(nowMicros);
    double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
    this.stableIntervalMicros = stableIntervalMicros;
    doSetRate(permitsPerSecond, stableIntervalMicros);
  }

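Back in reserveEarliestAvailable, the next step compares the stored tokens with the requested amount. If enough tokens are stored they are taken immediately with no wait; otherwise a wait time has to be computed: waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend) + (long) (freshPermits * stableIntervalMicros).

For SmoothBursty, storedPermitsToWaitTime is always 0, so the wait time is freshPermits multiplied by the interval per token. nextFreeTicketMicros is then moved later by that amount. Note returnValue as well: it holds the value of nextFreeTicketMicros from before the move, and that older value is what gets returned. This is how the following requests end up "paying the bill": the marker has already moved, but the current request is not held to it.

👉 tryAcquire() with a timeout; when the token cannot be obtained in time it gives up immediately instead of blocking: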

  public boolean tryAcquire(int permits, long timeout, TimeUnit unit) {
    long timeoutMicros = max(unit.toMicros(timeout), 0);
    checkPermits(permits);
    long microsToWait;
    synchronized (mutex()) {
      long nowMicros = stopwatch.readMicros();
      if (!canAcquire(nowMicros, timeoutMicros)) {
        return false;
      } else {
        microsToWait = reserveAndGetWaitLength(permits, nowMicros);
      }
    }
    stopwatch.sleepMicrosUninterruptibly(microsToWait);
    return true;
  }

  private boolean canAcquire(long nowMicros, long timeoutMicros) {
    return queryEarliestAvailable(nowMicros) - timeoutMicros <= nowMicros;
  }

  @Override
  final long queryEarliestAvailable(long nowMicros) {
    return nextFreeTicketMicros;
  }
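
The check in canAcquire is easy to try with numbers. The sketch below restates the same comparison as a static helper; the class name and the extra nextFreeTicketMicros parameter are hypothetical and only there to make it runnable on its own. The values correspond to a limiter at 10 tokens per second (one token every 100 ms) and the 90 ms timeout from Example 2.

```java
// Standalone restatement of the canAcquire comparison (hypothetical helper).
final class TryAcquireSketch {
    // True if a token becomes available no later than nowMicros + timeoutMicros.
    static boolean canAcquire(long nextFreeTicketMicros, long nowMicros, long timeoutMicros) {
        return nextFreeTicketMicros - timeoutMicros <= nowMicros;
    }

    public static void main(String[] args) {
        long now = 1_000_000;
        long timeout = 90_000; // 90 ms
        System.out.println(canAcquire(now + 50_000, now, timeout));  // true: token is ready in 50 ms
        System.out.println(canAcquire(now + 100_000, now, timeout)); // false: token is 100 ms away
        System.out.println(canAcquire(now - 10_000, now, 0));        // true: token is already available
    }
}
```

When the check fails, tryAcquire returns false without reserving anything, so a rejected request adds no cost for later requests.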

2. SmoothWarmingUp: with a warm-up period

 This implements the following function where coldInterval = coldFactor * stableInterval.

 <pre>
          ^ throttling
          |
    cold  +                  /
 interval |                 /.
          |                / .
          |               /  .   ← "warmup period" is the area of the trapezoid between
          |              /   .     thresholdPermits and maxPermits
          |             /    .
          |            /     .
          |           /      .
   stable +----------/  WARM .
 interval |          .   UP  .
          |          . PERIOD.
          |          .       .
        0 +----------+-------+--------------→ storedPermits
          0 thresholdPermits maxPermits
 </pre>

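Once you understand this diagram for SmoothWarmingUp, you understand how it works. These are the variables to pay attention to:

The vertical axis is time, which can be read as the time needed to produce one token; at the start production is slowest, at cold interval
The horizontal axis is the number of tokens
Use a vertical line as a cursor: it moves left while tokens are being used and right while they are not
stableInterval: the interval per token in the stable state. For example, with a limit of 10 tokens per second, stableInterval = 100 ms, i.e. one token is produced every 100 ms
coldInterval: hard-coded as 3 * stableInterval. Going from coldInterval down to stableInterval is the "warm-up" phase, during which the cursor moves to the left
warm up period: the "warm-up" time we configure ourselves, i.e. the area of the trapezoid
maxPermits: the maximum number of tokens, reached when the limiter is unused; determined by the two parameters we set, the rate per second and the warm-up time
thresholdPermits: the point at which the time needed to produce one token has dropped from coldInterval to stableInterval
The area of the rectangle equals half the area of the trapezoid (note that area here means time). Why half?

The documentation says this is related to coldFactor being 3. My own understanding: suppose the cursor has moved onto the rectangle and then no requests arrive for a short while, so the cursor moves to the right. Because the rectangle has some width, the next request does not immediately land in the trapezoid's "warm-up" phase.

If the rectangle's area were too large, however, then after a long period without requests the next request might still land in the rectangle (while the system would want it to be in the warm-up phase by then). So the rectangle and the trapezoid need to be in a suitable proportion.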
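
The geometry can be checked with the numbers from Example 4 (10 tokens per second, 5-second warm-up, coldFactor 3). The sketch below derives thresholdPermits and maxPermits from the two area relationships stated above (rectangle = half the warm-up period, trapezoid = the warm-up period); the class name WarmupGeometrySketch is hypothetical, and the formulas are my reading of the diagram, not a copy of the library source.

```java
// Derives the diagram's quantities from rate, warm-up period and coldFactor (illustrative sketch).
final class WarmupGeometrySketch {
    final double stableIntervalMicros;
    final double coldIntervalMicros;
    final double thresholdPermits;
    final double maxPermits;
    final double slope; // rise of the interval per stored token in the warm-up region

    WarmupGeometrySketch(double permitsPerSecond, long warmupPeriodMicros, double coldFactor) {
        stableIntervalMicros = 1_000_000 / permitsPerSecond;
        coldIntervalMicros = stableIntervalMicros * coldFactor;
        // Rectangle: stableInterval * thresholdPermits = warmupPeriod / 2
        thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
        // Trapezoid: (stable + cold) / 2 * (maxPermits - thresholdPermits) = warmupPeriod
        maxPermits = thresholdPermits
                + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
        slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
    }

    double rectangleArea() {
        return stableIntervalMicros * thresholdPermits;
    }

    double trapezoidArea() {
        return (stableIntervalMicros + coldIntervalMicros) / 2 * (maxPermits - thresholdPermits);
    }

    public static void main(String[] args) {
        WarmupGeometrySketch g = new WarmupGeometrySketch(10, 5_000_000, 3.0);
        System.out.println("thresholdPermits = " + g.thresholdPermits); // 25.0
        System.out.println("maxPermits = " + g.maxPermits);             // 50.0
        System.out.println("rectangle = " + g.rectangleArea());         // 2500000.0 (2.5 s)
        System.out.println("trapezoid = " + g.trapezoidArea());         // 5000000.0 (5 s)
    }
}
```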

The code logic is essentially the same; the main differences are in the details implemented by the subclass:

  @Override
  final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
    resync(nowMicros);
    long returnValue = nextFreeTicketMicros;
    // Number of tokens to take from the stored tokens
    double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
    // Tokens still needed when the stored tokens are not enough
    double freshPermits = requiredPermits - storedPermitsToSpend;
    long waitMicros =
        storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend)
            + (long) (freshPermits * stableIntervalMicros);
    // Move the marker forward (later in time)
    this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
    this.storedPermits -= storedPermitsToSpend;
    return returnValue;
  }

    // For SmoothWarmingUp, the wait time while the horizontal axis goes from maxPermits to 0 is controlled here
      @Override
    long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
      double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
      long micros = 0;
      // measuring the integral on the right part of the function (the climbing line)
      if (availablePermitsAboveThreshold > 0.0) {
        double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);
        // TODO(cpovirk): Figure out a good name for this variable.
        double length =
            permitsToTime(availablePermitsAboveThreshold)
                + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
        // This is actually the area of a trapezoid
        micros = (long) (permitsAboveThresholdToTake * length / 2.0);
        permitsToTake -= permitsAboveThresholdToTake;
      }
      // measuring the integral on the left part of the function (the horizontal line)
      // Area of the rectangle
      micros += (long) (stableIntervalMicros * permitsToTake);
      return micros;
    }
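
To see the areas as concrete wait times, the method above can be run with fixed example values. This is a sketch under assumptions: the constants correspond to 10 tokens per second with a 5-second warm-up (stable interval 100 ms, thresholdPermits 25, maxPermits 50), and permitsToTime is assumed to be the straight line from the diagram, stableInterval + permits * slope. The class name WarmupWaitSketch is hypothetical.

```java
// Runs the wait-time integral with fixed example values (illustrative sketch).
final class WarmupWaitSketch {
    static final double STABLE_INTERVAL_MICROS = 100_000; // 10 tokens per second
    static final double THRESHOLD_PERMITS = 25;
    static final double SLOPE = 8_000; // (300000 - 100000) / (50 - 25)

    // Assumed linear form: interval per token at a given height above the threshold.
    static double permitsToTime(double permitsAboveThreshold) {
        return STABLE_INTERVAL_MICROS + permitsAboveThreshold * SLOPE;
    }

    static long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
        double availablePermitsAboveThreshold = storedPermits - THRESHOLD_PERMITS;
        long micros = 0;
        if (availablePermitsAboveThreshold > 0.0) { // part taken from the sloped (warm-up) region
            double permitsAboveThresholdToTake = Math.min(availablePermitsAboveThreshold, permitsToTake);
            double length = permitsToTime(availablePermitsAboveThreshold)
                    + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
            micros = (long) (permitsAboveThresholdToTake * length / 2.0); // trapezoid area
            permitsToTake -= permitsAboveThresholdToTake;
        }
        micros += (long) (STABLE_INTERVAL_MICROS * permitsToTake); // rectangle area
        return micros;
    }

    public static void main(String[] args) {
        System.out.println(storedPermitsToWaitTime(50, 10)); // 2600000: entirely in the warm-up region
        System.out.println(storedPermitsToWaitTime(30, 10)); // 1100000: 5 from the slope, 5 from the flat part
        System.out.println(storedPermitsToWaitTime(20, 10)); // 1000000: entirely in the flat part
    }
}
```

Starting from a full store of 50, taking 10 tokens costs 2.6 s, which is consistent with the waits of more than 1 second seen during the warm-up in Example 4; below the threshold the cost is the stable 100 ms per token.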

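Look directly at this line: long waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend) + (long) (freshPermits * stableIntervalMicros);

✅ storedPermitsToWaitTime(): there are three cases here; the wait time this method computes is the corresponding area in the diagram

✅ (long) (freshPermits * stableIntervalMicros) computes the wait time once the cursor has moved below 0; it is computed the same way as for the rectangle in figure 3

✅ One more case: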

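RateLimiter summary:

There are two main steps: first refresh the sliding time window and update its state, then compute the wait time. For SmoothBursty the wait time depends on stableIntervalMicros and the stored tokens; for SmoothWarmingUp it depends on how the cursor moves in the diagram above
SmoothBursty can handle bursts of requests because it can cache up to 1 s worth of tokens. Picture a bucket full of tokens: when requests arrive they can all be handed out at once. SmoothWarmingUp does not cache tokens in this way
SmoothBursty has a real stock of tokens, up to maxPermits, that can be taken and used directly without waiting. SmoothWarmingUp has no real stock: going from maxPermits to 0 is only the scale used to compute the wait time
In RateLimiter a later request "pays the bill" for the previous one: the nextFreeTicketMicros value returned each time is not the latest one, and each request's cost (wait time) is added to nextFreeTicketMicros and left for the next request to handle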