A Brief Look at the Source Code Behind Dubbo's Timeout Mechanism

Environment

Dubbo version: 2.7.8

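Timeout configuration

The figure below shows the order in which the timeout setting is looked up:

  • Method level takes priority, then interface level, then global configuration.
  • Within the same level, the consumer side takes priority over the provider side.

Provider-side settings are passed to the consumer through the URL, via the registry.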
[Figure: timeout configuration priority]
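
The lookup order above can be sketched as a "first configured value wins" scan. This is only an illustration: TimeoutLookupSketch, its resolve method and the fallback parameter are hypothetical names, not Dubbo classes or APIs, and Dubbo itself resolves the value from URL parameters rather than from explicit arguments.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical helper, not a Dubbo class: picks the first configured timeout in priority order. */
final class TimeoutLookupSketch {

    /** Each argument is a timeout in milliseconds, or null when that level is not configured. */
    static int resolve(Integer consumerMethod, Integer providerMethod,
                       Integer consumerInterface, Integer providerInterface,
                       Integer consumerGlobal, Integer providerGlobal,
                       int fallback) {
        List<Integer> byPriority = Arrays.asList(
                consumerMethod, providerMethod,        // method level: consumer before provider
                consumerInterface, providerInterface,  // interface level
                consumerGlobal, providerGlobal);       // global level
        return byPriority.stream()
                .filter(Objects::nonNull)
                .findFirst()
                .orElse(fallback);                     // assumed fallback when nothing is configured
    }

    public static void main(String[] args) {
        // The provider sets a method-level timeout, the consumer only an interface-level one:
        // the method level wins even though it comes from the provider.
        System.out.println(resolve(null, 3000, 1000, null, null, 5000, 1000)); // prints 3000
    }
}
```

Note that the level is compared first and the side second, so a provider's method-level value beats a consumer's interface-level value.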

Source code analysis

HashedWheelTimer

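[Figure: HashedWheel]
HashedWheelTimer is a timer. Its core data structure is an array of HashedWheelBucket, similar to a hash table (referred to below as the wheel).

As for how HashedWheelTimer actually keeps time, my understanding is this: the wheel is divided into buckets that are visited in order, round after round, and moving from one bucket to the next takes a fixed amount of time (one tick). Each bucket can hold multiple timed tasks. Every task has an expiry time, from which its remaining rounds and its bucket position can be computed. When the worker reaches a bucket, it walks through the tasks in it: a task whose remaining rounds is <= 0 is executed, otherwise its remaining rounds is decremented by 1.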
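
To make the bucket and remaining-rounds idea concrete, here is a toy wheel that is advanced by explicit tick() calls instead of a worker thread. It is a teaching sketch under my own assumptions (ToyWheel, Entry, schedule and tick are hypothetical names, and delays are given in ticks rather than nanoseconds), not Dubbo's HashedWheelTimer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy hashed wheel driven by explicit tick() calls; not Dubbo's HashedWheelTimer. */
final class ToyWheel {
    private static final class Entry {
        final Runnable task;
        long remainingRounds;

        Entry(Runnable task, long remainingRounds) {
            this.task = task;
            this.remainingRounds = remainingRounds;
        }
    }

    private final List<List<Entry>> buckets = new ArrayList<>();
    private long tick; // number of ticks processed so far

    ToyWheel(int size) {
        for (int i = 0; i < size; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    /** Schedules task to run while the tick that is delayTicks ahead of the current one is processed. */
    void schedule(Runnable task, long delayTicks) {
        long target = tick + delayTicks;
        long rounds = delayTicks / buckets.size();    // full laps to skip before firing
        int index = (int) (target % buckets.size());  // bucket the task is placed in
        buckets.get(index).add(new Entry(task, rounds));
    }

    /** Processes the current bucket, then advances to the next one. */
    void tick() {
        List<Entry> bucket = buckets.get((int) (tick % buckets.size()));
        for (Iterator<Entry> it = bucket.iterator(); it.hasNext(); ) {
            Entry entry = it.next();
            if (entry.remainingRounds <= 0) {
                it.remove();
                entry.task.run();          // due in this lap: run it
            } else {
                entry.remainingRounds--;   // not yet: wait one more lap
            }
        }
        tick++;
    }

    public static void main(String[] args) {
        ToyWheel wheel = new ToyWheel(4);
        wheel.schedule(() -> System.out.println("short task fired"), 3);
        wheel.schedule(() -> System.out.println("long task fired"), 6); // needs one extra lap
        for (int i = 0; i < 8; i++) {
            wheel.tick();
        }
    }
}
```

With a wheel of 4 buckets, the task with a delay of 6 lands in bucket 2 with one remaining round: the first visit to that bucket only decrements the counter, and the second visit runs the task.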

Below is the main logic of the run method of HashedWheelTimer.Worker:

private long tick;

@Override
public void run() {
    ...  // unrelated code omitted
    do {
        final long deadline = waitForNextTick();  // wait for a fixed interval (one tick)
        if (deadline > 0) {
            int idx = (int) (tick & mask);
            processCancelledTasks();  // remove cancelled timeouts from their buckets
            HashedWheelBucket bucket = wheel[idx];
            transferTimeoutsToBuckets();  // move newly added timeouts into buckets
            bucket.expireTimeouts(deadline);  // remove and fire the expired timeouts in this bucket
            tick++;
        }
    } while (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_STARTED);
    ...  // unrelated code omitted
}

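In the Dubbo source, HashedWheelTimer is used in at least two places, the DefaultFuture class and the HeaderExchangeClient class, which correspond to the dubbo-future-timeout and dubbo-client-idleCheck threads respectively; both threads start running when the project starts. dubbo-client-idleCheck is responsible for reconnecting to the provider and is not analyzed here. dubbo-future-timeout is responsible for managing RPC timeouts.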

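How an RPC timeout is handed over to HashedWheelTimer

Putting a HashedWheelTimeout into the wheel

An RPC call creates a DefaultFuture object and, at the same time, calls the newTimeout method of the HashedWheelTimer held by DefaultFuture: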

@Override
public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
    ...  // null checks that throw exceptions are omitted

    long pendingTimeoutsCount = pendingTimeouts.incrementAndGet();

    if (maxPendingTimeouts > 0 && pendingTimeoutsCount > maxPendingTimeouts) {
        pendingTimeouts.decrementAndGet();
        throw new RejectedExecutionException("Number of pending timeouts ("
                + pendingTimeoutsCount + ") is greater than or equal to maximum allowed pending "
                + "timeouts (" + maxPendingTimeouts + ")");
    }

    start();  // make sure the worker thread is running (it is started on first use)

    // Add the timeout to the timeout queue which will be processed on the next tick.
    // During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
    long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;

    // Guard against overflow.
    if (delay > 0 && deadline < 0) {
        deadline = Long.MAX_VALUE;
    }
    HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
    timeouts.add(timeout);
    return timeout;
}

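The newTimeout method mainly creates a HashedWheelTimeout object and adds it to the timeouts queue; the task passed in is DefaultFuture's TimeoutCheckTask.

Once the DefaultFuture has been set up, the RPC request is sent.
Meanwhile, the HashedWheelTimer thread keeps executing the run method of HashedWheelTimer.Worker. Inside run, transferTimeoutsToBuckets takes the HashedWheelTimeout objects previously added to the timeouts queue, computes their remaining rounds and target index, and adds them to the corresponding bucket. From that point on the timeout is managed by HashedWheelTimer.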

private void transferTimeoutsToBuckets() {
    // transfer only max. 100000 timeouts per tick to prevent a thread to stale the workerThread when it just
    // adds new timeouts in a loop.
    for (int i = 0; i < 100000; i++) {
        HashedWheelTimeout timeout = timeouts.poll();
        if (timeout == null) {
            // all processed
            break;
        }
        if (timeout.state() == HashedWheelTimeout.ST_CANCELLED) {
            // Was cancelled in the meantime.
            continue;
        }

        // Note: timeout.deadline is measured from startTime, the moment the HashedWheelTimer
        // was started, so tick is subtracted from calculated below.
        long calculated = timeout.deadline / tickDuration;
        timeout.remainingRounds = (calculated - tick) / wheel.length;

        // Ensure we don't schedule for past.
        final long ticks = Math.max(calculated, tick);
        int stopIndex = (int) (ticks & mask);

        HashedWheelBucket bucket = wheel[stopIndex];
        bucket.addTimeout(timeout);
    }
}
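
A small worked example may help with the slot arithmetic in transferTimeoutsToBuckets. The sketch below repeats the same three formulas with assumed values: a tick of 30 ms and a wheel of 512 buckets (the wheel size is my assumption for illustration; any power of two works with the mask). SlotMath and its method names are hypothetical.

```java
/** Repeats the slot arithmetic of transferTimeoutsToBuckets with assumed example values. */
final class SlotMath {
    static final long TICK_DURATION = 30_000_000L; // 30 ms in nanoseconds
    static final int WHEEL_LENGTH = 512;           // assumed wheel size, must be a power of two
    static final int MASK = WHEEL_LENGTH - 1;

    /** deadline is in nanoseconds relative to the timer's startTime; tick is the worker's current tick. */
    static long remainingRounds(long deadline, long tick) {
        long calculated = deadline / TICK_DURATION;
        return (calculated - tick) / WHEEL_LENGTH;
    }

    static int stopIndex(long deadline, long tick) {
        long calculated = deadline / TICK_DURATION;
        long ticks = Math.max(calculated, tick); // never schedule into the past
        return (int) (ticks & MASK);
    }

    public static void main(String[] args) {
        long tick = 10;
        long oneSecond = 1_000_000_000L;
        long twentySeconds = 20_000_000_000L;
        // 1 s  -> tick 33:  fires in the current lap, bucket 33
        System.out.println(remainingRounds(oneSecond, tick) + " " + stopIndex(oneSecond, tick));
        // 20 s -> tick 666: one full lap to skip, bucket 666 & 511 = 154
        System.out.println(remainingRounds(twentySeconds, tick) + " " + stopIndex(twentySeconds, tick));
    }
}
```

A deadline that already lies behind the current tick ends up in the current bucket with zero remaining rounds, so it fires as soon as that bucket is processed.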

What happens once a timeout is detected

The run method of HashedWheelTimer.Worker calls the expireTimeouts method of HashedWheelBucket:

/**
 *Expire all HashedWheelTimeouts for the given deadline.
 */
void expireTimeouts(long deadline) {
    HashedWheelTimeout timeout = head;

    // process all timeouts
    while (timeout != null) {
        HashedWheelTimeout next = timeout.next;
        if (timeout.remainingRounds <= 0) {
            next = remove(timeout);
            if (timeout.deadline <= deadline) {  // 到期了
                timeout.expire();
            } else {
                // The timeout was placed into a wrong slot. This should never happen.
                throw new IllegalStateException(String.format(
                        "timeout.deadline (%d) > deadline (%d)", timeout.deadline, deadline));
            }
        } else if (timeout.isCancelled()) {
            next = remove(timeout);
        } else {
            timeout.remainingRounds--;
        }
        timeout = next;
    }
}

If a timeout is found to have expired, the expire method of HashedWheelTimeout is run:

public void expire() {
    if (!compareAndSetState(ST_INIT, ST_EXPIRED)) {
        return;
    }

    try {
        task.run(this);
    } catch (Throwable t) {
        if (logger.isWarnEnabled()) {
            logger.warn("An exception was thrown by " + TimerTask.class.getSimpleName() + '.', t);
        }
    }
}

This simply runs the run method of the task that was passed in when the HashedWheelTimeout was created, which is the TimeoutCheckTask of DefaultFuture mentioned above. Below is the run method of TimeoutCheckTask:

@Override
public void run(Timeout timeout) {
    DefaultFuture future = DefaultFuture.getFuture(requestID);
    // Either a response has already arrived (done), or the RPC thread was fast enough
    // that the future has already been removed (null).
    if (future == null || future.isDone()) {
        return;
    }

    if (future.getExecutor() != null) {
        future.getExecutor().execute(() -> notifyTimeout(future));
    } else {
        notifyTimeout(future);
    }
}

private void notifyTimeout(DefaultFuture future) {
    // create exception response.
    Response timeoutResponse = new Response(future.getId());
    // set timeout status.
    timeoutResponse.setStatus(future.isSent() ? Response.SERVER_TIMEOUT : Response.CLIENT_TIMEOUT);
    timeoutResponse.setErrorMessage(future.getTimeoutMessage(true));
    // handle response.
    DefaultFuture.received(future.getChannel(), timeoutResponse, true);
}

The future.getTimeoutMessage method builds the RPC timeout error message, either "Waiting server-side response timeout..." or "Sending request timeout in client-side...", and the call then fails with org.apache.dubbo.remoting.TimeoutException.
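
The race between a response and the timeout check can be sketched with plain JDK classes. Everything here is a simplified stand-in under my own assumptions: PendingCalls and its methods are hypothetical, the result type is a String, and java.util.concurrent.TimeoutException is used in place of Dubbo's own exception type.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeoutException;

/** Simplified stand-in for the pending-request bookkeeping; all names are hypothetical. */
final class PendingCalls {
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Registers a request that is waiting for its response. */
    CompletableFuture<String> register(long requestId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(requestId, future);
        return future;
    }

    /** Called when a response arrives. */
    void onResponse(long requestId, String result) {
        CompletableFuture<String> future = pending.remove(requestId);
        if (future != null) {
            future.complete(result);
        }
    }

    /** Called by the timer once the deadline has passed; same null/done guard as TimeoutCheckTask. */
    void onTimeout(long requestId, boolean sent) {
        CompletableFuture<String> future = pending.remove(requestId);
        if (future == null || future.isDone()) {
            return; // the response won the race, nothing to do
        }
        String message = sent
                ? "Waiting server-side response timeout"
                : "Sending request timeout in client-side";
        future.completeExceptionally(new TimeoutException(message));
    }

    public static void main(String[] args) {
        PendingCalls calls = new PendingCalls();

        CompletableFuture<String> answered = calls.register(1L);
        calls.onResponse(1L, "ok");
        calls.onTimeout(1L, true); // late timeout check is ignored
        System.out.println(answered.getNow(null));

        CompletableFuture<String> lost = calls.register(2L);
        calls.onTimeout(2L, true); // no response arrived in time
        System.out.println(lost.isCompletedExceptionally());
    }
}
```

Whichever side removes the entry first decides the outcome, which is why a timeout check that runs after the response has arrived is a no-op.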

 

Reflections

Numeric overflow

Below is the waitForNextTick method of HashedWheelTimer:

/**
 * calculate goal nanoTime from startTime and current tick number,
 * then wait until that goal has been reached.
 *
 * @return Long.MIN_VALUE if received a shutdown request,
 * current time otherwise (with Long.MIN_VALUE changed by +1)
 */
private long waitForNextTick() {
    long deadline = tickDuration * (tick + 1);

    for (; ; ) {
        final long currentTime = System.nanoTime() - startTime;
        long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;

        if (sleepTimeMs <= 0) {
            if (currentTime == Long.MIN_VALUE) {
                return -Long.MAX_VALUE;
            } else {
                return currentTime;
            }
        }
        if (isWindows()) {
            sleepTimeMs = sleepTimeMs / 10 * 10;
        }

        try {
            Thread.sleep(sleepTimeMs);
        } catch (InterruptedException ignored) {
            if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
                return Long.MIN_VALUE;
            }
        }
    }
}

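deadline, tick and currentTime above are all of type long and keep growing, so in principle they will eventually overflow. Does overflow affect the timing?
In Dubbo's HashedWheelTimer, tickDuration defaults to 30000000 (nanoseconds, i.e. 30 ms). The table below shows the deadline computed for several tick values: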

tick                    deadline
307445734560            9223372036830000000
307445734561            -9223372036849551616
614891469122            -19551616
614891469123            10448384
9223372036854775807     0
-9223372036854775808    30000000

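As the table shows, consecutive deadlines are still separated by the fixed interval tickDuration, even across an overflow.

A more extreme example: the computed deadline has overflowed and become negative, while currentTime = System.nanoTime() - startTime; is still positive. Because the subtraction wraps around in the same way, sleepTimeMs = (deadline - currentTime + 999999) / 1000000; still yields the true remaining wait, a small value no larger than one tickDuration (30 ms).

Conclusion: numeric overflow is rather confusing to reason about, but it does not affect the computed wait time.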
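
The wrap-around behaviour can be checked directly, since Java long arithmetic silently wraps modulo 2^64. The sketch below repeats the two formulas from waitForNextTick with the article's tickDuration; OverflowDemo and its method names are hypothetical, and the currentTime value is chosen by me for the example.

```java
/** Checks that deadline arithmetic survives long overflow; formulas copied from waitForNextTick. */
final class OverflowDemo {
    static final long TICK_DURATION = 30_000_000L; // 30 ms in nanoseconds

    static long deadlineFor(long tick) {
        return TICK_DURATION * (tick + 1); // wraps silently on overflow
    }

    static long sleepMillis(long deadline, long currentTime) {
        return (deadline - currentTime + 999_999) / 1_000_000;
    }

    public static void main(String[] args) {
        long tick = 307_445_734_560L;
        long before = deadlineFor(tick);     // last deadline that still fits in a long
        long after = deadlineFor(tick + 1);  // overflowed: negative
        System.out.println(before);          // 9223372036830000000
        System.out.println(after);           // -9223372036849551616
        System.out.println(after - before);  // 30000000, the gap is still one tick

        // Assumed situation: the worker wakes up 10 ms before the overflowed deadline,
        // while currentTime itself has not overflowed yet and is still positive.
        long currentTime = before + 20_000_000L;
        System.out.println(sleepMillis(after, currentTime)); // 10
    }
}
```

The difference deadline - currentTime is computed modulo 2^64 as well, so as long as the true gap is small the wrapped result equals the true gap.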
