Ribbon issue notes: a newer rxjava version breaks Ribbon's server stats tracking

Problem description:

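  Ribbon's weighted response time rule (WeightedResponseTimeRule) assigns each server a weight based on its response time: the longer the response time, the smaller the weight and the less likely the server is to be chosen.
The key part of WeightedResponseTimeRule's choose method is: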
// last one in the list is the sum of all weights
double maxTotalWeight = currentWeights.size() == 0 ? 0 : currentWeights.get(currentWeights.size() - 1); 
// No server has been hit yet and total weight is not initialized
// fallback to use round robin
if (maxTotalWeight < 0.001d) {
    server =  super.choose(getLoadBalancer(), key);
    if(server == null) {
        return server;
    }
} else {
    // generate a random weight between 0 (inclusive) to maxTotalWeight (exclusive)
    double randomWeight = random.nextDouble() * maxTotalWeight;
    // pick the server index based on the randomIndex
    int n = 0;
    for (Double d : currentWeights) {
        if (d >= randomWeight) {
            serverIndex = n;
            break;
        } else {
            n++;
        }
    }

    server = allList.get(serverIndex);
}


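While the response times have not been updated, the rule falls back to round robin; once they are updated, it picks servers by weight. What I observed is that even with the weighted response time rule configured, Ribbon always picked servers by round robin.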
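To make the fallback concrete, here is a minimal sketch of the cumulative-weight selection, written without any Ribbon dependency. The class and method names (WeightedPickSketch, cumulativeWeights, pick) are hypothetical, and the weight formula (sum of all average response times minus the server's own average) is my understanding of how Ribbon maintains the weights, so treat it as an assumption. The sample averages are the ones from the rxjava 1.0.10 log in this post.

```java
import java.util.ArrayList;
import java.util.List;

public class WeightedPickSketch {

    // Assumed formula: weight = (sum of all averages) - (own average),
    // stored cumulatively so the last entry is the sum of all weights.
    static List<Double> cumulativeWeights(double[] avgResponseTimes) {
        double total = 0.0;
        for (double t : avgResponseTimes) {
            total += t;
        }
        List<Double> weights = new ArrayList<>();
        double soFar = 0.0;
        for (double t : avgResponseTimes) {
            soFar += total - t;
            weights.add(soFar);
        }
        return weights;
    }

    // Mirrors the choose logic above. random01 stands in for random.nextDouble().
    // Returns -1 to signal "fall back to round robin".
    static int pick(List<Double> currentWeights, double random01) {
        double maxTotalWeight = currentWeights.isEmpty()
                ? 0 : currentWeights.get(currentWeights.size() - 1);
        if (maxTotalWeight < 0.001d) {
            return -1;
        }
        double randomWeight = random01 * maxTotalWeight;
        for (int i = 0; i < currentWeights.size(); i++) {
            if (currentWeights.get(i) >= randomWeight) {
                return i;
            }
        }
        return currentWeights.size() - 1;
    }

    public static void main(String[] args) {
        // github.com, www.baidu.com, blog.csdn.net averages from the 1.0.10 log
        List<Double> healthy = cumulativeWeights(new double[]{895.0, 162.0, 90.0});
        System.out.println(healthy);            // [252.0, 1237.0, 2294.0]
        System.out.println(pick(healthy, 0.5)); // 1

        // All averages stuck at 0.0, as in the 1.1.5 log
        List<Double> stuck = cumulativeWeights(new double[]{0.0, 0.0, 0.0});
        System.out.println(pick(stuck, 0.5));   // -1, i.e. round robin
    }
}
```

When every average response time stays at 0.0, the total weight never rises above the 0.001 threshold, so the round-robin branch is taken on every call. That matches the behavior described above.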

Problem analysis:

Stepping through with a debugger showed that the rxjava version is the cause. I'll use one of Ribbon's examples to illustrate:
package com.netflix.ribbon.examples.loadbalancer;

import com.google.common.collect.Lists;
import com.netflix.client.DefaultLoadBalancerRetryHandler;
import com.netflix.client.RetryHandler;
import com.netflix.loadbalancer.*;
import com.netflix.loadbalancer.reactive.LoadBalancerCommand;
import com.netflix.loadbalancer.reactive.ServerOperation;
import rx.Observable;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

public class URLConnectionLoadBalancer {
    private final ILoadBalancer loadBalancer;
    // retry handler that does not retry on same server, but on a different server
    private final RetryHandler retryHandler = new DefaultLoadBalancerRetryHandler(0, 1, true);

    public URLConnectionLoadBalancer(List<Server> serverList) {
        loadBalancer = LoadBalancerBuilder.newBuilder().buildFixedServerListLoadBalancer(serverList);
    }

    public String call(final String path) throws Exception {
        return LoadBalancerCommand.<String>builder()
                .withLoadBalancer(loadBalancer)
                .build()
                .submit(new ServerOperation<String>() {
                    public Observable<String> call(Server server) {
                        URL url;
                        try {
                            url = new URL("http://" + server.getHost() + ":" + server.getPort() + path);
                            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                            return Observable.just(conn.getResponseMessage());
                        } catch (Exception e) {
                            return Observable.error(e);
                        }
                    }
                }).toBlocking().first();
    }

    public LoadBalancerStats getLoadBalancerStats() {
        return ((BaseLoadBalancer) loadBalancer).getLoadBalancerStats();
    }

    public static void main(String[] args) throws Exception {
        URLConnectionLoadBalancer urlLoadBalancer = new URLConnectionLoadBalancer(Lists.newArrayList(
                new Server("www.baidu.com", 80),
                new Server("github.com", 80),
                new Server("blog.csdn.net", 80)));
        for (int i = 0; i < 6; i++) {
            System.out.println(urlLoadBalancer.call("/"));
        }
        System.out.println("=== Load balancer stats ===");
        System.out.println(urlLoadBalancer.getLoadBalancerStats());
    }
}


1. Ribbon 2.2.0, rxjava 1.0.10
Log output:
=== Load balancer stats ===
Zone stats: {},Server stats: [[Server:github.com:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:32 CST 2016;	First connection made: Tue Aug 30 22:15:31 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:895.0;	90 percentile resp time:979.0;	95 percentile resp time:979.0;	min resp time:811.0;	max resp time:979.0;	stddev resp time:84.0]
, [Server:www.baidu.com:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:33 CST 2016;	First connection made: Tue Aug 30 22:15:32 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:162.0;	90 percentile resp time:164.0;	95 percentile resp time:164.0;	min resp time:160.0;	max resp time:164.0;	stddev resp time:2.0]
, [Server:blog.csdn.net:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:33 CST 2016;	First connection made: Tue Aug 30 22:15:32 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:90.0;	90 percentile resp time:99.0;	95 percentile resp time:99.0;	min resp time:81.0;	max resp time:99.0;	stddev resp time:9.0]
]
2. Ribbon 2.2.0, rxjava 1.1.5
Log output:
=== Load balancer stats ===
Zone stats: {},Server stats: [[Server:github.com:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:20 CST 2016;	First connection made: Tue Aug 30 22:27:19 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
, [Server:www.baidu.com:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:21 CST 2016;	First connection made: Tue Aug 30 22:27:20 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
, [Server:blog.csdn.net:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:21 CST 2016;	First connection made: Tue Aug 30 22:27:20 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
]
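As the second log shows, the stats are not updated correctly: the average response time stays at 0.0, Total Requests stays at 0, and Active Connections never drops back to 0.
The Ribbon code that records these stats is: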

return operation.call(server).doOnEach(new Observer<T>() {
    private T entity;
    @Override
    public void onCompleted() {
        recordStats(tracer, stats, entity, null);
        // TODO: What to do if onNext or onError are never called?
    }

    @Override
    public void onError(Throwable e) {
        recordStats(tracer, stats, null, e);
        logger.debug("Got error {} when executed on server {}", e, server);
        if (listenerInvoker != null) {
            listenerInvoker.onExceptionWithServer(e, context.toExecutionInfo());
        }
    }

    @Override
    public void onNext(T entity) {
        this.entity = entity;
        if (listenerInvoker != null) {
            listenerInvoker.onExecutionSuccess(entity, context.toExecutionInfo());
        }
    }                            
    
    private void recordStats(Stopwatch tracer, ServerStats stats, Object entity, Throwable exception) {
        tracer.stop();
        loadBalancerContext.noteRequestCompletion(stats, entity, exception, tracer.getDuration(TimeUnit.MILLISECONDS), retryHandler);
    }
});

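As shown, onCompleted records the stats, while onNext does nothing with them. With 1.0.10, a normal run ends in onCompleted, which updates the stats; with 1.1.5, onCompleted is never reached.
The relevant rxjava 1.1.5 code (decompiled):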
public void request(long n) {
    if(!this.once) {
        if(n < 0L) {
            throw new IllegalStateException("n >= required but it was " + n);
        } else if(n != 0L) {
            this.once = true;
            Subscriber a = this.actual;
            if(!a.isUnsubscribed()) {
                Object v = this.value;

                try {
                    a.onNext(v);
                } catch (Throwable var6) {
                    Exceptions.throwOrReport(var6, a, v);
                    return;
                }

                if(!a.isUnsubscribed()) {
                    a.onCompleted();
                }
            }
        }
    }
}



Before calling onCompleted, it checks once more whether the subscriber has unsubscribed.

public void onNext(T i) {
    if(!this.isUnsubscribed() && this.count++ < OperatorTake.this.limit) {
        boolean stop = this.count == OperatorTake.this.limit;
        child.onNext(i);
        if(stop && !this.completed) {
            this.completed = true;

            try {
                child.onCompleted();
            } finally {
                this.unsubscribe();
            }
        }
    }

}

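first() goes through OperatorTake's call method, and the unsubscribe happens before onNext returns. By the time control gets back to the request method above, the subscriber is already unsubscribed, so onCompleted is skipped and the stats are never recorded.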
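The interaction between the two snippets above can be reproduced without rxjava. The following is a minimal sketch: the Sub interface, emitJust, and run are hypothetical stand-ins (not real RxJava types), and the model assumes the take(1) stage shares one subscription with the upstream subscriber, which is a simplification of what actually happens.

```java
import java.util.ArrayList;
import java.util.List;

public class UnsubscribeSketch {

    // Hypothetical stand-in for an rx Subscriber; not the real RxJava type.
    interface Sub {
        void onNext(String value);
        void onCompleted();
        boolean isUnsubscribed();
    }

    // Mimics the request() method shown above: emit one value, then
    // complete only if the subscriber is still subscribed.
    static void emitJust(String value, Sub actual) {
        if (actual.isUnsubscribed()) {
            return;
        }
        actual.onNext(value);
        if (!actual.isUnsubscribed()) {
            actual.onCompleted();
        }
    }

    // takeOne = true models first(): the downstream take(1) unsubscribes
    // before onNext returns. takeOne = false models a consumer that waits
    // for completion, such as last().
    static List<String> run(boolean takeOne) {
        List<String> events = new ArrayList<>();
        boolean[] unsubscribed = {false};
        // Plays the role of Ribbon's doOnEach observer:
        // stats are recorded only in onCompleted.
        Sub statsRecorder = new Sub() {
            @Override
            public void onNext(String value) {
                events.add("onNext");
                if (takeOne) {
                    unsubscribed[0] = true;
                }
            }

            @Override
            public void onCompleted() {
                events.add("recordStats");
            }

            @Override
            public boolean isUnsubscribed() {
                return unsubscribed[0];
            }
        };
        emitJust("OK", statsRecorder);
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // [onNext]
        System.out.println(run(false)); // [onNext, recordStats]
    }
}
```

In the first case the value is delivered but the completion callback, and with it the stats update, is skipped. In the second case nothing unsubscribes early, so the completion callback runs.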

With that, the problem can be resolved.
There are two options: 1. depend on rxjava 1.0.10; 2. call last() or forEach() instead of first().



