Ribbon Issue Log: A Newer RxJava Version Breaks Ribbon's Server Stats Tracking

Original post. August 30, 2016, 21:45:54

Problem description:

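Ribbon's weighted response time rule (WeightedResponseTimeRule) assigns each server a weight based on its response time: the longer the response time, the smaller the weight, and the less likely the server is to be chosen.
The key part of WeightedResponseTimeRule's choose method is: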
// last one in the list is the sum of all weights
double maxTotalWeight = currentWeights.size() == 0 ? 0 : currentWeights.get(currentWeights.size() - 1); 
// No server has been hit yet and total weight is not initialized
// fallback to use round robin
if (maxTotalWeight < 0.001d) {
    server =  super.choose(getLoadBalancer(), key);
    if(server == null) {
        return server;
    }
} else {
    // generate a random weight between 0 (inclusive) to maxTotalWeight (exclusive)
    double randomWeight = random.nextDouble() * maxTotalWeight;
    // pick the server index based on the randomIndex
    int n = 0;
    for (Double d : currentWeights) {
        if (d >= randomWeight) {
            serverIndex = n;
            break;
        } else {
            n++;
        }
    }

    server = allList.get(serverIndex);
}


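As long as no response times have been recorded, the rule falls back to round robin; once response times are recorded, servers are picked by weight. In my case, even with WeightedResponseTimeRule configured, Ribbon always picked servers by round robin.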
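To make the fallback concrete, here is a minimal, self-contained sketch of cumulative-weight selection. The class and method names (WeightSketch, cumulativeWeights, pick) are hypothetical, and the weight formula (sum of all average response times minus the server's own average) is my assumption about how Ribbon derives the weights, not a copy of its code. The point it illustrates: when every average response time is 0.0, the total weight is 0 and the rule can only fall back to round robin.

```java
import java.util.ArrayList;
import java.util.List;

class WeightSketch {
    // Assumed formula: weight = (sum of all averages) - (this server's average),
    // accumulated so that the last entry is the sum of all weights.
    static List<Double> cumulativeWeights(double[] avgResponseTimes) {
        double total = 0.0;
        for (double t : avgResponseTimes) {
            total += t;
        }
        List<Double> weights = new ArrayList<>();
        double weightSoFar = 0.0;
        for (double t : avgResponseTimes) {
            weightSoFar += total - t;
            weights.add(weightSoFar);
        }
        return weights;
    }

    // Returns the index of the chosen server, or -1 when the weights are not
    // initialized, in which case the caller would fall back to round robin.
    static int pick(List<Double> weights, double randomWeight) {
        double maxTotalWeight = weights.isEmpty() ? 0 : weights.get(weights.size() - 1);
        if (maxTotalWeight < 0.001d) {
            return -1;
        }
        for (int i = 0; i < weights.size(); i++) {
            if (weights.get(i) >= randomWeight) {
                return i;
            }
        }
        return weights.size() - 1;
    }

    public static void main(String[] args) {
        // Average response times taken from a healthy run: github, baidu, csdn.
        List<Double> healthy = cumulativeWeights(new double[] {895.0, 162.0, 90.0});
        System.out.println(healthy);              // [252.0, 1237.0, 2294.0]
        System.out.println(pick(healthy, 300.0)); // 1

        // Averages stuck at 0.0: the total weight is 0, so no weighted pick is possible.
        List<Double> broken = cumulativeWeights(new double[] {0.0, 0.0, 0.0});
        System.out.println(pick(broken, 0.0));    // -1
    }
}
```

The slowest server gets the narrowest slice of the range [0, maxTotalWeight), so a uniformly random weight lands on it least often.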

Analysis:

Stepping through with a debugger showed that the RxJava version is the cause. I will use one of Ribbon's own examples to demonstrate:
package com.netflix.ribbon.examples.loadbalancer;

import com.google.common.collect.Lists;
import com.netflix.client.DefaultLoadBalancerRetryHandler;
import com.netflix.client.RetryHandler;
import com.netflix.loadbalancer.*;
import com.netflix.loadbalancer.reactive.LoadBalancerCommand;
import com.netflix.loadbalancer.reactive.ServerOperation;
import rx.Observable;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

public class URLConnectionLoadBalancer {
    private final ILoadBalancer loadBalancer;
    // retry handler that does not retry on same server, but on a different server
    private final RetryHandler retryHandler = new DefaultLoadBalancerRetryHandler(0, 1, true);

    public URLConnectionLoadBalancer(List<Server> serverList) {
        loadBalancer = LoadBalancerBuilder.newBuilder().buildFixedServerListLoadBalancer(serverList);
    }

    public String call(final String path) throws Exception {
        return LoadBalancerCommand.<String>builder()
                .withLoadBalancer(loadBalancer)
                .build()
                .submit(new ServerOperation<String>() {
                    public Observable<String> call(Server server) {
                        URL url;
                        try {
                            url = new URL("http://" + server.getHost() + ":" + server.getPort() + path);
                            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                            return Observable.just(conn.getResponseMessage());
                        } catch (Exception e) {
                            return Observable.error(e);
                        }
                    }
                }).toBlocking().first();
    }

    public LoadBalancerStats getLoadBalancerStats() {
        return ((BaseLoadBalancer) loadBalancer).getLoadBalancerStats();
    }

    public static void main(String[] args) throws Exception {
        URLConnectionLoadBalancer urlLoadBalancer = new URLConnectionLoadBalancer(Lists.newArrayList(
                new Server("www.baidu.com", 80),
                new Server("github.com", 80),
                new Server("blog.csdn.net", 80)));
        for (int i = 0; i < 6; i++) {
            System.out.println(urlLoadBalancer.call("/"));
        }
        System.out.println("=== Load balancer stats ===");
        System.out.println(urlLoadBalancer.getLoadBalancerStats());
    }
}


1. Ribbon 2.2.0 with RxJava 1.0.10
Log output:
=== Load balancer stats ===
Zone stats: {},Server stats: [[Server:github.com:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:32 CST 2016;	First connection made: Tue Aug 30 22:15:31 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:895.0;	90 percentile resp time:979.0;	95 percentile resp time:979.0;	min resp time:811.0;	max resp time:979.0;	stddev resp time:84.0]
, [Server:www.baidu.com:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:33 CST 2016;	First connection made: Tue Aug 30 22:15:32 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:162.0;	90 percentile resp time:164.0;	95 percentile resp time:164.0;	min resp time:160.0;	max resp time:164.0;	stddev resp time:2.0]
, [Server:blog.csdn.net:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:33 CST 2016;	First connection made: Tue Aug 30 22:15:32 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:90.0;	90 percentile resp time:99.0;	95 percentile resp time:99.0;	min resp time:81.0;	max resp time:99.0;	stddev resp time:9.0]
]
2. Ribbon 2.2.0 with RxJava 1.1.5
Log output:
=== Load balancer stats ===
Zone stats: {},Server stats: [[Server:github.com:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:20 CST 2016;	First connection made: Tue Aug 30 22:27:19 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
, [Server:www.baidu.com:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:21 CST 2016;	First connection made: Tue Aug 30 22:27:20 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
, [Server:blog.csdn.net:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:21 CST 2016;	First connection made: Tue Aug 30 22:27:20 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
]
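The stats are clearly not being updated correctly: Total Requests stays at 0, Active Connections never drops back to 0, and the average response time is never updated. The Ribbon code that records the stats is: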

return operation.call(server).doOnEach(new Observer<T>() {
    private T entity;
    @Override
    public void onCompleted() {
        recordStats(tracer, stats, entity, null);
        // TODO: What to do if onNext or onError are never called?
    }

    @Override
    public void onError(Throwable e) {
        recordStats(tracer, stats, null, e);
        logger.debug("Got error {} when executed on server {}", e, server);
        if (listenerInvoker != null) {
            listenerInvoker.onExceptionWithServer(e, context.toExecutionInfo());
        }
    }

    @Override
    public void onNext(T entity) {
        this.entity = entity;
        if (listenerInvoker != null) {
            listenerInvoker.onExecutionSuccess(entity, context.toExecutionInfo());
        }
    }                            
    
    private void recordStats(Stopwatch tracer, ServerStats stats, Object entity, Throwable exception) {
        tracer.stop();
        loadBalancerContext.noteRequestCompletion(stats, entity, exception, tracer.getDuration(TimeUnit.MILLISECONDS), retryHandler);
    }
});

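As the code shows, onCompleted records the stats, while onNext does nothing with them. With RxJava 1.0.10, a normal execution ends in onCompleted, so the stats get updated; with 1.1.5, onCompleted is never reached. The producer's request method in RxJava 1.1.5 looks like this: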
public void request(long n) {
    if(!this.once) {
        if(n < 0L) {
            throw new IllegalStateException("n >= required but it was " + n);
        } else if(n != 0L) {
            this.once = true;
            Subscriber a = this.actual;
            if(!a.isUnsubscribed()) {
                Object v = this.value;

                try {
                    a.onNext(v);
                } catch (Throwable var6) {
                    Exceptions.throwOrReport(var6, a, v);
                    return;
                }

                if(!a.isUnsubscribed()) {
                    a.onCompleted();
                }
            }
        }
    }
}



Before calling onCompleted, request checks once more whether the subscriber has unsubscribed.

public void onNext(T i) {
    if(!this.isUnsubscribed() && this.count++ < OperatorTake.this.limit) {
        boolean stop = this.count == OperatorTake.this.limit;
        child.onNext(i);
        if(stop && !this.completed) {
            this.completed = true;

            try {
                child.onCompleted();
            } finally {
                this.unsubscribe();
            }
        }
    }

}

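first() goes through OperatorTake's call method, and the subscriber it creates (shown above) unsubscribes from inside onNext, before that onNext call has returned. By the time control is back in request, isUnsubscribed() is already true, so onCompleted is skipped and recordStats never runs.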
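The interaction between the two snippets above can be reproduced without RxJava. The following is a deliberately simplified model, not RxJava code: Subscription, StatsObserver and emitSingle are hypothetical names, the shared boolean stands in for the real subscription chain, and the guardCompletion flag is my assumption about what separates the two versions (a check before onCompleted versus none).

```java
import java.util.ArrayList;
import java.util.List;

class UnsubscribeSketch {
    // Shared flag standing in for the subscription chain between operators.
    static class Subscription {
        boolean unsubscribed;
    }

    // Stand-in for the observer that records stats: only onCompleted records them.
    static class StatsObserver {
        final List<String> events = new ArrayList<>();
        boolean statsRecorded;

        void onNext(String value) {
            events.add("onNext");
        }

        void onCompleted() {
            events.add("onCompleted");
            statsRecorded = true;
        }
    }

    // Emits a single value, modeled on the request method shown above.
    // takeFirst models first(): the downstream unsubscribes while onNext is still running.
    // guardCompletion models the isUnsubscribed() check before onCompleted.
    static StatsObserver emitSingle(String value, boolean takeFirst, boolean guardCompletion) {
        Subscription subscription = new Subscription();
        StatsObserver observer = new StatsObserver();
        if (!subscription.unsubscribed) {
            observer.onNext(value);
            if (takeFirst) {
                // The downstream has the one value it wanted and unsubscribes.
                subscription.unsubscribed = true;
            }
            if (!guardCompletion || !subscription.unsubscribed) {
                observer.onCompleted();
            }
        }
        return observer;
    }

    public static void main(String[] args) {
        // first() with the guard: onCompleted is skipped, stats are lost.
        System.out.println(emitSingle("OK", true, true).events);   // [onNext]
        // last()-style consumption with the guard: nothing unsubscribes early.
        System.out.println(emitSingle("OK", false, true).events);  // [onNext, onCompleted]
        // first() without the guard: onCompleted still fires.
        System.out.println(emitSingle("OK", true, false).events);  // [onNext, onCompleted]
    }
}
```

In this model, stats are lost only when early unsubscription and the guard occur together, which is why either changing the RxJava version or avoiding first() makes the problem go away.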

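With the cause identified, the problem can be fixed.
There are two options: 1. depend on RxJava 1.0.10; 2. call last() or forEach() instead of first(), so that nothing unsubscribes before the source completes.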




Copyright notice: This is an original article by the author. Do not repost it without the author's permission.
