
Ribbon issue notes: a newer rxjava version breaks Ribbon's server stats maintenance

Original post, 2016-08-30 21:45:54

Problem description:

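Ribbon's weighted response time rule (WeightedResponseTimeRule) assigns each server a weight based on its response time: the longer the response time, the smaller the weight, and the less likely the server is to be chosen.
The key part of WeightedResponseTimeRule's choose method is: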
// last one in the list is the sum of all weights
double maxTotalWeight = currentWeights.size() == 0 ? 0 : currentWeights.get(currentWeights.size() - 1); 
// No server has been hit yet and total weight is not initialized
// fallback to use round robin
if (maxTotalWeight < 0.001d) {
    server =  super.choose(getLoadBalancer(), key);
    if(server == null) {
        return server;
    }
} else {
    // generate a random weight between 0 (inclusive) to maxTotalWeight (exclusive)
    double randomWeight = random.nextDouble() * maxTotalWeight;
    // pick the server index based on the randomIndex
    int n = 0;
    for (Double d : currentWeights) {
        if (d >= randomWeight) {
            serverIndex = n;
            break;
        } else {
            n++;
        }
    }

    server = allList.get(serverIndex);
}


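As long as no response times have been recorded, the rule falls back to round robin; once response times are recorded, servers are picked by weight. What I observed is that even with the weighted response time rule configured, Ribbon always picked servers by round robin.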
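The selection logic above can be sketched in plain Java. This is a simplified model, not the Ribbon source: the class and method names (WeightedPickSketch, cumulativeWeights, pick) are made up for illustration, and the weight formula (sum of all average response times minus the server's own average) is an assumption about how the cumulative weight list gets built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Simplified model of response-time-weighted selection; NOT the Ribbon source. */
final class WeightedPickSketch {

    /**
     * Builds the cumulative weight list. Assumption: a server's weight is
     * (sum of all average response times - its own average), so slower
     * servers get a smaller slice of the total range.
     */
    static List<Double> cumulativeWeights(double[] avgResponseMillis) {
        double total = 0.0;
        for (double t : avgResponseMillis) {
            total += t;
        }
        List<Double> cumulative = new ArrayList<>();
        double soFar = 0.0;
        for (double t : avgResponseMillis) {
            soFar += total - t;
            cumulative.add(soFar);
        }
        return cumulative;
    }

    /**
     * Returns the chosen server index, or -1 when the total weight is still
     * (almost) zero, in which case the caller would fall back to round robin.
     */
    static int pick(List<Double> cumulative, double random01) {
        double maxTotalWeight = cumulative.isEmpty() ? 0.0 : cumulative.get(cumulative.size() - 1);
        if (maxTotalWeight < 0.001d) {
            return -1;
        }
        double randomWeight = random01 * maxTotalWeight;
        for (int i = 0; i < cumulative.size(); i++) {
            if (cumulative.get(i) >= randomWeight) {
                return i;
            }
        }
        return cumulative.size() - 1;
    }

    public static void main(String[] args) {
        // Hypothetical average response times in milliseconds for three servers.
        double[] avg = {895.0, 162.0, 90.0};
        List<Double> weights = cumulativeWeights(avg);
        System.out.println("cumulative weights: " + weights);

        Random random = new Random(42);
        int[] hits = new int[avg.length];
        for (int i = 0; i < 10_000; i++) {
            hits[pick(weights, random.nextDouble())]++;
        }
        System.out.println("hits per server: " + Arrays.toString(hits));

        // With no recorded response times every weight is 0, so pick signals the fallback.
        System.out.println("no stats yet: " + pick(cumulativeWeights(new double[] {0, 0, 0}), 0.5));
    }
}
```

Note the last case: if the average response times are never updated and stay at 0.0, the total weight never exceeds the 0.001 threshold, so a rule built this way keeps using round robin.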

Analysis:

Stepping through with a debugger showed that the rxjava version is the cause. I will use one of Ribbon's own examples to demonstrate:
package com.netflix.ribbon.examples.loadbalancer;

import com.google.common.collect.Lists;
import com.netflix.client.DefaultLoadBalancerRetryHandler;
import com.netflix.client.RetryHandler;
import com.netflix.loadbalancer.*;
import com.netflix.loadbalancer.reactive.LoadBalancerCommand;
import com.netflix.loadbalancer.reactive.ServerOperation;
import rx.Observable;

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

public class URLConnectionLoadBalancer {
    private final ILoadBalancer loadBalancer;
    // retry handler that does not retry on same server, but on a different server
    private final RetryHandler retryHandler = new DefaultLoadBalancerRetryHandler(0, 1, true);

    public URLConnectionLoadBalancer(List<Server> serverList) {
        loadBalancer = LoadBalancerBuilder.newBuilder().buildFixedServerListLoadBalancer(serverList);
    }

    public String call(final String path) throws Exception {
        return LoadBalancerCommand.<String>builder()
                .withLoadBalancer(loadBalancer)
                .build()
                .submit(new ServerOperation<String>() {
                    public Observable<String> call(Server server) {
                        URL url;
                        try {
                            url = new URL("http://" + server.getHost() + ":" + server.getPort() + path);
                            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                            return Observable.just(conn.getResponseMessage());
                        } catch (Exception e) {
                            return Observable.error(e);
                        }
                    }
                }).toBlocking().first();
    }

    public LoadBalancerStats getLoadBalancerStats() {
        return ((BaseLoadBalancer) loadBalancer).getLoadBalancerStats();
    }

    public static void main(String[] args) throws Exception {
        URLConnectionLoadBalancer urlLoadBalancer = new URLConnectionLoadBalancer(Lists.newArrayList(
                new Server("www.baidu.com", 80),
                new Server("github.com", 80),
                new Server("blog.csdn.net", 80)));
        for (int i = 0; i < 6; i++) {
            System.out.println(urlLoadBalancer.call("/"));
        }
        System.out.println("=== Load balancer stats ===");
        System.out.println(urlLoadBalancer.getLoadBalancerStats());
    }
}


1. Ribbon 2.2.0 with rxjava 1.0.10
Log output:
=== Load balancer stats ===
Zone stats: {},Server stats: [[Server:github.com:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:32 CST 2016;	First connection made: Tue Aug 30 22:15:31 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:895.0;	90 percentile resp time:979.0;	95 percentile resp time:979.0;	min resp time:811.0;	max resp time:979.0;	stddev resp time:84.0]
, [Server:www.baidu.com:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:33 CST 2016;	First connection made: Tue Aug 30 22:15:32 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:162.0;	90 percentile resp time:164.0;	95 percentile resp time:164.0;	min resp time:160.0;	max resp time:164.0;	stddev resp time:2.0]
, [Server:blog.csdn.net:80;	Zone:UNKNOWN;	Total Requests:2;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:15:33 CST 2016;	First connection made: Tue Aug 30 22:15:32 CST 2016;	Active Connections:0;	total failure count in last (1000) msecs:0;	average resp time:90.0;	90 percentile resp time:99.0;	95 percentile resp time:99.0;	min resp time:81.0;	max resp time:99.0;	stddev resp time:9.0]
]
2. Ribbon 2.2.0 with rxjava 1.1.5
Log output:
=== Load balancer stats ===
Zone stats: {},Server stats: [[Server:github.com:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:20 CST 2016;	First connection made: Tue Aug 30 22:27:19 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
, [Server:www.baidu.com:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:21 CST 2016;	First connection made: Tue Aug 30 22:27:20 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
, [Server:blog.csdn.net:80;	Zone:UNKNOWN;	Total Requests:0;	Successive connection failure:0;	Total blackout seconds:0;	Last connection made:Tue Aug 30 22:27:21 CST 2016;	First connection made: Tue Aug 30 22:27:20 CST 2016;	Active Connections:2;	total failure count in last (1000) msecs:0;	average resp time:0.0;	90 percentile resp time:0.0;	95 percentile resp time:0.0;	min resp time:0.0;	max resp time:0.0;	stddev resp time:0.0]
]
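With rxjava 1.1.5 the stats are clearly not being maintained correctly: the average response time stays at 0.0, Total Requests stays at 0, and Active Connections never drops back to 0.
The stats are recorded by the observer that Ribbon attaches to the operation with doOnEach: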

return operation.call(server).doOnEach(new Observer<T>() {
    private T entity;
    @Override
    public void onCompleted() {
        recordStats(tracer, stats, entity, null);
        // TODO: What to do if onNext or onError are never called?
    }

    @Override
    public void onError(Throwable e) {
        recordStats(tracer, stats, null, e);
        logger.debug("Got error {} when executed on server {}", e, server);
        if (listenerInvoker != null) {
            listenerInvoker.onExceptionWithServer(e, context.toExecutionInfo());
        }
    }

    @Override
    public void onNext(T entity) {
        this.entity = entity;
        if (listenerInvoker != null) {
            listenerInvoker.onExecutionSuccess(entity, context.toExecutionInfo());
        }
    }                            
    
    private void recordStats(Stopwatch tracer, ServerStats stats, Object entity, Throwable exception) {
        tracer.stop();
        loadBalancerContext.noteRequestCompletion(stats, entity, exception, tracer.getDuration(TimeUnit.MILLISECONDS), retryHandler);
    }
});

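The stats are recorded in onCompleted (and in onError); onNext does not touch them at all. With rxjava 1.0.10 a normal run ends in onCompleted, so the stats are updated. With rxjava 1.1.5 onCompleted is never reached. The following request method, decompiled from rxjava 1.1.5, shows why: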
public void request(long n) {
    if(!this.once) {
        if(n < 0L) {
            throw new IllegalStateException("n >= required but it was " + n);
        } else if(n != 0L) {
            this.once = true;
            Subscriber a = this.actual;
            if(!a.isUnsubscribed()) {
                Object v = this.value;

                try {
                    a.onNext(v);
                } catch (Throwable var6) {
                    Exceptions.throwOrReport(var6, a, v);
                    return;
                }

                if(!a.isUnsubscribed()) {
                    a.onCompleted();
                }
            }
        }
    }
}



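Before calling onCompleted, it checks once more whether the subscriber has been unsubscribed. The subscriber created by OperatorTake looks like this: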

public void onNext(T i) {
    if(!this.isUnsubscribed() && this.count++ < OperatorTake.this.limit) {
        boolean stop = this.count == OperatorTake.this.limit;
        child.onNext(i);
        if(stop && !this.completed) {
            this.completed = true;

            try {
                child.onCompleted();
            } finally {
                this.unsubscribe();
            }
        }
    }

}

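The first method goes through OperatorTake's call method, and the subscriber it creates unsubscribes before its onNext returns. By the time control is back in request, a.isUnsubscribed() is already true, so a.onCompleted() is skipped and recordStats never runs.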

That explains the problem.
There are two ways to fix it: 1. depend on rxjava 1.0.10; 2. call last or forEach instead of first.
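The interaction described above can be modeled in plain Java without rxjava on the classpath. This is only a sketch: Subscription, Observer, StatsStage, firstLike, and lastLike are hypothetical stand-ins, not the rxjava classes. Modeling the older version as an emitter that skips the unsubscribed check is also an assumption; the post only shows that onCompleted is reached there.

```java
/** Plain-Java model of the call chain; these are NOT the rxjava classes. */
final class UnsubscribeOrderSketch {

    /** Cancellation flag shared by the whole chain. */
    static final class Subscription {
        private boolean unsubscribed;
        void unsubscribe() { unsubscribed = true; }
        boolean isUnsubscribed() { return unsubscribed; }
    }

    interface Observer {
        void onNext(String value);
        void onCompleted();
    }

    /** Stands in for the doOnEach stage: stats are recorded only in onCompleted. */
    static final class StatsStage implements Observer {
        private final Observer child;
        boolean statsRecorded;

        StatsStage(Observer child) { this.child = child; }

        @Override public void onNext(String value) { child.onNext(value); }

        @Override public void onCompleted() {
            statsRecorded = true;
            child.onCompleted();
        }
    }

    /** Stands in for first(): takes one item, then unsubscribes inside onNext. */
    static Observer firstLike(final Subscription subscription) {
        return new Observer() {
            @Override public void onNext(String value) { subscription.unsubscribe(); }
            @Override public void onCompleted() { }
        };
    }

    /** Stands in for last()/forEach(): waits for completion, never unsubscribes early. */
    static Observer lastLike() {
        return new Observer() {
            @Override public void onNext(String value) { }
            @Override public void onCompleted() { }
        };
    }

    /** Emits one value; optionally re-checks the subscription before completing. */
    static void emitSingle(String value, Observer actual, Subscription subscription,
                           boolean checkBeforeComplete) {
        if (subscription.isUnsubscribed()) {
            return;
        }
        actual.onNext(value);
        if (!checkBeforeComplete || !subscription.isUnsubscribed()) {
            actual.onCompleted();
        }
    }

    /** Runs one request through the chain and reports whether stats were recorded. */
    static boolean statsRecorded(boolean checkBeforeComplete, boolean firstOnly) {
        Subscription subscription = new Subscription();
        Observer downstream = firstOnly ? firstLike(subscription) : lastLike();
        StatsStage stage = new StatsStage(downstream);
        emitSingle("OK", stage, subscription, checkBeforeComplete);
        return stage.statsRecorded;
    }

    public static void main(String[] args) {
        System.out.println("checked emitter + first-like:   " + statsRecorded(true, true));
        System.out.println("unchecked emitter + first-like: " + statsRecorded(false, true));
        System.out.println("checked emitter + last-like:    " + statsRecorded(true, false));
    }
}
```

In this model only the first combination loses the stats update, which mirrors the two fixes: use an emitter that completes unconditionally, or use a consumer that does not unsubscribe before completion arrives.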




Copyright notice: this is an original article by the blog author and may not be reposted without the author's permission.
