JStorm Source Code Analysis (4): A Deep Dive into the Task Concept and Implementation (Draft)

EdgarLeo

Article link: https://blog.csdn.net/u013126638/article/details/65627458

Task implements the Runnable interface, so some thread will eventually execute its run() method. The core logic of that method is as follows:

public void run() {
    try {
        taskShutdownDameon = this.execute();
    } catch (Throwable e) {
        LOG.error("init task take error", e);
        if (reportErrorDie != null) {
            reportErrorDie.report(e);
        } else {
            throw new RuntimeException(e);
        }
    }
}

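run() delegates the real work to execute() and only catches and handles exceptions itself. Note that Runnable.run() declares no checked exceptions, so it can only throw unchecked ones such as RuntimeException. The method therefore reports the failure through the TaskReportErrorAndDie instance (reportErrorDie). If that object is null, it wraps the exception in a RuntimeException and rethrows it.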
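The same pattern can be tried in isolation. The following is a minimal sketch, not JStorm code: ErrorReporter, CheckedWork, ReportingRunnable and ReportingRunnableDemo are hypothetical names introduced only for illustration, with ErrorReporter standing in for TaskReportErrorAndDie.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for JStorm's TaskReportErrorAndDie; not the real class.
interface ErrorReporter {
    void report(Throwable error);
}

// Hypothetical: work that may throw checked exceptions, like Task.execute().
interface CheckedWork {
    void call() throws Exception;
}

class ReportingRunnable implements Runnable {
    private final CheckedWork work;
    private final ErrorReporter reporter; // may be null

    ReportingRunnable(CheckedWork work, ErrorReporter reporter) {
        this.work = work;
        this.reporter = reporter;
    }

    @Override
    public void run() {
        try {
            work.call();
        } catch (Throwable e) {
            if (reporter != null) {
                reporter.report(e);            // hand the failure to the callback
            } else {
                throw new RuntimeException(e); // no callback: rethrow unchecked
            }
        }
    }
}

class ReportingRunnableDemo {
    // Runs a failing job on its own thread and returns what the reporter saw.
    static Throwable runFailingJob() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread worker = new Thread(new ReportingRunnable(
                () -> { throw new Exception("init failed"); },
                seen::set));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("reported: " + runFailingJob().getMessage());
    }
}
```

The callback matters because an exception that escapes run() only terminates the thread it ran on. Without a reporter, the code that started the thread would not learn about the failure.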

The execute() method performs the following steps in order:

public TaskShutdownDameon execute() throws Exception {
    taskSendTargets = echoToSystemBolt();

    // create thread to get tuple from zeroMQ,
    // and pass the tuple to bolt/spout
    taskTransfer = mkTaskSending(workerData);
    RunnableCallback baseExecutor = prepareExecutor();
    // set baseExecutors for update
    setBaseExecutors((BaseExecutors) baseExecutor);

    AsyncLoopThread executor_threads = new AsyncLoopThread(baseExecutor, false, Thread.MAX_PRIORITY, true);
    taskReceiver = mkTaskReceiver();

    List<AsyncLoopThread> allThreads = new ArrayList<AsyncLoopThread>();
    allThreads.add(executor_threads);

    LOG.info("Finished loading task " + componentId + ":" + taskId);

    taskShutdownDameon = getShutdown(allThreads, baseExecutor);
    return taskShutdownDameon;
}

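It obtains information about the target Tasks of this Task (taskSendTargets). After the current Task receives a tuple from an upstream node, it sends the tuple to the appropriate target Tasks so that the downstream nodes receive it;
It creates a TaskTransfer instance, which is used to send tuples;
It instantiates an Executor, which wraps the user's IRichSpout or IRichBolt implementation and runs the actual computation logic. It then creates and starts an AsyncLoopThread instance, executor_threads, to run the Executor;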

public BaseExecutors mkExecutor() {
    BaseExecutors baseExecutor = null;

    if (taskObj instanceof IBolt) {
        baseExecutor = new BoltExecutors(this);
    } else if (taskObj instanceof ISpout) {
        if (isSingleThread(stormConf) == true) {
            baseExecutor = new SingleThreadSpoutExecutors(this);
        } else {
            baseExecutor = new MultipleThreadSpoutExecutors(this);
        }
    }

    return baseExecutor;
}

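It creates a TaskReceiver instance, which is used to receive tuples;
If the task has to be stopped for some reason, the threads that were started, including executor_threads, must be stopped. For that purpose execute() returns a TaskShutdownDameon instance to the caller.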

Creating the TaskTransfer instance requires the context of the current task as well as a KryoTupleSerializer, which serializes outgoing tuples using the Kryo serialization framework.

prepareExecutor() and mkExecutor() create the Executors implementation that matches the type of the current node. There are three of them: BoltExecutors, SingleThreadSpoutExecutors, and MultipleThreadSpoutExecutors.

mkTaskReceiver() creates a TaskReceiver from the context information and registers the receiver's deserialization queue in deserializeQueues under the Task's id.

public TaskReceiver mkTaskReceiver() {
    String taskName = JStormServerUtils.getName(componentId, taskId);
    //if (isTaskBatchTuple)
    //    taskReceiver = new TaskBatchReceiver(this, taskId, stormConf, topologyContext, innerTaskTransfer, taskStatus, taskName);
    //else
    taskReceiver = new TaskReceiver(this, taskId, stormConf, topologyContext, innerTaskTransfer, taskStatus, taskName);
    deserializeQueues.put(taskId, taskReceiver.getDeserializeQueue());

    return taskReceiver;
}
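The idea of registering one queue per Task id can be sketched with standard library classes. This is only an illustration under assumptions: SketchReceiver and QueueRegistrySketch are hypothetical names, a LinkedBlockingQueue stands in for JStorm's DisruptorQueue, and the dispatch() method reflects my guess at how a component that receives raw messages might use such a map.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical receiver; a LinkedBlockingQueue stands in for DisruptorQueue.
class SketchReceiver {
    private final BlockingQueue<byte[]> deserializeQueue = new LinkedBlockingQueue<>();

    BlockingQueue<byte[]> getDeserializeQueue() {
        return deserializeQueue;
    }
}

class QueueRegistrySketch {
    // taskId -> inbound queue of that task (mirrors the role of deserializeQueues).
    static final Map<Integer, BlockingQueue<byte[]>> deserializeQueues = new ConcurrentHashMap<>();

    // Creates a receiver and registers its queue under the task id.
    static SketchReceiver mkReceiver(int taskId) {
        SketchReceiver receiver = new SketchReceiver();
        deserializeQueues.put(taskId, receiver.getDeserializeQueue());
        return receiver;
    }

    // Assumed usage: look up the queue by task id and enqueue the raw payload.
    // Returns false when no task with that id is registered.
    static boolean dispatch(int taskId, byte[] payload) {
        BlockingQueue<byte[]> queue = deserializeQueues.get(taskId);
        return queue != null && queue.offer(payload);
    }

    public static void main(String[] args) {
        SketchReceiver receiver = mkReceiver(7);
        System.out.println("task 7 accepted: " + dispatch(7, new byte[] {1, 2, 3}));
        System.out.println("task 8 accepted: " + dispatch(8, new byte[] {4}));
        System.out.println("queued for task 7: " + receiver.getDeserializeQueue().size());
    }
}
```

Keying the map by Task id lets the sender stay unaware of the receiver object itself. It only needs the id carried with the message.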

echoToSystemBolt() sends the startup message "startup" to the system node to announce that the current task has started.

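getShutdown() collects all threads running in the current task: ackerThread, which handles message acknowledgement (spout executors only); recvThreads, which receive messages and deserialize them; and serializeThreads, which serialize and send messages.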

public TaskShutdownDameon getShutdown(List<AsyncLoopThread> allThreads, RunnableCallback baseExecutor) {
    AsyncLoopThread ackerThread = null;
    if (baseExecutor instanceof SpoutExecutors) {
        ackerThread = ((SpoutExecutors) baseExecutor).getAckerRunnableThread();

        if (ackerThread != null) {
            allThreads.add(ackerThread);
        }
    }
    List<AsyncLoopThread> recvThreads = taskReceiver.getDeserializeThread();
    for (AsyncLoopThread recvThread : recvThreads) {
        allThreads.add(recvThread);
    }

    List<AsyncLoopThread> serializeThreads = taskTransfer.getSerializeThreads();
    allThreads.addAll(serializeThreads);
    TaskHeartbeatTrigger taskHeartbeatTrigger = ((BaseExecutors) baseExecutor).getTaskHbTrigger();

    TaskShutdownDameon shutdown = new TaskShutdownDameon(taskStatus, topologyId, taskId, allThreads, zkCluster, taskObj, this, taskHeartbeatTrigger);

    return shutdown;
}
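To make the purpose of collecting every thread into one object concrete, here is a small sketch of a shutdown handle. It assumes nothing about the real TaskShutdownDameon or AsyncLoopThread beyond what the snippet above shows: LoopThread, ShutdownHandle and ShutdownSketch are hypothetical classes, and the stop logic (a volatile flag plus interrupt and join) is my own choice for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loop thread; JStorm's AsyncLoopThread has a richer API.
class LoopThread {
    private volatile boolean running = true;
    private final Thread thread;

    LoopThread(String name) {
        thread = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(10); // placeholder for one unit of work
                } catch (InterruptedException e) {
                    return; // interrupted as part of shutdown
                }
            }
        }, name);
    }

    void start() {
        thread.start();
    }

    // Asks the loop to stop and waits until the thread has finished.
    void stop() {
        running = false;
        thread.interrupt();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isAlive() {
        return thread.isAlive();
    }
}

// Hypothetical counterpart of TaskShutdownDameon: one handle for all threads.
class ShutdownHandle {
    private final List<LoopThread> threads;

    ShutdownHandle(List<LoopThread> threads) {
        this.threads = new ArrayList<>(threads);
    }

    void shutdown() {
        for (LoopThread t : threads) {
            t.stop();
        }
    }

    boolean anyAlive() {
        for (LoopThread t : threads) {
            if (t.isAlive()) {
                return true;
            }
        }
        return false;
    }
}

class ShutdownSketch {
    // Starts three placeholder threads and returns the handle that owns them.
    static ShutdownHandle startTask() {
        List<LoopThread> allThreads = new ArrayList<>();
        allThreads.add(new LoopThread("executor"));
        allThreads.add(new LoopThread("deserializer"));
        allThreads.add(new LoopThread("serializer"));
        for (LoopThread t : allThreads) {
            t.start();
        }
        return new ShutdownHandle(allThreads);
    }

    public static void main(String[] args) {
        ShutdownHandle handle = startTask();
        System.out.println("alive before shutdown: " + handle.anyAlive());
        handle.shutdown();
        System.out.println("alive after shutdown: " + handle.anyAlive());
    }
}
```

The caller only keeps the handle. It does not need to know how many threads a task started or which component owns each of them.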

Now that we understand the core logic of Task, let us look at how a Task is constructed:

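After obtaining the basic context information, the constructor creates a TaskReportErrorAndDie instance, which records errors and abnormal stops that occur while the Task is being created and started. It also registers the configured ITaskHook objects in the user context.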

// create report error callback,
// in fact it is storm_cluster.report-task-error
ITaskReportErr reportError = new TaskReportError(zkCluster, topologyId, taskId);

// report error and halt worker
reportErrorDie = new TaskReportErrorAndDie(reportError, workHalt);
this.taskStats = new TaskBaseMetric(topologyId, componentId, taskId);
// register auto hook
List<String> listHooks = Config.getTopologyAutoTaskHooks(stormConf);
for (String hook : listHooks) {
    ITaskHook iTaskHook = (ITaskHook) Utils.newInstance(hook);
    userContext.addTaskHook(iTaskHook);
}
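The hook registration loop turns configured class names into objects. A plausible way to do that with plain reflection is sketched below. I have not checked how Utils.newInstance is implemented, so treat the reflection call as an assumption. Hook, LoggingHook and HookLoader are hypothetical names, and Hook is far smaller than the real ITaskHook.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical hook interface; JStorm's ITaskHook declares more callbacks.
interface Hook {
    String name();
}

// Hypothetical hook with an implicit no-argument constructor.
class LoggingHook implements Hook {
    @Override
    public String name() {
        return "logging";
    }
}

class HookLoader {
    // Instantiates every class name through its no-argument constructor.
    static List<Hook> loadHooks(List<String> classNames) {
        List<Hook> hooks = new ArrayList<>();
        for (String className : classNames) {
            try {
                Object instance = Class.forName(className)
                        .getDeclaredConstructor()
                        .newInstance();
                hooks.add((Hook) instance);
            } catch (ReflectiveOperationException | ClassCastException e) {
                throw new IllegalArgumentException("Cannot load hook " + className, e);
            }
        }
        return hooks;
    }

    public static void main(String[] args) {
        List<String> configured = Collections.singletonList(LoggingHook.class.getName());
        for (Hook hook : loadHooks(configured)) {
            System.out.println("registered hook: " + hook.name());
        }
    }
}
```

Loading by name means a topology can add hooks through configuration alone, at the cost of moving type errors from compile time to start-up time.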

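Next comes the most important step: obtaining the compute node to run, taskObj, from the context. This is a deserialization process that will be covered in detail later.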

LOG.info("Begin to deserialize taskObj " + componentId + ":" + this.taskId);

try {
    WorkerClassLoader.switchThreadContext();
    this.taskObj = Common.get_task_object(topologyContext.getRawTopology(), componentId, WorkerClassLoader.getInstance());
    WorkerClassLoader.restoreThreadContext();
} catch (Exception e) {
    if (reportErrorDie != null) {
        reportErrorDie.report(e);
    } else {
        throw e;
    }
}
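Judging by the method names, switchThreadContext() and restoreThreadContext() swap the thread's context class loader around the deserialization call. That reading is an assumption, since the article does not show their bodies. Note that in the quoted snippet the restore call is skipped when get_task_object throws. The sketch below shows the general switch-and-restore technique with a finally block. ContextLoaderSketch and withContextLoader are hypothetical names, not part of JStorm.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

class ContextLoaderSketch {
    // Runs the action with `loader` as the thread context class loader and
    // always restores the previous loader, even if the action throws.
    static <T> T withContextLoader(ClassLoader loader, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // An empty URLClassLoader is enough to observe the switch.
        ClassLoader custom = new URLClassLoader(new URL[0],
                ContextLoaderSketch.class.getClassLoader());
        ClassLoader seen = withContextLoader(custom,
                () -> Thread.currentThread().getContextClassLoader());
        System.out.println("inside action, custom loader active: " + (seen == custom));
        System.out.println("after action, custom loader active: "
                + (Thread.currentThread().getContextClassLoader() == custom));
    }
}
```

Libraries that resolve classes through the context class loader during deserialization will then find user classes that are visible only to the custom loader.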

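Finally, the constructor checks the configuration to see whether the current task should process tuples in batches (BatchTuple), which decides whether batch mode is enabled.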

isTaskBatchTuple = ConfigExtension.isTaskBatchTuple(stormConf);
LOG.info("Transfer/receive in batch mode :" + isTaskBatchTuple);

Having read the source code of Task, we can see that a Task consists of the following key components:

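The compute node to run, taskObj, which is wrapped in one of the Executors classes and executed by a thread that is created and started for it;
A TaskTransfer instance, which serializes tuples and sends them to other nodes; it will be covered separately later;
A TaskReceiver instance, which receives messages and deserializes tuples; it will also be covered separately later;
A set of DisruptorQueue instances managed by Task id, which serve as message queues; DisruptorQueue will be covered separately later.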
