[Temporal] Analyzing How a childWorkflow Is Executed

萌兰三太子

Published 2024-09-08 09:41:51

Article link: https://blog.csdn.net/m0_47495420/article/details/142036196

This post gives a brief walkthrough of what happens when a child workflow is executed, following the source code step by step.

Source path: temporal-go/internal/workflow.go

Client call

We start from the method the caller invokes and follow the calls down to the entry point of the main logic:

func (wc *workflowEnvironmentImpl) ExecuteChildWorkflow(
   params ExecuteWorkflowParams, callback ResultHandler, startedHandler func(r WorkflowExecution, e error),
) {
   if params.WorkflowID == "" {
      params.WorkflowID = wc.workflowInfo.WorkflowExecution.RunID + "_" + wc.GenerateSequenceID()
   }
   memo, err := getWorkflowMemo(params.Memo, wc.dataConverter)
   if err != nil {
      if wc.sdkFlags.tryUse(SDKFlagChildWorkflowErrorExecution, !wc.isReplay) {
         startedHandler(WorkflowExecution{}, &ChildWorkflowExecutionAlreadyStartedError{})
      }
      callback(nil, err)
      return
   }
   searchAttr, err := serializeSearchAttributes(params.SearchAttributes)
   if err != nil {
      if wc.sdkFlags.tryUse(SDKFlagChildWorkflowErrorExecution, !wc.isReplay) {
         startedHandler(WorkflowExecution{}, &ChildWorkflowExecutionAlreadyStartedError{})
      }
      callback(nil, err)
      return
   }

   attributes := &commandpb.StartChildWorkflowExecutionCommandAttributes{}
   attributes.Namespace = params.Namespace
   attributes.TaskQueue = &taskqueuepb.TaskQueue{Name: params.TaskQueueName, Kind: enumspb.TASK_QUEUE_KIND_NORMAL}
   attributes.WorkflowId = params.WorkflowID
   attributes.WorkflowExecutionTimeout = &params.WorkflowExecutionTimeout
   attributes.WorkflowRunTimeout = &params.WorkflowRunTimeout
   attributes.WorkflowTaskTimeout = &params.WorkflowTaskTimeout
   attributes.Input = params.Input
   attributes.WorkflowType = &commonpb.WorkflowType{Name: params.WorkflowType.Name}
   attributes.WorkflowIdReusePolicy = params.WorkflowIDReusePolicy
   attributes.ParentClosePolicy = params.ParentClosePolicy
   attributes.RetryPolicy = params.RetryPolicy
   attributes.Header = params.Header
   attributes.Memo = memo
   attributes.SearchAttributes = searchAttr
   if len(params.CronSchedule) > 0 {
      attributes.CronSchedule = params.CronSchedule
   }
   attributes.UseCompatibleVersion = determineUseCompatibleFlagForCommand(
      params.VersioningIntent, wc.workflowInfo.TaskQueueName, params.TaskQueueName)

   command, err := wc.commandsHelper.startChildWorkflowExecution(attributes)
   if _, ok := err.(*childWorkflowExistsWithId); ok {
      if wc.sdkFlags.tryUse(SDKFlagChildWorkflowErrorExecution, !wc.isReplay) {
         startedHandler(WorkflowExecution{}, &ChildWorkflowExecutionAlreadyStartedError{})
      }
      callback(nil, &ChildWorkflowExecutionAlreadyStartedError{})
      return
   }
   command.setData(&scheduledChildWorkflow{
      resultCallback:      callback,
      startedCallback:     startedHandler,
      waitForCancellation: params.WaitForCancellation,
   })

   wc.logger.Debug("ExecuteChildWorkflow",
      tagChildWorkflowID, params.WorkflowID,
      tagWorkflowType, params.WorkflowType.Name)
}

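The logic above is roughly:

1. Process the execution parameters: fill in a default WorkflowID if none was given, then serialize the memo and the search attributes.
2. Build a command object and attach the result and started callbacks to it; the actual work happens asynchronously.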
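
The two steps above can be sketched with a toy model. This is a minimal sketch, not the SDK's code: the types `childParams`, `pendingChild`, and `env`, the `deliver` helper, and the sequence counter starting at 0 are all assumptions made for illustration. It only shows the shape of the pattern: defaults are filled in, the request is queued with its callbacks, and nothing blocks.

```go
package main

import (
	"errors"
	"fmt"
)

// childParams is a hypothetical, trimmed-down stand-in for ExecuteWorkflowParams.
type childParams struct {
	WorkflowID   string
	WorkflowType string
}

// pendingChild keeps the callbacks that fire later, once progress is reported.
type pendingChild struct {
	params    childParams
	onStarted func(workflowID string, err error)
	onResult  func(result string, err error)
}

// env is a toy workflow environment: a run ID, a sequence counter, and queued children.
type env struct {
	runID   string
	nextSeq int
	pending []*pendingChild
}

// executeChild fills in defaults, validates, and queues the request. It never blocks.
func (e *env) executeChild(p childParams, onStarted func(string, error), onResult func(string, error)) {
	if p.WorkflowID == "" {
		// Same shape as the excerpt: parent run ID + "_" + a per-run sequence value.
		p.WorkflowID = fmt.Sprintf("%s_%d", e.runID, e.nextSeq)
		e.nextSeq++
	}
	if p.WorkflowType == "" {
		// Invalid input is reported through the callbacks, not a return value.
		err := errors.New("workflow type is required")
		onStarted("", err)
		onResult("", err)
		return
	}
	e.pending = append(e.pending, &pendingChild{params: p, onStarted: onStarted, onResult: onResult})
}

// deliver simulates later events arriving for every queued child (hypothetical helper).
func (e *env) deliver(result string) {
	for _, c := range e.pending {
		c.onStarted(c.params.WorkflowID, nil)
		c.onResult(result, nil)
	}
	e.pending = nil
}

func main() {
	e := &env{runID: "run-1"}
	e.executeChild(childParams{WorkflowType: "Child"},
		func(id string, err error) { fmt.Println("started:", id, err) },
		func(res string, err error) { fmt.Println("result:", res, err) })
	fmt.Println("queued:", len(e.pending)) // callbacks have not fired yet
	e.deliver("done")
}
```

Note that `executeChild` returns before either callback runs on the success path; that is the asynchronous behavior the summary refers to.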

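Command pool

Executing a child workflow is not a straight procedural call. Instead, a command object is constructed and placed into a command pool, where it waits to be processed.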

func (h *commandsHelper) startChildWorkflowExecution(attributes *commandpb.StartChildWorkflowExecutionCommandAttributes) (commandStateMachine, error) {
   command := h.newChildWorkflowCommandStateMachine(attributes)
   if h.commands[command.getID()] != nil {
      return nil, &childWorkflowExistsWithId{id: attributes.WorkflowId}
   }
   h.addCommand(command)
   return command, nil
}

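The logic above is roughly:

1. Build a command state machine for the child workflow, with the initial state commandStateCreated.
2. The state machine also keeps a history of its states, initialized to []string{commandStateCreated.String()}.
3. If a command with the same ID already exists, return a childWorkflowExistsWithId error; otherwise add the command to the command pool (an ordered queue) via addCommand.
4. Adding the command also increments the shared nextCommandEventId counter.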
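
To make the four steps concrete, here is a minimal sketch of such a pool using only the standard library. It assumes the pool is an insertion-ordered list plus a map index by ID, which matches the `command.Value.(commandStateMachine)` access seen later in the post; the names `pool`, `command`, `add`, and `nextEventID` are hypothetical and simplified.

```go
package main

import (
	"container/list"
	"fmt"
)

// command is a toy state machine: an ID, a current state, and a state history.
type command struct {
	id      string
	state   string
	history []string
}

// pool keeps commands in insertion order and indexes them by ID.
type pool struct {
	ordered     *list.List               // insertion order of commands
	byID        map[string]*list.Element // ID -> element inside ordered
	nextEventID int64                    // incremented on every successful add
}

func newPool() *pool {
	return &pool{ordered: list.New(), byID: make(map[string]*list.Element)}
}

// add rejects duplicate IDs; otherwise it appends the command and bumps the counter.
func (p *pool) add(id string) (*command, error) {
	if _, exists := p.byID[id]; exists {
		return nil, fmt.Errorf("command %q already exists", id)
	}
	c := &command{id: id, state: "Created", history: []string{"Created"}}
	p.byID[id] = p.ordered.PushBack(c)
	p.nextEventID++
	return c, nil
}

func main() {
	p := newPool()
	if _, err := p.add("child-a"); err != nil {
		fmt.Println("unexpected error:", err)
	}
	_, err := p.add("child-a") // second add with the same ID is rejected
	fmt.Println("size:", p.ordered.Len(), "counter:", p.nextEventID, "err:", err)
}
```

Keeping both a list and a map gives ordered iteration when commands are collected and constant-time lookup by ID when events come back.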

Fetching a command

Once a command is in the pool, when is it fetched and executed? Here is the code that fetches a command:

func (h *commandsHelper) getCommand(id commandID) commandStateMachine {
   command, ok := h.commands[id]
   if !ok {
      panicMsg := fmt.Sprintf("unknown command %v, possible causes are nondeterministic workflow definition code"+
         " or incompatible change in the workflow definition", id)
      panicIllegalState(panicMsg)
   }
   return command.Value.(commandStateMachine)
}

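This looks up the command with the given ID in the command pool. The lookup does not remove the command, and an unknown ID is treated as an illegal state and panics. For a child workflow, the id here is the workflow ID.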
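
The lookup behavior can be sketched as follows. This is an illustration under assumptions: the composite key `commandKey` (a command kind plus a string ID) is a guess at how IDs of different command types are kept apart and is not confirmed by the excerpt; `registry` and `tryGet` are hypothetical names.

```go
package main

import "fmt"

type commandKind int

const (
	kindActivity commandKind = iota
	kindChildWorkflow
)

// commandKey is a hypothetical composite key: the command kind plus a string ID.
type commandKey struct {
	kind commandKind
	id   string
}

// registry maps a key to the command's current state (a toy value type).
type registry map[commandKey]string

// get returns the state for a known key and panics for an unknown one,
// mirroring the panic-on-unknown-ID behavior of the excerpt.
func (r registry) get(k commandKey) string {
	state, ok := r[k]
	if !ok {
		panic(fmt.Sprintf("unknown command %v", k))
	}
	return state
}

// tryGet wraps get and converts the panic into an error, for demonstration only.
func (r registry) tryGet(k commandKey) (state string, err error) {
	defer func() {
		if rec := recover(); rec != nil {
			err = fmt.Errorf("lookup failed: %v", rec)
		}
	}()
	return r.get(k), nil
}

func main() {
	key := commandKey{kind: kindChildWorkflow, id: "order-42"}
	r := registry{key: "Created"}

	state, err := r.tryGet(key)
	fmt.Println(state, err, len(r)) // the entry is still in the registry

	_, err = r.tryGet(commandKey{kind: kindActivity, id: "order-42"})
	fmt.Println(err)
}
```

With a composite key, the same string ID under a different kind is a different entry, which is why the second lookup fails.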

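Execution entry point

Next, let's look at where commands are actually fetched. There are many call sites, for example where a child workflow is started, canceled, or closed with a failure.
We know the command state machine starts in the commandStateCreated state, so let's find the code that acts on that state: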

func (d *childWorkflowCommandStateMachine) getCommand() *commandpb.Command {
   switch d.state {
   case commandStateCreated:
      command := createNewCommand(enumspb.COMMAND_TYPE_START_CHILD_WORKFLOW_EXECUTION)
      command.Attributes = &commandpb.Command_StartChildWorkflowExecutionCommandAttributes{StartChildWorkflowExecutionCommandAttributes: d.attributes}
      return command
   case commandStateCanceledAfterStarted:
      command := createNewCommand(enumspb.COMMAND_TYPE_REQUEST_CANCEL_EXTERNAL_WORKFLOW_EXECUTION)
      command.Attributes = &commandpb.Command_RequestCancelExternalWorkflowExecutionCommandAttributes{RequestCancelExternalWorkflowExecutionCommandAttributes: &commandpb.RequestCancelExternalWorkflowExecutionCommandAttributes{
         Namespace:         d.attributes.Namespace,
         WorkflowId:        d.attributes.WorkflowId,
         ChildWorkflowOnly: true,
      }}
      return command
   default:
      return nil
   }
}
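
The idea that only certain states produce an outgoing command can be shown in isolation. The sketch below is a simplified model, not the SDK's types: `state`, `childMachine`, `pendingCommand`, and `collect` are hypothetical, and the command names are plain strings rather than protobuf enums.

```go
package main

import "fmt"

type state int

const (
	stateCreated state = iota
	stateCommandSent
	stateCanceledAfterStarted
	stateCompleted
)

// childMachine is a toy child-workflow state machine.
type childMachine struct {
	workflowID string
	state      state
}

// pendingCommand returns the command this machine wants to send in its
// current state, or "" when the state produces no command.
func (m *childMachine) pendingCommand() string {
	switch m.state {
	case stateCreated:
		return "StartChildWorkflowExecution"
	case stateCanceledAfterStarted:
		return "RequestCancelExternalWorkflowExecution"
	default:
		return ""
	}
}

// collect walks the machines in order and gathers the commands to send,
// skipping machines whose state has nothing to emit.
func collect(machines []*childMachine) []string {
	var out []string
	for _, m := range machines {
		if c := m.pendingCommand(); c != "" {
			out = append(out, c+":"+m.workflowID)
		}
	}
	return out
}

func main() {
	machines := []*childMachine{
		{workflowID: "a", state: stateCreated},
		{workflowID: "b", state: stateCommandSent},
		{workflowID: "c", state: stateCanceledAfterStarted},
	}
	for _, c := range collect(machines) {
		fmt.Println(c)
	}
}
```

A machine that has already sent its command stays in the pool but contributes nothing on the next pass, which is the role of the `default` branch.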

Following the call chain upward, we can see that fetching and executing commands is ultimately triggered from here:

func (bw *baseWorker) runTaskDispatcher() {
   defer bw.stopWG.Done()

   for i := 0; i < bw.options.maxConcurrentTask; i++ {
      bw.pollerRequestCh <- struct{}{}
   }

   for {
      // wait for new task or worker stop
      select {
      case <-bw.stopCh:
         return
      case task := <-bw.taskQueueCh:
         // for non-polled-task (local activity result as task), we don't need to rate limit
         _, isPolledTask := task.(*polledTask)
         if isPolledTask && bw.taskLimiter.Wait(bw.limiterContext) != nil {
            if bw.isStop() {
               return
            }
         }
         bw.stopWG.Add(1)
         go bw.processTask(task)
      }
   }
}

This is the code covered in another post: when a Temporal worker starts, it launches a loop like this one that keeps dispatching tasks for execution.
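
The dispatch loop pattern itself can be reproduced with channels and a WaitGroup. The following is a minimal sketch with hypothetical names (`dispatcher`, `start`, `shutdown`); it leaves out the rate limiter and the poller-request seeding from the original and only shows the select on a stop channel versus a task channel, with one goroutine per task.

```go
package main

import (
	"fmt"
	"sync"
)

// dispatcher is a toy version of the worker loop: tasks arrive on a channel
// and each one is processed in its own goroutine until stop is closed.
type dispatcher struct {
	tasks chan string
	stop  chan struct{}
	wg    sync.WaitGroup

	mu        sync.Mutex
	processed []string
}

func newDispatcher() *dispatcher {
	return &dispatcher{tasks: make(chan string), stop: make(chan struct{})}
}

// start launches the dispatch loop in the background.
func (d *dispatcher) start() {
	d.wg.Add(1)
	go d.run()
}

// run waits for either a stop signal or a new task.
func (d *dispatcher) run() {
	defer d.wg.Done()
	for {
		select {
		case <-d.stop:
			return
		case t := <-d.tasks:
			d.wg.Add(1) // track the task before its goroutine starts
			go d.process(t)
		}
	}
}

// process records the task; real work would happen here.
func (d *dispatcher) process(t string) {
	defer d.wg.Done()
	d.mu.Lock()
	defer d.mu.Unlock()
	d.processed = append(d.processed, t)
}

// shutdown stops the loop and waits for in-flight tasks to finish.
func (d *dispatcher) shutdown() {
	close(d.stop)
	d.wg.Wait()
}

func main() {
	d := newDispatcher()
	d.start()
	for _, t := range []string{"task-1", "task-2", "task-3"} {
		d.tasks <- t // unbuffered: returns once the loop has received the task
	}
	d.shutdown()
	fmt.Println("processed:", len(d.processed))
}
```

Because the task channel is unbuffered, every send completes only after the loop has picked the task up, so all three tasks are tracked by the WaitGroup before `shutdown` returns. The order in which they finish is not guaranteed.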