Airflow Source Code Close Reading, Part 11

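Execution of a TaskInstance starts in the run function, which checks dependencies and then updates the state to running. If the task fails, handle_failure is called to handle the error, including a possible retry. The _run_raw_task method executes the task directly without checking state, then sets the final state and runs the callbacks. Along the way it deals with signal handling, template rendering, XCom management, and task timeouts.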

TaskInstance

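The entry point for executing a task instance is run.

  • First, _check_and_change_state_before_execution checks the task's dependencies.

  • If the check passes, the task's state is updated to running and the task is executed.

  • If the task fails, handle_failure is called to handle the error, for example by deciding whether to retry.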
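
The three steps above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Airflow's code: `ToyTaskInstance`, `deps_met`, the string states, and the retry rule are hypothetical stand-ins. Only the method names `run`, `_check_and_change_state_before_execution`, `_run_raw_task`, and `handle_failure` come from the source, and where exactly the real code calls `handle_failure` is not shown in the excerpt.

```python
class ToyTaskInstance:
    """Hypothetical stand-in for TaskInstance; not Airflow's real class."""

    def __init__(self, execute, deps_met=True, retries=0):
        self.execute = execute      # callable standing in for the operator
        self.deps_met = deps_met    # pretend outcome of the dependency check
        self.retries = retries      # number of retries allowed
        self.try_number = 0
        self.state = None

    def _check_and_change_state_before_execution(self):
        # Step 1: dependency check; only on success move to "running".
        if not self.deps_met:
            return False
        self.try_number += 1
        self.state = "running"
        return True

    def _run_raw_task(self):
        # Step 2: run the task and record the final state.
        try:
            self.execute()
            self.state = "success"
        except Exception as exc:
            # Step 3: hand the error to the failure handler.
            self.handle_failure(exc)

    def handle_failure(self, error):
        # Assumed rule: retry while attempts remain, otherwise fail.
        if self.try_number <= self.retries:
            self.state = "up_for_retry"
        else:
            self.state = "failed"

    def run(self):
        if self._check_and_change_state_before_execution():
            self._run_raw_task()
        return self.state


def boom():
    raise RuntimeError("task blew up")


print(ToyTaskInstance(lambda: None).run())                  # success
print(ToyTaskInstance(boom, retries=1).run())               # up_for_retry
print(ToyTaskInstance(boom, retries=0).run())               # failed
print(ToyTaskInstance(lambda: None, deps_met=False).run())  # None
```

The last case shows why run returns early: when the dependency check fails, the state is never changed to running and the task body is never called.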

    @provide_session
    def run(...):
        res = self._check_and_change_state_before_execution(
                verbose=verbose,
                ignore_all_deps=ignore_all_deps,
                ignore_depends_on_past=ignore_depends_on_past,
                ignore_task_deps=ignore_task_deps,
                ignore_ti_state=ignore_ti_state,
                mark_success=mark_success,
                test_mode=test_mode,
                job_id=job_id,
                pool=pool,
                session=session)
        if res:
            # Run the task through the internal method
            self._run_raw_task(
                    mark_success=mark_success,
                    test_mode=test_mode,
                    job_id=job_id,
                    pool=pool,
                    session=session)

    @provide_session
    def _run_raw_task(
            self,
            mark_success=False,
            test_mode=False,
            job_id=None,
            pool=None,
            session=None):
        """
        Run the operator's task.
        Immediately runs the task (without checking or changing db state
        before execution) and then sets the appropriate final state after
        completion and runs any post-execute callbacks. Meant to be called
        only after another function changes the state to running.

        :param mark_success: Don't run the task, mark its state as success
        :type mark_success: boolean
        :param test_mode: Doesn't record success or failure in the DB
        :type test_mode: boolean
        :param pool: specifies the pool to use to run the task instance
        :type pool: str
        """
        task = self.task
        self.pool = pool or task.pool
        self.test_mode = test_mode
        self.refresh_from_db(session=session)
        self.job_id = job_id
        self.hostname = socket.getfqdn()
        self.operator = task.__class__.__name__

        context = {}
        try:
            if not mark_success:
                context = self.get_template_context()
                # Work on a shallow copy of the task
                task_copy = copy.copy(task)
                self.task = task_copy
                # Handle external signals.
                # SIGTERM is the normal kill request: it gives the process a chance to react
                def signal_handler(signum, frame):
                    """Setting kill signal handler"""
                    self.log.error("Killing subprocess")
                    task_copy.on_kill()
                    raise AirflowException("Task received SIGTERM signal")
                signal.signal(signal.SIGTERM, signal_handler)

                # Don't clear Xcom until the task is certain to execute
                self.clear_xcom_data()

                self.render_templates()
                # Pre-execution hook
                task_copy.pre_execute(context=context)

                # If a timeout is specified for the task, make it fail
                # if it goes beyond
                result = None
                # Run the task: call the actual work defined by the operator
                if task_copy.execution_timeout:
                    try:
                        with timeout(int(
                                task_copy.execution_timeout.total_seconds())):
                            # task_copy.execute(context=context) runs different logic depending on the Operator type
                            result = task_copy.execute(context=context)
                    except AirflowTaskTimeout:
                        task_copy.on_kill()
                        raise
                else:
                    # No timeout configured
                    result = task_copy.execute(context=context)

                # If the task returns a result, push an XCom containing it
                if result is not None:
                    # Push the result to XCom
                    self.xcom_push(key=XCOM_RETURN_KEY, value=result)

                # TODO remove deprecated behavior in Airflow 2.0
                try:
                    # Call the post-execute hook
                    task_copy.post_execute(context=context, result=result)
                except TypeError as e:
                    if 'unexpected keyword argument' in str(e):
                        warnings.warn(
                            'BaseOperator.post_execute() now takes two '
                            'arguments, `context` and `result`, but "{}" only '
                            'expected one. This behavior is deprecated and '
                            'will be removed in a future version of '
                            'Airflow.'.format(self.task_id),
                            category=DeprecationWarning)
                        task_copy.post_execute(context=context)
                    else:
                        raise

                Stats.incr('operator_successes_{}'.format(
                    self.task.__class__.__name__), 1, 1)
                Stats.incr('ti_successes')
            # Update the state in the database
            self.refresh_from_db(lock_for_update=True)
            self.state = State.SUCCESS
        except AirflowSkipException:
            # ... error handling

        # Recording SUCCESS
        self.end_date = datetime.utcnow()
        # ...
        session.commit()
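
The result handling and the deprecated post_execute fallback in the listing above can be exercised in isolation. The following is a minimal sketch under stated assumptions, not Airflow's implementation: `finish_task`, `NewStyleOperator`, `OldStyleOperator`, and the plain dict used as an XCom store are hypothetical, and the value given to `XCOM_RETURN_KEY` is assumed for the example.

```python
import warnings

XCOM_RETURN_KEY = "return_value"  # assumed value for this sketch


class NewStyleOperator:
    """Hypothetical operator whose post_execute accepts `result`."""

    def execute(self, context):
        return 42

    def post_execute(self, context, result=None):
        context["seen_result"] = result


class OldStyleOperator:
    """Hypothetical operator with the old one-argument post_execute."""

    def execute(self, context):
        return None

    def post_execute(self, context):
        context["old_hook_ran"] = True


def finish_task(task, context, xcom):
    """Run the task, push a non-None result, then call post_execute."""
    result = task.execute(context=context)

    # Only a non-None return value is pushed, as in the listing.
    if result is not None:
        xcom[XCOM_RETURN_KEY] = result

    try:
        task.post_execute(context=context, result=result)
    except TypeError as e:
        # Old-style hooks reject the `result` keyword; warn and retry
        # with the one-argument call. Any other TypeError is re-raised.
        if "unexpected keyword argument" in str(e):
            warnings.warn(
                "post_execute() should accept `context` and `result`",
                category=DeprecationWarning,
            )
            task.post_execute(context=context)
        else:
            raise
    return result


xcom, context = {}, {}
finish_task(NewStyleOperator(), context, xcom)
print(xcom)  # {'return_value': 42}
```

Matching on the error message keeps a genuine TypeError raised inside a new-style post_execute from being mistaken for a signature mismatch, which is why the else branch re-raises.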
            