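Executor: the task executor
An executor runs tasks. Each executor has a parallelism setting, which caps the number of tasks it may run at the same time.
The tasks held by an executor fall into three groups:
Tasks that have not started yet (self.queued_tasks)
Tasks that are currently running (self.running)
Tasks that have finished (self.event_buffer)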
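The three groups and the parallelism limit can be sketched with a toy class. This is a minimal illustration only: ToyExecutor, queue, heartbeat and finish are hypothetical names chosen for the example, and the container types are assumptions rather than the ones Airflow's executor uses.

```python
from collections import deque


class ToyExecutor:
    """Hypothetical stand-in for an executor; not Airflow's BaseExecutor."""

    def __init__(self, parallelism):
        self.parallelism = parallelism   # upper bound on concurrently running tasks
        self.queued_tasks = deque()      # submitted but not started
        self.running = set()             # currently executing
        self.event_buffer = {}           # finished: key -> final state

    def queue(self, key):
        self.queued_tasks.append(key)

    def heartbeat(self):
        """Start queued tasks until the parallelism limit is reached."""
        open_slots = self.parallelism - len(self.running)
        for _ in range(min(open_slots, len(self.queued_tasks))):
            self.running.add(self.queued_tasks.popleft())

    def finish(self, key, state):
        """Move a running task into the finished buffer."""
        self.running.remove(key)
        self.event_buffer[key] = state


executor = ToyExecutor(parallelism=2)
for key in ("t1", "t2", "t3"):
    executor.queue(key)
executor.heartbeat()               # t1 and t2 start, t3 has to wait
executor.finish("t1", "success")
executor.heartbeat()               # the freed slot goes to t3
```

With a parallelism of 2, the third task only starts after one of the first two has moved into the finished buffer.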
Executor subclasses include:
Celery executor
Local executor
Debug executor
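CeleryExecutor
Celery is a task queue used to run tasks in a distributed fashion.
CeleryExecutor runs tasks remotely. Its workers run as multiple processes and can be spread across different servers. All of them connect to a shared queue (Redis or another message queue), which gives a distributed producer/consumer model: CeleryExecutor receives a task and submits it to the distributed Celery task queue, and a Celery worker consumes tasks from its assigned queue and runs the command each one carries.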
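The producer/consumer model can be sketched with the standard library alone. This is not Celery: an in-process queue.Queue stands in for the shared Redis/MQ broker, threads stand in for workers on other servers, and the command strings (including the DAG name example_dag) are made up for the illustration.

```python
import queue
import threading

broker = queue.Queue()            # stand-in for the shared Redis/MQ queue
results = []                      # commands the workers have "executed"
results_lock = threading.Lock()
STOP = None                       # sentinel telling a worker to exit


def worker():
    """Consumer: take commands off the queue until the sentinel arrives."""
    while True:
        command = broker.get()
        if command is STOP:
            break
        with results_lock:
            results.append(command)   # a real worker would run the command


workers = [threading.Thread(target=worker) for _ in range(2)]
for w in workers:
    w.start()

# Producer: the executor side only enqueues commands and moves on.
for task_id in ("extract", "transform", "load"):
    broker.put("airflow run example_dag %s 2018-01-01" % task_id)

# One sentinel per worker, queued after all the real commands.
for _ in workers:
    broker.put(STOP)
for w in workers:
    w.join()
```

The producer never waits for a command to finish; it only needs the queue to be reachable, which is what lets the consumers live on other machines in the real setup.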
'''
To start the celery worker, run the command:
airflow worker
Configure Celery
'''
if configuration.has_option('celery', 'celery_config_options'):
    celery_configuration = import_string(
        configuration.get('celery', 'celery_config_options')
    )
else:
    celery_configuration = DEFAULT_CELERY_CONFIG

app = Celery(
    configuration.get('celery', 'CELERY_APP_NAME'),
    config_source=celery_configuration)
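@app.task
def execute_command(command):
    """
    Celery creates a Task object through the app.task decorator. A Task provides two core capabilities:
    sending the task message to a queue, and declaring the function a worker runs once it receives that message.
    command has the form: airflow run <dag_id> <task_id> <execution_date> --local --pool <pool> -sd <python_file>
    The @app.task decorator turns this function into a Celery task.
    subprocess.check_call is similar to subprocess.call, except that it returns 0 when the command succeeds and raises subprocess.CalledProcessError otherwise.
    CalledProcessError carries the attributes returncode, cmd and output:
    returncode is the child process's exit code, cmd is the command that was run, and output is None.
    A child process that exits abnormally therefore surfaces as an error here.
    """
    log = LoggingMixin().log
    log.info("Executing command in Celery: %s", command)
    try:
        # check_call checks the command's return code.
        # To process a task's result, use a callback or a similar mechanism to fetch it.
        # shell=True runs the command through the shell.
        # check_call(command, shell=True) runs the command on the local machine, for example:
        #     check_call(["ls", "-l"])
        subprocess.check_call(command, shell=True)
    except subprocess.CalledProcessError as e:
        log.error(e)
        raise AirflowException('Celery command failed')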
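The check_call behaviour described in the docstring above can be observed directly. The sketch below uses the current Python interpreter as the child command so that it does not depend on any particular shell; the exit status 3 is an arbitrary value picked for the example.

```python
import subprocess
import sys

# A command that succeeds: check_call returns its exit code, 0.
ok = subprocess.check_call([sys.executable, "-c", "pass"])

# A command that exits with status 3: check_call raises instead of returning.
try:
    subprocess.check_call([sys.executable, "-c", "import sys; sys.exit(3)"])
    error = None
except subprocess.CalledProcessError as exc:
    error = exc

# returncode is the child's exit status, cmd is what was run, output is None
# because check_call does not capture the child's output.
print(ok, error.returncode, error.cmd, error.output)
```

This is why execute_command only needs a try/except around check_call: any non-zero exit status of the airflow run command arrives as an exception.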
class CeleryExecutor(BaseExecutor):
    """
    CeleryExecutor is recommended for production use of Airflow. It allows
    distributing the execution of task instances to multiple worker nodes.
    Celery is a simple, flexible and reliable distributed system to process
    vast amounts of messages, while providing operations with the tools
    required to maintain such a system.
    """
    def start(self):
        self.tasks = {}
        self.last_state = {}

    # execute_command is the Celery task defined above.
    def execute_async(self, key, command,
                      queue=DEFAULT_CELERY_CONFIG['task_default_queue']):
        self.log.info("[celery] queuing {key} through celery, "
                      "queue={queue}".format(**locals()))
        # apply_async submits the command to the Celery cluster asynchronously;
        # the returned task handle is stored in self.tasks.
        self.tasks[key] = execute_command.apply_async(
            args=[command], queue=queue)
        self.last_state[key] = celery_states.PENDING

    # Synchronize task states and handle each state change.
    # The scheduler calls sync to poll the task handles; depending on the
    # state it finds, it calls success or fail to record the outcome.
    def sync(self):
        self.log.debug("Inquiring about %s celery task(s)", len(self.tasks))
        # The handle is named task rather than async, because async is a
        # reserved word from Python 3.7 on.
        for key, task in list(self.tasks.items()):
            try:
                state = task.state
                if self.last_state[key] != state:
                    if state == celery_states.SUCCESS:
                        self.success(key)
                        del self.tasks[key]
                        del self.last_state[key]
                    elif state == celery_states.FAILURE:
                        self.fail(key)
                        del self.tasks[key]
                        del self.last_state[key]
                    elif state == celery_states.REVOKED:
                        self.fail(key)
                        del self.tasks[key]
                        del self.last_state[key]
                    else:
                        self.log.info("Unexpected state: %s", task.state)
                        self.last_state[key] = task.state
            except Exception as e:
                self.log.error("Error syncing the celery executor, ignoring it:")
                self.log.exception(e)

    def end(self, synchronous=False):
        if synchronous:
            # Block until every submitted task has reached a final state.
            while any([
                    task.state not in celery_states.READY_STATES
                    for task in self.tasks.values()]):
                time.sleep(5)
        self.sync()
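The polling logic in sync can be exercised without a Celery cluster. In the sketch below, FakeResult is a hypothetical stand-in for the handle returned by apply_async, ToySyncExecutor and submit are names invented for the example, and the outcome is written straight into event_buffer instead of going through success() and fail().

```python
SUCCESS, FAILURE, REVOKED, PENDING, STARTED = (
    "SUCCESS", "FAILURE", "REVOKED", "PENDING", "STARTED")


class FakeResult:
    """Hypothetical stand-in for the handle returned by apply_async."""

    def __init__(self, state=PENDING):
        self.state = state


class ToySyncExecutor:
    def __init__(self):
        self.tasks = {}          # key -> task handle
        self.last_state = {}     # key -> state seen at the previous sync
        self.event_buffer = {}   # key -> final outcome

    def submit(self, key, handle):
        self.tasks[key] = handle
        self.last_state[key] = PENDING

    def sync(self):
        for key, handle in list(self.tasks.items()):
            state = handle.state
            if state == self.last_state[key]:
                continue                      # nothing changed since last poll
            if state in (SUCCESS, FAILURE, REVOKED):
                # Final state: record the outcome and forget the handle.
                self.event_buffer[key] = (
                    "success" if state == SUCCESS else "failed")
                del self.tasks[key]
                del self.last_state[key]
            else:
                self.last_state[key] = state  # e.g. PENDING -> STARTED


executor = ToySyncExecutor()
a, b, c = FakeResult(), FakeResult(), FakeResult()
for key, handle in (("a", a), ("b", b), ("c", c)):
    executor.submit(key, handle)

# Pretend the workers made progress between two polls.
a.state, b.state, c.state = SUCCESS, REVOKED, STARTED
executor.sync()
```

After the call, tasks a and b have left the tracking dictionaries (b counts as failed because it was revoked), while c stays tracked with its new intermediate state.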
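Starting the Celery worker
Run the command airflow worker. A worker here is a Celery worker process; each worker starts a number of child (daemon) processes according to its concurrency setting, so that tasks can run concurrently. When a Celery worker receives a message, it runs execute_command() for that task instance.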
# cli.py
def worker(args):
    env = os.environ.copy()
    env['AIRFLOW_HOME'] = settings.AIRFLOW_HOME
    # Celery worker
    from airflow.executors.celery_executor import app as celery_app
    from celery.bin import worker
    worker = worker.worker(app=celery_app)
    options = {
        'optimization': 'fair',
        'O': 'fair',
        'queues': args.queues,
        'concurrency': args.concurrency,
        'hostname': args.celery_hostname,
    }
    worker.run(**options)
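The airflow run command that was submitted to the Celery cluster above is interpreted and executed inside a worker child process by the run function in cli.py.
When airflow run is invoked, the airflow process starts, parses the command line, and dispatches to this run function.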
# cli.py: entry point for running a task on the worker side
def run(args, dag=None):
    task = dag.get_task(task_id=args.task_id)
    ti = TaskInstance(task, args.execution_date)
    ti.refresh_from_db()
    hostname = socket.getfqdn()
    log.info("Running on host %s", hostname)
    # --local starts a job of type LocalTaskJob; inside LocalTaskJob the raw
    # option is set, which in turn leads to _run_raw_task.
    if args.local:
        run_job = jobs.LocalTaskJob(
            task_instance=ti,
            mark_success=args.mark_success,
            pickle_id=args.pickle,
            ignore_all_deps=args.ignore_all_dependencies,
            ignore_depends_on_past=args.ignore_depends_on_past,
            ignore_task_deps=args.ignore_dependencies,
            ignore_ti_state=args.force,
            pool=args.pool)
        run_job.run()
    elif args.raw:
        ti._run_raw_task(
            mark_success=args.mark_success,
            job_id=args.job_id,
            pool=args.pool,
        )
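The two-layer dispatch between --local and --raw can be sketched as a toy command-line parser. Everything here is hypothetical: the program name toy-run, the trace list, and the direct re-entry into run() are assumptions made for the illustration. The code above only shows that the local branch hands over to LocalTaskJob, not how that job reaches the raw branch.

```python
import argparse

parser = argparse.ArgumentParser(prog="toy-run")
parser.add_argument("task_id")
parser.add_argument("--local", action="store_true")
parser.add_argument("--raw", action="store_true")

trace = []   # records which branch handled each invocation


def run(argv):
    args = parser.parse_args(argv)
    if args.local:
        # Supervising layer: in this sketch it simply re-enters run() with --raw.
        trace.append("local job for " + args.task_id)
        run([args.task_id, "--raw"])
    elif args.raw:
        # Innermost layer: where the task's own code would execute.
        trace.append("raw run of " + args.task_id)


run(["my_task", "--local"])
```

The point of the split is that the outer layer can supervise the task while the inner layer does nothing but run it.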