Prevent DAG backfill:
catchup=False
Limit concurrent DAG runs to 1:
max_active_runs=1
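Both settings are keyword arguments of the `DAG` constructor. The following is a minimal sketch, assuming Airflow 2.4.x parameter names. The `dag_id`, the start date, the schedule, and the `build_dag` helper are hypothetical, and the Airflow import is deferred so the snippet loads even where Airflow is not installed.

```python
from datetime import datetime

# Keyword arguments for airflow.DAG; the values are illustrative only.
DAG_KWARGS = {
    "dag_id": "example_no_backfill",      # hypothetical DAG name
    "start_date": datetime(2020, 8, 1),   # hypothetical start date
    # Still accepted in 2.4.x, where `schedule` is the newer spelling.
    "schedule_interval": "*/10 * * * *",
    "catchup": False,                     # do not create runs for missed intervals
    "max_active_runs": 1,                 # at most one DAG run active at a time
}


def build_dag():
    """Create the DAG object; requires apache-airflow to be installed."""
    from airflow import DAG  # deferred import (assumption: Airflow is available)

    return DAG(**DAG_KWARGS)
```

In an actual DAG file the result would be assigned at module level, for example `dag = build_dag()`, so that the scheduler can discover it.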
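schedule_interval sets the DAG's schedule. Its cron syntax is not identical to Linux cron: besides the standard five fields, a sixth field for seconds may be appended.
Airflow cron expression: * * * * * * (minute, hour, day of month, month, day of week, second)
dag_interval = "* * * * * */30" # every 30 seconds
dag_interval = "*/10 * * * *" # every 10 minutes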
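To make the step syntax in these expressions concrete, here is a minimal sketch in plain Python that expands a single cron field into the values it matches. It is not Airflow's parser (Airflow delegates cron handling to the croniter library); `expand_field` is a hypothetical helper that only covers `*`, `*/n`, ranges, and comma lists.

```python
from typing import List


def expand_field(field: str, lo: int, hi: int) -> List[int]:
    """Expand one cron field into the sorted values it matches within [lo, hi]."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            a, b = base.split("-")
            start, end = int(a), int(b)
        else:
            start = int(base)
            # "5/10" means "from 5 to the maximum, every 10"; a bare "5" is just 5.
            end = hi if step else start
        values.update(range(start, end + 1, step_n))
    return sorted(values)


minutes = expand_field("*/10", 0, 59)  # minute field of "*/10 * * * *"
seconds = expand_field("*/30", 0, 59)  # trailing seconds field of "* * * * * */30"
print(minutes)  # [0, 10, 20, 30, 40, 50]
print(seconds)  # [0, 30]
```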
Testing
airflow tasks test DAG_name task_name 2020-08-01
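Debugging a DAG in PyCharm
Change the dags directory configured under ~/airflow (the dags_folder setting in airflow.cfg) so that it points at the directory containing the DAG files.
Before starting the Airflow webserver, scheduler, or Celery worker, remove any module-level Class().run() call from the DAG file. Airflow re-parses DAG files periodically and executes their top-level code each time, so the call would otherwise run repeatedly (roughly every one to two minutes).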
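One way to keep a direct `Class().run()` call available for PyCharm debugging without deleting it each time is a `__main__` guard. The following is a minimal sketch, assuming the DAG file defines a class with a `run()` method; `Job` is a hypothetical stand-in for that class, and the approach relies on Airflow importing DAG files as modules rather than executing them as scripts.

```python
class Job:
    """Hypothetical stand-in for the class whose run() was called at module level."""

    def run(self) -> str:
        # Real work would go here; a marker value is returned for illustration.
        return "ran"


# Executed only when the file is run directly (for example from PyCharm).
# When the file is imported for parsing, __name__ is not "__main__",
# so run() is skipped.
if __name__ == "__main__":
    Job().run()
```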
# Airflow needs a home. `~/airflow` is the default, but you can put it
# somewhere else if you prefer (optional)
export AIRFLOW_HOME=~/airflow
#export AIRFLOW_HOME=/Users/wiliam/PycharmProjects/YouMi/ultraman/scheduler
# Install Airflow using the constraints file
AIRFLOW_VERSION=2.4.1
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
# For example: 3.7
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
# For example: https://raw.githubusercontent.com/apache/airflow/constraints-2.4.1/constraints-3.7.txt
pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
# The Standalone command will initialise the database, make a user,
# and start all components for you.
airflow standalone
# Visit localhost:8080 in the browser and use the admin account details
# shown on the terminal to login.
# Enable the example_bash_operator dag in the home page