MySQL Installation
For MySQL installation, see my other post:
https://blog.csdn.net/ciqingloveless/article/details/82883633
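Create the Airflow database in MySQL beforehand (the database name must be airflow; otherwise several database connection settings in the configuration file need to be changed).
Create the airflow user (the username must also be airflow; otherwise change the database connection in the configuration file accordingly).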
create user 'airflow'@'%' identified by 'airflow';
grant all privileges on `airflow`.* to 'airflow'@'%';
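The two statements above assume the airflow database already exists. A minimal sketch of the whole bootstrap follows, assuming a root login to a local MySQL server. The helper name airflow_bootstrap_sql and the utf8 character set are my assumptions, not part of the original setup.

```shell
# Hypothetical helper: prints the SQL needed before Airflow can use MySQL.
# Database name, user name, and password follow the text above;
# the utf8 character set is an assumption.
airflow_bootstrap_sql() {
    cat <<'SQL'
CREATE DATABASE IF NOT EXISTS `airflow` CHARACTER SET utf8;
CREATE USER 'airflow'@'%' IDENTIFIED BY 'airflow';
GRANT ALL PRIVILEGES ON `airflow`.* TO 'airflow'@'%';
SQL
}

# Usage (prompts for the MySQL root password):
#   airflow_bootstrap_sql | mysql -u root -p
```

Printing the SQL first makes it easy to review before piping it into the mysql client.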
Python 3 Installation
See the following post:
https://blog.csdn.net/ciqingloveless/article/details/88640377
Airflow Installation
Set the environment variables
export AIRFLOW_HOME=/app/airflow
export AIRFLOW_GPL_UNIDECODE=yes
Install Airflow with pip
# Install the required dependencies to avoid Error 1 (see Troubleshooting)
yum -y install cyrus-sasl cyrus-sasl-devel cyrus-sasl-lib
pip3 install 'apache-airflow[all]'
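Store the Airflow metadata in MySQL
SQLite is a lightweight database: in production, backup, recovery, and querying are inconvenient, and there are security concerns. The metadata store is therefore switched to MySQL.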
/app/python/python-3.7.0/bin/airflow
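Run the command above. Because the environment variables were set earlier, this generates airflow.cfg under /app/airflow. Running the command this way reports an error, which can be ignored.
Edit airflow.cfg and change the following two settings, replacing username, password, and dbname with the MySQL values created earlier (all three are airflow in this setup)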
sql_alchemy_conn = mysql://username:password@127.0.0.1:3306/dbname
executor = LocalExecutor
Then copy the file to /app/airflow, overwriting the original.
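If you prefer to edit the generated file in place rather than copy it around, the two settings can be changed from the shell. This is only a sketch: set_airflow_option is a hypothetical helper, and it assumes each key appears once at the start of a line, as in a freshly generated airflow.cfg.

```shell
# Hypothetical helper: set_airflow_option FILE KEY VALUE
# Rewrites the "KEY = ..." line in FILE. VALUE must not contain '|', '&',
# or backslashes, because it is spliced into a sed replacement.
set_airflow_option() {
    cfg_file=$1; cfg_key=$2; cfg_value=$3
    sed "s|^${cfg_key} *=.*|${cfg_key} = ${cfg_value}|" "$cfg_file" > "${cfg_file}.tmp" &&
        mv "${cfg_file}.tmp" "$cfg_file"
}

# Usage, with the database, user, and password created earlier:
#   set_airflow_option /app/airflow/airflow.cfg sql_alchemy_conn \
#       'mysql://airflow:airflow@127.0.0.1:3306/airflow'
#   set_airflow_option /app/airflow/airflow.cfg executor LocalExecutor
```

Writing to a temporary file and renaming it avoids relying on sed's in-place flag, which differs between platforms.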
Run
/app/python/python-3.7.0/bin/airflow initdb
Start Airflow
/app/python/python-3.7.0/bin/airflow webserver -p 8080
/app/python/python-3.7.0/bin/airflow scheduler
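Each of the two commands above stays in the foreground, so running them as written needs two terminals. A rough sketch of starting both in the background follows. start_airflow and the log file names are hypothetical, and the log directory in the usage line is just an assumption.

```shell
# Path taken from the commands above; override AIRFLOW_BIN if yours differs.
AIRFLOW_BIN=${AIRFLOW_BIN:-/app/python/python-3.7.0/bin/airflow}

# Hypothetical helper: start_airflow LOG_DIR
# Launches the webserver and the scheduler in the background,
# each with its own log file under LOG_DIR.
start_airflow() {
    log_dir=$1
    mkdir -p "$log_dir" || return 1
    nohup "$AIRFLOW_BIN" webserver -p 8080 > "$log_dir/webserver.log" 2>&1 &
    nohup "$AIRFLOW_BIN" scheduler > "$log_dir/scheduler.log" 2>&1 &
}

# Usage:
#   start_airflow /app/airflow/logs
```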
Troubleshooting
Error 1:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
Command "/app/python/python-3.7.0/bin/python3.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-hlzdt689/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-vpkgqvq8/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-hlzdt689/sasl/
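The compiler error above means the Cyrus SASL development headers are missing, which is what the yum command in the installation step provides. A small sketch for checking this before running pip follows. has_sasl_header is a hypothetical helper, and /usr/include as the default header location is an assumption about a standard CentOS layout.

```shell
# Hypothetical helper: has_sasl_header [INCLUDE_DIR]
# Succeeds if sasl/sasl.h exists under INCLUDE_DIR (default: /usr/include).
has_sasl_header() {
    [ -f "${1:-/usr/include}/sasl/sasl.h" ]
}

# Usage:
#   has_sasl_header || yum -y install cyrus-sasl cyrus-sasl-devel cyrus-sasl-lib
```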
Error 2:
ERROR [airflow.models.DagBag] Failed to import: /app/python/python-3.7.0/lib/python3.7/site-packages/airflow/example_dags/example_http_operator.py
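I did not look into the cause in detail, but judging by its name the failing file is only an example, so deleting it resolves the error. Anyone interested can fix the code instead; I did not think it was worth the effort.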
rm -rf /app/python/python-3.7.0/lib/python3.7/site-packages/airflow/example_dags/example_http_operator.py
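A possible alternative to deleting the file is to stop Airflow from loading its bundled examples at all, through the load_examples option in airflow.cfg. This is a sketch I have not verified against this exact error. disable_example_dags is a hypothetical helper, and it assumes the option sits at the start of a line.

```shell
# Hypothetical helper: disable_example_dags CFG_FILE
# Sets load_examples = False so Airflow skips its bundled example DAGs.
disable_example_dags() {
    cfg_file=$1
    sed 's|^load_examples *=.*|load_examples = False|' "$cfg_file" > "${cfg_file}.tmp" &&
        mv "${cfg_file}.tmp" "$cfg_file"
}

# Usage:
#   disable_example_dags /app/airflow/airflow.cfg
```

This leaves the installed package untouched, so a later reinstall or upgrade does not bring the problem back.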
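Error 3:
The following error appears when starting the Airflow web server, because the gunicorn command cannot be found. The way I installed Python does not place its executables in /usr/bin, to avoid interfering with the system (many CentOS commands are Python scripts, so overwriting the system Python may break some system functions; for the same reason, overwriting the original python command is not recommended).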
FileNotFoundError: [Errno 2] No such file or directory: 'gunicorn': 'gunicorn'
Run the following command
find / -name "gunicorn"
Once the executable is found, symlink it into /usr/bin
ln -s /app/python/python-3.7.0/bin/gunicorn /usr/bin/gunicorn
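A slightly more defensive version of the symlink step is sketched below. link_tool is a hypothetical helper that refuses to replace anything already present in the target directory, in keeping with the concern above about not disturbing system commands.

```shell
# Hypothetical helper: link_tool SOURCE [BIN_DIR]
# Symlinks SOURCE into BIN_DIR (default: /usr/bin) unless that name is already taken.
link_tool() {
    tool_src=$1
    tool_dst="${2:-/usr/bin}/$(basename "$tool_src")"
    if [ ! -x "$tool_src" ]; then
        echo "not an executable: $tool_src" >&2
        return 1
    fi
    if [ -e "$tool_dst" ] || [ -L "$tool_dst" ]; then
        echo "already present, leaving untouched: $tool_dst" >&2
        return 0
    fi
    ln -s "$tool_src" "$tool_dst"
}

# Usage:
#   link_tool /app/python/python-3.7.0/bin/gunicorn
```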
Error 4:
[2019-03-19 23:33:21,975] {__init__.py:51} INFO - Using executor SequentialExecutor
[2019-03-19 23:33:22,576] {cli_action_loggers.py:69} ERROR - Failed on pre-execution callback using <function default_action_log at 0x7fdddf73e598>
Traceback (most recent call last):
File "/app/python/python-3.7.0/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
cursor, statement, parameters, context
File "/app/python/python-3.7.0/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: log
Traceback (most recent call last):
File "/app/python/python-3.7.0/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
cursor, statement, parameters, context
File "/app/python/python-3.7.0/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: job
This is caused by the environment variables, because
export AIRFLOW_HOME=/app/airflow
export AIRFLOW_GPL_UNIDECODE=yes
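set this way, the two variables are only temporary (they last for the current shell session). To fix the problem, add both export lines to ~/.bash_profile so that they persist, then reload it: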
cd ~/
vi .bash_profile
source .bash_profile
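The manual edit can also be scripted. A minimal sketch follows, assuming bash with ~/.bash_profile as the login profile. persist_export is a hypothetical helper that appends an export line only if the variable is not already exported in that file.

```shell
# Hypothetical helper: persist_export PROFILE NAME VALUE
# Appends "export NAME=VALUE" to PROFILE unless NAME is already exported there.
persist_export() {
    profile=$1; var_name=$2; var_value=$3
    touch "$profile" || return 1
    if ! grep -q "^export ${var_name}=" "$profile"; then
        printf 'export %s=%s\n' "$var_name" "$var_value" >> "$profile"
    fi
}

# Usage:
#   persist_export ~/.bash_profile AIRFLOW_HOME /app/airflow
#   persist_export ~/.bash_profile AIRFLOW_GPL_UNIDECODE yes
#   source ~/.bash_profile
```

Because the helper checks before appending, running it more than once does not duplicate lines.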