Environment
Ubuntu 16.04
Python 3.5 and Python 3.6
apt-get install python3-pip
pip3 install pipenv
pipenv install flask==1.0
1. Required Python modules
import ssl
import sqlite3
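A quick way to confirm that the interpreter was built with both modules is to try importing them. This is only a minimal sketch; the helper name missing_modules is made up for illustration and is not part of Airflow.

```python
import importlib


def missing_modules(names):
    """Return the entries of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # An empty list means both modules are available.
    print(missing_modules(["ssl", "sqlite3"]))
```

If either name is printed, the Python build is missing that module and it should be fixed before installing Airflow.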
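2. Install Airflow
Do not use the pip install apache-airflow[all] command often recommended online; it pulls in far too many dependencies.
pipenv install apache-airflow==1.9 (1.10 has been released since)
However, you then need to install these extras separately: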
pipenv install apache-airflow[crypto]
pipenv install apache-airflow[mysql]
If you hit a mysql-config error:
sudo apt-get install libmysqlclient-dev python3-dev
3. If you hit a gcc error
Install the dev package that matches your Python version, for example:
apt-get install python3.5-dev
4. Add AIRFLOW_HOME to the environment variables
echo "export AIRFLOW_HOME=~/airflow" >> ~/.bashrc
source ~/.bashrc
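To check which home directory will be used after sourcing ~/.bashrc, the lookup can be sketched in Python. The function airflow_home below is a hypothetical helper, not an Airflow API; it assumes the usual fallback of ~/airflow when the variable is unset.

```python
import os


def airflow_home(environ=None):
    """Return AIRFLOW_HOME if set, otherwise ~/airflow, with ~ expanded."""
    environ = os.environ if environ is None else environ
    return os.path.expanduser(environ.get("AIRFLOW_HOME", "~/airflow"))


if __name__ == "__main__":
    print(airflow_home())
```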
5. MySQL database setup
mysql -u{user_name} -p{pwd}
create database airflowdb;
6. If pip3 install pymssql fails
pip install setuptools_git
pip download pymssql
tar -zxvf pymssql-2.1.3.tar.gz
cd pymssql-2.1.3
export PYMSSQL_BUILD_WITH_BUNDLED_FREETDS=1
python setup.py install
7. Configure Airflow to use the MySQL database airflowdb
vim ~/airflow/airflow.cfg
executor = LocalExecutor
sql_alchemy_conn = mysql://root:admin@localhost:3306/airflowdb
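If the MySQL password contains characters such as @ or /, the connection URL has to be percent-encoded or it will be parsed incorrectly. The sketch below builds the value for sql_alchemy_conn from its parts; mysql_conn_url is a hypothetical helper name, and the credentials are the sample ones from the line above.

```python
from urllib.parse import quote_plus


def mysql_conn_url(user, password, host, port, database):
    """Build a mysql:// URL with the user and password percent-encoded."""
    return "mysql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


# prints mysql://root:admin@localhost:3306/airflowdb
print(mysql_conn_url("root", "admin", "localhost", 3306, "airflowdb"))
```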
8. If airflow webserver reports a duplicate process ID, kill that process:
kill -9 PID
9. Delete a DAG
In MySQL, remove the DAG's row from:
database: airflowdb
table: dag
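The delete itself is a single statement keyed on dag_id. The sketch below runs it against an in-memory SQLite table standing in for airflowdb, so it is an illustration only: the one-column schema and the helper delete_dag_row are assumptions, and with a MySQL driver the placeholder would be %s rather than ?.

```python
import sqlite3


def delete_dag_row(conn, dag_id):
    """Delete the row for `dag_id` from the dag table; return rows removed."""
    cur = conn.execute("DELETE FROM dag WHERE dag_id = ?", (dag_id,))
    conn.commit()
    return cur.rowcount


# In-memory SQLite stands in for airflowdb; the real dag table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO dag VALUES (?)", [("example_a",), ("example_b",)])

removed = delete_dag_row(conn, "example_a")
remaining = [row[0] for row in conn.execute("SELECT dag_id FROM dag ORDER BY dag_id")]
print(removed, remaining)
```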
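10. Add a user and enable the login page
Add the authentication settings on the first lines under [webserver]; placing them below the default authenticate entry has no effect.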
authenticate = True
auth_backend = airflow.contrib.auth.backends.password_auth
cd ~/airflow
pipenv shell
python3
>>> import airflow
>>> from airflow import models, settings
>>> from airflow.contrib.auth.backends.password_auth import PasswordUser
>>> user = PasswordUser(models.User())
>>> user.username = 'username'
>>> user.email = 'email'
>>> user.password = 'pwd'
>>> session = settings.Session()
>>> session.add(user)
>>> session.commit()
>>> session.close()
>>> exit()
pipenv shell
pipenv install apache-airflow[mysql]
airflow initdb
11. Once everything is installed, run these commands in order
airflow initdb
airflow scheduler
airflow webserver
12. Change the Airflow time zone
1. Edit airflow.cfg in the Airflow home directory and set the time zone to Shanghai
default_timezone = Asia/Shanghai
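2. Locate the airflow package inside the virtual environment. Virtual environments created by pipenv and mkvirtualenv are both under .virtualenvs.
Change into the directory where the airflow package is installed, that is, site-packages. All file paths below are relative to it.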
cd /root/.virtualenvs/af/lib/python3.5/site-packages/
3. Edit airflow/utils/timezone.py
from airflow import configuration as conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass
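The branches of this patch can be tried outside Airflow with only the standard library. The sketch below is an approximation under stated assumptions: pick_timezone is a hypothetical helper, configparser stands in for airflow.configuration, and the returned tuples merely label which pendulum call the patch would make.

```python
import configparser


def pick_timezone(cfg_text):
    """Mirror the patch: 'system' means local time, any other value is a zone
    name, and a missing or unreadable setting keeps the existing UTC default."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(cfg_text)
        tz = parser.get("core", "default_timezone")
    except configparser.Error:
        return ("utc", None)
    if tz == "system":
        return ("local", None)
    return ("named", tz)


# prints ('named', 'Asia/Shanghai')
print(pick_timezone("[core]\ndefault_timezone = Asia/Shanghai\n"))
```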
4. Edit the utcnow() function (at line 69)
Original: d = dt.datetime.utcnow()
Change to: d = dt.datetime.now()
5. Edit airflow/utils/sqlalchemy.py
from airflow import configuration as conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass
6. Edit airflow/www/templates/admin/master.html (line 31)
Replace: var UTCseconds = (x.getTime() + x.getTimezoneOffset()*60*1000);
With: var UTCseconds = x.getTime();
Replace: "timeFormat":"H:i:s %UTC%",
With: "timeFormat":"H:i:s",
An intuitive interface