Installing Airflow: Online Installation, Offline Installation, and Common Problems

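Preface

Zone xx needs to use Airflow but has no access to the Internet. Airflow therefore has to be installed and debugged on a machine in the Internet-facing zone first, and the modules and other programs it depends on have to be downloaded and packaged there so that the machines in zone xx can install them offline.
The environment and version information is as follows: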

[flowuser@VM_0_16_centos ~]$ uname -a
Linux VM_0_16_centos 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[flowuser@VM_0_16_centos ~]$ cat /etc/redhat-release 
CentOS Linux release 7.2.1511 (Core) 
[flowuser@VM_0_16_centos ~]$ pip -V
pip 19.0.1 from /data/flowuser/python/lib/python3.6/site-packages/pip (python 3.6)
[flowuser@VM_0_16_centos ~]$ python
Python 3.6.5 (default, Jan 29 2019, 11:10:11) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import airflow
>>> airflow.__version__
'1.10.2'
>>> 

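The ideal installation process

Ideally, a handful of commands would be enough to install Python and Airflow:

#------------ install Python 3.6.5 ----------------#
./configure --prefix=/data/flowuser/python
make && make install

# Set the environment variables. Note that $PYTHONPATH must come before $PATH; otherwise the python found in the other $PATH directories takes precedence.
# Remember to reload the profile afterwards: . $HOME/.bash_profile
PYTHONPATH=/data/flowuser/python/bin:/data/flowuser/python/lib/python3.6/site-packages
PATH=$PYTHONPATH:$PATH:$HOME/.local/bin:$HOME/bin

#---------- install Airflow ----------------#
# Set two environment variables first
export AIRFLOW_HOME=~/airflow
export AIRFLOW_GPL_UNIDECODE=yes

##### Online installation of Airflow
pip install apache-airflow[all]
# Initialize the DB
airflow initdb

###### Offline installation of Airflow
# First download the modules Airflow needs; assume they are downloaded into directory A
pip download apache-airflow[all]
# Then, inside directory A, run
pip install apache-airflow[all] --no-index -f ./
# Then initialize the DB
airflow initdb

# Start it and take a look
airflow webserver -p 8080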

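Installation summary

I cannot say whether the same problems show up on every machine with this particular environment (the same OS version and the same Python/Airflow versions as mine). If they were not a coincidence, the following procedure can be considered the "best practice" for this environment: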

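#------------- install the required components; note that yum must be run as root ----------------#
# Install GCC
yum install gcc

# Install sqlite-devel. Do this up front, otherwise Python has to be rebuilt later
yum install sqlite-devel

# Install openssl-devel, otherwise pip reports that the ssl module is not available
yum install openssl-devel -y

# Install gcc-c++, otherwise the build reports that cc1plus does not exist
yum install gcc-c++

# Install the package that provides mysql_config, otherwise you get "OSError: mysql_config not found"
yum install mysql-devel gcc gcc-devel python-devel

# Install the SASL components, otherwise the build reports that sasl/sasl.h does not exist
yum install cyrus-sasl-lib.x86_64
yum install cyrus-sasl-devel.x86_64
yum install libgsasl-devel.x86_64
#yum install saslwrapper-devel.x86_64 # this one does not seem to be necessary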

#------------ Python installation -----------------------------------------#
# Get the Python source package
wget https://www.python.org/ftp/python/3.6.5/Python-3.6.5.tgz

# Build and install the zlib bundled with the Python source (as root): go to Modules/zlib under the Python source directory
cd <python-source-dir>/Modules/zlib/
./configure
make install

# Build Python
cd <python-source-dir>/
./configure --prefix=/data/flowuser/python --enable-shared --with-ssl
make
make install


# Add a symbolic link
ln -s /data/flowuser/python/bin/python3 /data/flowuser/python/bin/python
# Add the Python-related environment variables
# Note: put $PYTHONPATH at the front of PATH so that PATH does not find python somewhere else first (if other versions coexist on the OS)
PYTHONPATH=/data/flowuser/python/bin:/data/flowuser/python/lib/python3.6/site-packages
PATH=$PYTHONPATH:$PATH:$HOME/.local/bin:$HOME/bin
# Add this variable, otherwise libpython3.6m.so.1.0 cannot be found
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/flowuser/python/lib

# Upgrade pip, otherwise the Airflow download fails under the old pip version
pip3 install --upgrade pip

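#------------------- download the Airflow modules -------------------#
# Configure the environment variables Airflow needs and remember to load them.
# They must be set before the download, otherwise pip fails with the GPL dependency (unidecode) error.
export AIRFLOW_HOME=~/airflow
export AIRFLOW_GPL_UNIDECODE=yes

# Download all Python modules that Airflow needs
pip3 download apache-airflow[all]

# Install Airflow, from within the download directory
pip3 install apache-airflow[all] --no-index -f ./

# Initialize the database
airflow initdb

# Try to start it
airflow webserver -p <port>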

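The actual installation process

The actual installation was a bumpy ride, but it succeeded in the end. The complete process is recorded below.
(Step: a step in the main installation sequence; error: an error that occurred; solution: the fix for an error; retry step: re-running the commands of an earlier step.)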

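Installing Python

Step 1: Get Python

wget https://www.python.org/ftp/python/3.6.5/Python-3.6.5.tgz

Step 2: Configure the Python build

Because the machine hosts several applications, and to make later migration easier, Python cannot be installed in the default directory. The target directory therefore has to be specified when configuring the build:

# First mkdir $HOME/python, then cd into the unpacked source directory and configure:
./configure --prefix=/data/flowuser/python
Error 1: no acceptable C compiler found in $PATH
configure: error: in `/data/flowuser/sourcepkg/Python-3.6.5':
configure: error: no acceptable C compiler found in $PATH
see `config.log' for more details
Solution to error 1: install gcc
# Install gcc as root
yum install gcc
Retry step 2: ./configure now completes successfully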

Step 3: Install Python

# Switch back to the application user, cd into the unpacked source directory, then run:
make
make install
Error 2: zipimport.ZipImportError: can't decompress data; zlib not available

make install fails with:

Traceback (most recent call last):
File "/data/flowuser/sourcepkg/Python-3.6.5/Lib/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/data/flowuser/sourcepkg/Python-3.6.5/Lib/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/data/flowuser/sourcepkg/Python-3.6.5/Lib/ensurepip/__main__.py", line 5, in <module>
sys.exit(ensurepip._main())
File "/data/flowuser/sourcepkg/Python-3.6.5/Lib/ensurepip/__init__.py", line 204, in _main
default_pip=args.default_pip,
File "/data/flowuser/sourcepkg/Python-3.6.5/Lib/ensurepip/__init__.py", line 117, in _bootstrap
return _run_pip(args + [p[0] for p in _PROJECTS], additional_paths)
File "/data/flowuser/sourcepkg/Python-3.6.5/Lib/ensurepip/__init__.py", line 27, in _run_pip
import pip
zipimport.ZipImportError: can't decompress data; zlib not available
make: *** [install] Error 1

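Solution to error 2: build and install zlib
# Note: this may require root privileges
# From the unpacked Python source directory, go to Modules/zlib
cd ./Modules/zlib
# and run the following in that directory:
./configure
make install
Retry step 3

When re-running step 3, make install alone is enough.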


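Step 4: Set the environment variables

Python was not installed in the default directory this time, so the environment variables need attention.
The installation prefix is --prefix=/data/flowuser/python, so a few things have to be set up:

a. Create a symbolic link in the installation directory:

ln -s /data/flowuser/python/bin/python3 /data/flowuser/python/bin/python

b. Add the environment variables to .bash_profile

# Note: put $PYTHONPATH at the front of PATH so that PATH does not find python somewhere else first (if other versions coexist on the OS)
PYTHONPATH=/data/flowuser/python/bin:/data/flowuser/python/lib/python3.6/site-packages
PATH=$PYTHONPATH:$PATH:$HOME/.local/bin:$HOME/bin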
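
Since the whole point of the PATH ordering is that the new interpreter wins over any other python on the machine, it can be worth confirming which executable actually resolves first. The following is only a minimal sketch: the helper names resolve_on_path and resolves_under are made up for illustration, and the prefix /data/flowuser/python is the one used in this article.

```python
import os
import shutil


def resolve_on_path(command, path_value):
    """Return the first executable named `command` found in path_value, or None."""
    return shutil.which(command, path=path_value)


def resolves_under(command, expected_prefix, path_value):
    """True if `command` is found and its first match lives under expected_prefix."""
    found = resolve_on_path(command, path_value)
    if found is None:
        return False
    # Resolve symlinks (python -> python3 -> python3.6) before comparing.
    found = os.path.realpath(found)
    prefix = os.path.realpath(expected_prefix)
    return os.path.commonpath([found, prefix]) == prefix


if __name__ == "__main__":
    prefix = "/data/flowuser/python"  # install prefix used in this article
    current_path = os.environ.get("PATH", "")
    print("python resolves to:", resolve_on_path("python", current_path))
    print("under expected prefix:", resolves_under("python", prefix, current_path))
```

If the second line prints False, another python earlier in PATH is shadowing the new installation.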

(At this point the Python installation appears to be finished, and Airflow comes next. In fact, Python will have to be reinstalled shortly!)


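Installing Airflow

As described in the preface, the real purpose of this installation is to collect the modules Airflow depends on so that Airflow can be installed offline in zone xx. Airflow is therefore installed offline here as well.

Step A: Download Airflow

# Download all Python modules that Airflow needs
pip3 download apache-airflow[all]
Error A: however the ssl module in Python is not available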

pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.

Solution to error A: configure Python with --enable-shared
# Reinstall Python, adding --enable-shared when configuring
./configure --prefix=/data/flowuser/python --enable-shared
make && make install
Retry step A
Error A-1: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory

/data/flowuser/python/bin/python3.6: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory.

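Solution to error A-1: add the LD_LIBRARY_PATH environment variable

It is best to set it in the .bash_profile file in the user's home directory and then source the file:
vim ~/.bash_profile
source ~/.bash_profile

# Add the following environment variable and remember to load it
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/flowuser/python/lib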
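
Forgetting to reload the profile is an easy mistake here, so a quick check that the library directory really is on LD_LIBRARY_PATH can save a retry. This is a small sketch only: ld_library_path_has is a hypothetical helper, and because the new interpreter cannot start without its shared library, the script is meant to be run with any other Python 3 on the machine.

```python
import os


def ld_library_path_has(directory, ld_library_path):
    """True if `directory` is one of the entries of an LD_LIBRARY_PATH-style string."""
    wanted = os.path.normpath(directory)
    # Skip empty entries, which appear when the variable was empty before the export.
    entries = [entry for entry in ld_library_path.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    lib_dir = "/data/flowuser/python/lib"  # lib directory of the prefix used in this article
    value = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH contains", lib_dir, ":", ld_library_path_has(lib_dir, value))
```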
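Retry step A: another error
Error A: exactly the same message as error A (the ssl module is still not available).
Solution to error A: install openssl-devel
# Check whether openssl-devel is installed
rpm -aq|grep openssl 
  openssl-libs-1.0.2k-8.el7.x86_64
  openssl-1.0.2k-8.el7.x86_64
# openssl-devel is missing from the list above, so install it (root required)
yum install openssl-devel -y 
# Check again
rpm -aq|grep openssl 
  openssl-devel-1.0.2k-16.el7.x86_64
  openssl-1.0.2k-16.el7.x86_64
  openssl-libs-1.0.2k-16.el7.x86_64
Retry step A: yet another error
Error A2: exactly the same message as error A.
Solution to error A2: reinstall Python, adding --with-ssl when configuring
# Reinstalled Python once more, this time with --with-ssl
# (Python 3.6's configure does not seem to define a --with-ssl option and only warns about unrecognized --with-* options, so the fix most likely came from rebuilding after openssl-devel was installed.)
./configure --prefix=/data/flowuser/python --enable-shared --with-ssl
make
make install
Retry step A: another error
Error A3: the pip version is too old.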

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-r_34wkyl/apache-airflow/
You are using pip version 9.0.3, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Solution to error A3: upgrade pip
pip3 install --upgrade pip
Retry step A: and yet another error
Error A4: RuntimeError: By default one of Airflow's dependencies installs a GPL dependency (unidecode)

File was already downloaded /data/flowuser/sourcepkg/getairflow/apache-airflow-1.10.2.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-download-b3963a08/apache-airflow/setup.py", line 429, in <module>
do_setup()
File "/tmp/pip-download-b3963a08/apache-airflow/setup.py", line 287, in do_setup
verify_gpl_dependency()
File "/tmp/pip-download-b3963a08/apache-airflow/setup.py", line 53, in verify_gpl_dependency
raise RuntimeError("By default one of Airflow's dependencies installs a GPL "
RuntimeError: By default one of Airflow's dependencies installs a GPL dependency (unidecode). To avoid this dependency set SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when you install or upgrade Airflow. To force installing the GPL version set AIRFLOW_GPL_UNIDECODE

Solution to error A4: set the Airflow environment variables
# Add these environment variables and remember to load them
export AIRFLOW_HOME=~/airflow
export AIRFLOW_GPL_UNIDECODE=yes
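
The error message names two variables, either of which satisfies the check: SLUGIFY_USES_TEXT_UNIDECODE=yes avoids the GPL dependency, while AIRFLOW_GPL_UNIDECODE forces it. The sketch below is a pre-flight check written from that message alone. It is not copied from Airflow's setup.py, and unidecode_choice_made is a hypothetical name.

```python
import os


def unidecode_choice_made(env):
    """True if one of the two variables named in the error message is set.

    `env` is any mapping of environment variable names to values,
    for example os.environ.
    """
    force_gpl = bool(env.get("AIRFLOW_GPL_UNIDECODE"))            # any non-empty value
    avoid_gpl = env.get("SLUGIFY_USES_TEXT_UNIDECODE") == "yes"   # must be exactly "yes"
    return force_gpl or avoid_gpl


if __name__ == "__main__":
    if unidecode_choice_made(os.environ):
        print("OK: a unidecode choice is set; pip download should pass this check")
    else:
        print("Set AIRFLOW_GPL_UNIDECODE=yes or SLUGIFY_USES_TEXT_UNIDECODE=yes first")
```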
Retry step A: one more error
Error A5: OSError: mysql_config not found

Saved ./mysqlclient-1.4.1.tar.gz
Complete output from command python setup.py egg_info:
/bin/sh: mysql_config: command not found
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-download-b1d3gupy/mysqlclient/setup.py", line 16, in <module>
metadata, options = get_config()
File "/tmp/pip-download-b1d3gupy/mysqlclient/setup_posix.py", line 51, in get_config
libs = mysql_config("libs")
File "/tmp/pip-download-b1d3gupy/mysqlclient/setup_posix.py", line 29, in mysql_config
raise EnvironmentError("%s not found" % (_mysql_config_path,))
OSError: mysql_config not found

Solution to error A5: install the MySQL components missing from the OS
# Run as root
yum install mysql-devel gcc gcc-devel python-devel

Retry step A: at last, the Airflow dependencies download successfully!
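
Before copying the download directory to the offline zone, it can help to list what was actually downloaded, for instance to confirm that a specific version of a package is present. The sketch below parses wheel and sdist file names into name/version pairs. The function names are hypothetical, and the sdist parsing is a simplification that assumes the version is the last hyphen-separated field.

```python
import os
import re

SDIST_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def parse_dist_filename(filename):
    """Return (name, version) for a wheel or sdist file name, or None if unrecognized."""
    if filename.endswith(".whl"):
        # Wheel names look like name-version(-build)-python-abi-platform.whl
        parts = filename[:-len(".whl")].split("-")
        return (parts[0], parts[1]) if len(parts) >= 5 else None
    for suffix in SDIST_SUFFIXES:
        if filename.endswith(suffix):
            name, sep, version = filename[:-len(suffix)].rpartition("-")
            return (name, version) if sep and name else None
    return None


def inventory(directory):
    """Map normalized package names to the list of versions found in `directory`."""
    found = {}
    for filename in sorted(os.listdir(directory)):
        parsed = parse_dist_filename(filename)
        if parsed is None:
            continue
        name, version = parsed
        key = re.sub(r"[-_.]+", "-", name).lower()  # treat -, _ and . as equivalent
        found.setdefault(key, []).append(version)
    return found


if __name__ == "__main__":
    for name, versions in sorted(inventory(".").items()):
        print(name, ", ".join(versions))
```

Run from inside the download directory, this prints one line per package, which makes duplicate or unexpected versions easy to spot.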

Step B: Install Airflow

# In the Airflow download directory, run
pip3 install apache-airflow[all] --no-index -f ./
Error B: gcc: error trying to exec 'cc1plus': execvp: No such file or directory

Running setup.py install for JPype1 ... error
Complete output from command /data/flowuser/python/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-56x6guf7/JPype1/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-3zwi9q_v/install-record.txt --single-version-externally-managed --compile:

running build_ext
/tmp/pip-install-56x6guf7/JPype1/setup.py:173: FeatureNotice: Turned ON Numpy support for fast Java array access
FeatureNotice)
building '_jpype' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/native
creating build/temp.linux-x86_64-3.6/native/common
creating build/temp.linux-x86_64-3.6/native/python
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DHAVE_NUMPY=1 -Inative/common/include -Inative/python/include -Inative/jni_include -I/data/flowuser/python/lib/python3.6/site-packages/numpy/core/include -I/data/flowuser/python/include/python3.6m -c native/common/jp_array.cpp -o build/temp.linux-x86_64-3.6/native/common/jp_array.o -ggdb
gcc: error trying to exec 'cc1plus': execvp: No such file or directory
error: command 'gcc' failed with exit status 1

Solution to error B: install gcc-c++
# Run as root
yum install gcc-c++
# The Airflow installation above also printed two warnings, so fix them while we are at it:
#google-cloud-spanner 1.7.1 has requirement google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 0.28.1 which is incompatible.
#google-cloud-bigquery 1.8.1 has requirement google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 0.28.1 which is incompatible.
wget https://files.pythonhosted.org/packages/0c/f2/3c225e7a69cb27d283b68bff867722bd066bc1858611180197f711815ea5/google_cloud_core-0.29.1-py2.py3-none-any.whl
Retry step B: another error
Error B0: fatal error: sasl/sasl.h: No such file or directory

Running setup.py install for sasl ... error
Complete output from command /data/flowuser/python/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-0l2m6t4c/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-d9j9kpz5/install-record.txt --single-version-externally-managed --compile:

creating build/temp.linux-x86_64-3.6/sasl
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/data/flowuser/python/include/python3.6m -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-3.6/sasl/saslwrapper.o
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++ [enabled by default]
In file included from sasl/saslwrapper.cpp:254:0:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
#include <sasl/sasl.h>
^
compilation terminated.
error: command 'gcc' failed with exit status 1

Solution to error B0: install the SASL components
# Run as root
yum install cyrus-sasl-lib.x86_64
yum install cyrus-sasl-devel.x86_64
yum install libgsasl-devel.x86_64
#yum install saslwrapper-devel.x86_64 # this one does not seem to be necessary
Retry step B: at last, Airflow installs successfully
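
Most of the errors in steps A and B come down to a missing compiler, helper command, or header on the OS. On the offline machines it may be quicker to check for them all at once before running pip. The sketch below does that under some assumptions: g++ is used as a stand-in for the cc1plus compiler that gcc-c++ provides, /usr/include/sasl/sasl.h is assumed to be where cyrus-sasl-devel puts the header, and the function names are hypothetical.

```python
import os
import shutil

# Commands and headers whose absence caused errors 1, A5, B and B0 in this article.
REQUIRED_COMMANDS = ["gcc", "g++", "mysql_config"]
REQUIRED_HEADERS = ["/usr/include/sasl/sasl.h"]  # assumed default location


def missing_commands(commands, path_value=None):
    """Return the commands that cannot be found on PATH (or on path_value if given)."""
    return [cmd for cmd in commands if shutil.which(cmd, path=path_value) is None]


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    print("missing commands:", missing_commands(REQUIRED_COMMANDS))
    print("missing headers:", missing_files(REQUIRED_HEADERS))
```

Two empty lists suggest that the yum packages from the solutions above are in place. Anything listed points to the package that still needs installing.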

Step C: Initialize the Airflow database

airflow initdb
Error C: ModuleNotFoundError: No module named '_sqlite3'

Traceback (most recent call last):
File "/data/flowuser/python/lib/python3.6/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py", line 338, in dbapi
from pysqlite2 import dbapi2 as sqlite
ModuleNotFoundError: No module named 'pysqlite2'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/flowuser/python/bin/airflow", line 21, in <module>
from airflow import configuration
File "/data/flowuser/python/lib/python3.6/site-packages/airflow/__init__.py", line 36, in <module>
from airflow import settings, configuration as conf
File "/data/flowuser/python/lib/python3.6/site-packages/airflow/settings.py", line 266, in <module>
configure_orm()
File "/data/flowuser/python/lib/python3.6/site-packages/airflow/settings.py", line 188, in configure_orm
engine = create_engine(SQL_ALCHEMY_CONN, **engine_args)
File "/data/flowuser/python/lib/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 431, in create_engine
return strategy.create(*args, **kwargs)
File "/data/flowuser/python/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 87, in create
dbapi = dialect_cls.dbapi(**dbapi_args)
File "/data/flowuser/python/lib/python3.6/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py", line 343, in dbapi
raise e
File "/data/flowuser/python/lib/python3.6/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py", line 341, in dbapi
from sqlite3 import dbapi2 as sqlite # try 2.5+ stdlib name.
File "/data/flowuser/python/lib/python3.6/sqlite3/__init__.py", line 23, in <module>
from sqlite3.dbapi2 import *
File "/data/flowuser/python/lib/python3.6/sqlite3/dbapi2.py", line 27, in <module>
from _sqlite3 import *
ModuleNotFoundError: No module named '_sqlite3'

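Solution to error C: install sqlite-devel, then rebuild Python
# Run as root
yum install sqlite-devel
# Then rebuild and reinstall Python
./configure --prefix=/data/flowuser/python --enable-shared --with-ssl
make && make install
Retry step C: the database initializes successfully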
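
Python had to be rebuilt three times in this walkthrough because zlib, ssl, and sqlite3 are optional modules that are silently skipped when their development headers are missing at build time. A short check after each build would have caught all three up front. The sketch below is one way to do it, where missing_modules is a hypothetical helper that simply tries to import each module with the freshly built interpreter.

```python
import importlib

# Standard-library modules that this article found missing after a build.
OPTIONAL_MODULES = ["zlib", "ssl", "sqlite3"]


def missing_modules(names):
    """Return the module names that cannot be imported by the running interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:  # also covers ModuleNotFoundError
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(OPTIONAL_MODULES)
    if missing:
        print("Rebuild Python after installing the -devel packages for:", ", ".join(missing))
    else:
        print("zlib, ssl and sqlite3 are all importable")
```

Running it with /data/flowuser/python/bin/python3 right after make install shows whether another rebuild is needed before moving on to pip and Airflow.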

Step D: Start Airflow

airflow webserver -p <port>
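
Once the webserver has been started, a simple way to confirm that it is listening is to try a TCP connection to the chosen port. The sketch below does only that: it does not check that the process behind the port is Airflow. The port_is_open helper and the example port 8080 (the one used earlier in this article) are illustrative choices.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # connection refused, timed out, unreachable, ...
        return False


if __name__ == "__main__":
    port = 8080  # replace with the port passed to "airflow webserver -p"
    print("webserver port reachable:", port_is_open("127.0.0.1", port))
```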