Reading Time: 3 minutes
Installation of Airflow
The recommended way to install Apache Airflow is inside a virtual environment. Airflow requires a supported version of Python 3 and a recent pip (the package installer for Python). The newest Python release is not always supported, so check the Airflow documentation for the Python versions that match your Airflow release.
Below are the steps to install it on your system.
To set up a virtual environment for Apache Airflow:
virtualenv apache_airflow
To activate the virtual environment, navigate to the “bin” folder inside the apache_airflow folder and source the activate script:
cd apache_airflow/bin
source activate
Next, we have to set the Airflow home path:
export AIRFLOW_HOME=~/airflow
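The export above only lasts for the current shell session. As a minimal sketch (assuming a POSIX-style shell), the lines below keep an AIRFLOW_HOME that is already set, fall back to ~/airflow otherwise, and make sure the directory exists:

```shell
# Keep an existing AIRFLOW_HOME if one is exported, otherwise default to ~/airflow.
export AIRFLOW_HOME="${AIRFLOW_HOME:-$HOME/airflow}"

# Create the directory up front so later commands have somewhere to write.
mkdir -p "$AIRFLOW_HOME"

echo "Airflow home: $AIRFLOW_HOME"
```

Adding the export line to your shell profile is one way to avoid repeating it in every new terminal.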
To install apache-airflow:
pip install apache-airflow
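A plain pip install pulls whatever dependency versions are newest at the time, which can occasionally break an installation. The Airflow project publishes constraints files to pin them. The sketch below builds the constraints URL and prints the resulting command for review. The version numbers 2.8.1 and 3.8 are assumptions chosen only for illustration, so substitute the Airflow release and Python minor version you actually use:

```shell
# Assumed values for illustration: adjust both to match your environment.
AIRFLOW_VERSION="2.8.1"
PYTHON_VERSION="3.8"

# Constraints files are published per Airflow release and Python minor version.
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Print the command so it can be checked before it is run.
echo pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
```

Once the printed command looks right, run it without the leading echo.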
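For Airflow to function properly, we need to initialize its metadata database (on Airflow 2.7 and later, airflow db init is deprecated in favor of airflow db migrate):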
airflow db init
The last step is to start the Airflow webserver:
airflow webserver -p 8081
To verify that Airflow is installed successfully, open localhost in a browser on the port passed to the webserver (http://localhost:8081 for the command above).
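The same check can be done from a terminal. The sketch below assumes curl is installed and that the webserver exposes its /health endpoint on the port used above. The function name check_airflow_health is hypothetical, not part of Airflow:

```shell
# Hypothetical helper: exits 0 if the webserver answers on /health, non-zero otherwise.
check_airflow_health() {
    url="${1:-http://localhost:8081/health}"
    # --fail turns HTTP errors into a non-zero exit code; --max-time avoids hanging.
    curl --silent --fail --max-time 5 "$url"
}
```

A typical call would be check_airflow_health && echo "webserver is up".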
Creating a User in Apache Airflow
To sign in to the Airflow dashboard we need to create a User. Go through the following steps to create a user using the Airflow command-line interface.
To create a user with Admin privileges in the Airflow database (replace the placeholder email address with your own):
airflow users create -e admin@example.com -f John -l Doe -p admin -r Admin -u admin
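Passing the password with -p leaves it in the shell history. As a hedged alternative, the sketch below omits -p, on the assumption that the Airflow 2 CLI then prompts for the password interactively. The function name create_admin and its arguments are hypothetical:

```shell
# Hypothetical helper: create an Admin account without a password on the command line.
create_admin() {
    username="$1"
    email="$2"
    # No -p flag here: the CLI is expected to ask for the password instead.
    airflow users create -u "$username" -e "$email" -f John -l Doe -r Admin
}
```

For example, create_admin admin admin@example.com would create the same account as the command above once a password is entered at the prompt.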
Now that we have created an Admin user, log in to the dashboard using those credentials. Once logged in to the Airflow dashboard, we see all the data pipelines that are available by default.
When we log in for the first time, the landing page shows a warning: “The scheduler does not appear to be running”. To start the Airflow scheduler, run the following command in a second terminal (with the virtual environment activated and AIRFLOW_HOME set, since the webserver keeps the first terminal busy), then reload the landing page:
airflow scheduler
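If keeping two terminals open is inconvenient, both processes can be started in the background. The sketch below relies on the -D (daemon) flag that the Airflow 2 CLI accepts for both commands. The function name start_airflow is hypothetical:

```shell
# Hypothetical helper: start the webserver and the scheduler as background daemons.
start_airflow() {
    port="${1:-8081}"
    # Only start the scheduler if the webserver command succeeded.
    airflow webserver -p "$port" -D && airflow scheduler -D
}
```

Calling start_airflow 8081 matches the port used earlier in this post.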
Access Control in Airflow
When we create a user in Airflow, we must also specify which role that user is assigned. Airflow ships with a set of predefined roles: Admin, User, Op, Viewer, and Public. Only an Admin user can configure and alter the permissions of other roles.
An Admin user has all possible permissions, including granting and revoking permissions for other users.
A Public user (an unauthenticated visitor) has no permissions.
A Viewer user has limited, read-only permissions.
A User has Viewer permissions plus additional User permissions.
An Op user has User permissions plus additional Op permissions.
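Roles can also be inspected and assigned from the command line. The sketch below uses the airflow users list and airflow users add-role subcommands of the Airflow 2 CLI. The function name grant_role is hypothetical:

```shell
# Hypothetical helper: show existing accounts, then give one of them an extra role.
grant_role() {
    username="$1"
    role="$2"   # one of the predefined roles: Admin, User, Op, Viewer, Public
    airflow users list
    airflow users add-role -u "$username" -r "$role"
}
```

For instance, grant_role admin Op would add the Op role to the account created earlier.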
Basic Commands for Apache Airflow
List all the DAGs that Airflow ships with by default:
airflow dags list
Check what tasks a DAG contains:
airflow tasks list example_xcom_args
Execute a data pipeline with a defined execution date:
airflow dags trigger -e 2022-02-02 example_xcom_args
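New DAGs are usually paused when they are first loaded, and a run triggered on a paused DAG waits until the DAG is unpaused. The sketch below chains the unpause, trigger, and list-runs subcommands of the Airflow 2 CLI. The function name run_dag_once is hypothetical:

```shell
# Hypothetical helper: unpause a DAG, trigger one run, then show its runs.
run_dag_once() {
    dag_id="$1"
    exec_date="$2"
    airflow dags unpause "$dag_id"
    airflow dags trigger -e "$exec_date" "$dag_id"
    airflow dags list-runs -d "$dag_id"
}
```

Calling run_dag_once example_xcom_args 2022-02-02 reproduces the trigger command above and then lists the resulting run.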
Conclusion
In this blog, we saw how to install Airflow locally using the command-line interface. We also saw how to create the first user for the Airflow instance and which roles a user can have. Lastly, we went through some basic Airflow commands.