Reading Time: 3 minutes
Installation of Airflow
The preferred approach is to install Apache Airflow in a virtual environment. Airflow requires a recent, supported version of Python 3 and pip (the package installer for Python).
Below are the steps to install it on your system:
To set up a virtual environment for Apache Airflow:
virtualenv apache_airflow
To activate the virtual environment, navigate to the "bin" folder inside the apache_airflow folder and run the activate script with the following commands:
cd apache_airflow/bin
source activate
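As a quick sanity check after activation, the sketch below reports whether a virtual environment is active in the current shell. It relies on the VIRTUAL_ENV variable that the activate script exports; the helper name in_virtualenv is made up for this illustration.

```shell
# Hypothetical helper: succeed only if a virtual environment is active.
# The activate script exports VIRTUAL_ENV; it is unset or empty otherwise.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "virtual environment active: $VIRTUAL_ENV"
else
    echo "no virtual environment active; run: source apache_airflow/bin/activate"
fi
```

If the check reports no active environment, the packages installed in the following steps would land in the system Python instead of apache_airflow.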
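Next, we have to set the Airflow home path. This is the directory where Airflow keeps its configuration file, logs, and default database; ~/airflow is the default location: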
export AIRFLOW_HOME=~/airflow
To install apache-airflow:
pip install apache-airflow
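A plain pip install can occasionally pull in dependency versions that do not work together. The Airflow project publishes constraints files to pin them, and the sketch below builds the URL of such a file. The version numbers (Airflow 2.2.3, Python 3.8) and the helper name constraints_url are assumptions chosen for illustration; substitute the versions you actually use. The final line only prints the install command; remove the leading echo to run it.

```shell
# Hypothetical helper: build the constraints-file URL.
# $1: Airflow version, $2: Python major.minor version
constraints_url() {
    printf 'https://raw.githubusercontent.com/apache/airflow/constraints-%s/constraints-%s.txt\n' "$1" "$2"
}

AIRFLOW_VERSION="2.2.3"   # assumption for illustration
PYTHON_VERSION="3.8"      # assumption; should match the Python in your virtual environment

# Print the command instead of running it, so it can be reviewed first.
echo pip install "apache-airflow==${AIRFLOW_VERSION}" \
    --constraint "$(constraints_url "$AIRFLOW_VERSION" "$PYTHON_VERSION")"
```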
For Airflow to function properly, we need to initialize its metadata database:
airflow db init
The last step is to start the Airflow webserver:
airflow webserver -p 8081
To verify that Airflow is installed successfully, open localhost in a browser using the port number chosen above:
http://localhost:8081/
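The same check can be done from a terminal. The sketch below queries the webserver's /health endpoint with curl, assuming curl is installed and the webserver was started on port 8081 as above; the function name check_webserver is hypothetical.

```shell
# Hypothetical helper: ask the webserver for its health status.
# $1: port the webserver listens on (defaults to 8081, as used above)
check_webserver() {
    port="${1:-8081}"
    if curl --silent --fail --max-time 5 "http://localhost:${port}/health"; then
        echo
        echo "webserver is responding on port ${port}"
    else
        echo "no response on port ${port}; is 'airflow webserver' running?"
        return 1
    fi
}

# '|| true' keeps the script going when the webserver is not up yet.
check_webserver 8081 || true
```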
Creating a User in Apache Airflow
To sign in to the Airflow dashboard, we need to create a user. Go through the following steps to create one using the Airflow command-line interface.
To create a user with Admin privileges in the Airflow database (the flags are -e for email, -f for first name, -l for last name, -p for password, -r for role, and -u for username):
airflow users create -e admin@example.org -f John -l Doe -p admin -r Admin -u admin
Now that we have created an Admin user, log in to the dashboard using those credentials. Once logged in, we see all the example data pipelines (DAGs) that Airflow provides by default.
When we log in for the first time, we get a warning on the landing page that says "The scheduler does not appear to be running". To start the Airflow scheduler, execute the following command in a second terminal (with the virtual environment activated and AIRFLOW_HOME set, since the webserver keeps the first terminal busy) and reload the landing page:
airflow scheduler
Access Control in Airflow
When we create a user in Airflow, we also have to define what role that user will be assigned. Airflow contains a set of predefined roles by default: Admin, User, Op, Viewer, and Public. Only an Admin user has control over configuring and altering permissions for other roles.
Admin
An Admin user has all possible permissions, including granting and revoking permissions for other users.
Public
A Public (anonymous) user does not have any permissions.
Viewer
A Viewer user has limited read permissions.
User
A User has Viewer permissions plus some additional User permissions.
Op
An Op user has User permissions plus additional Op permissions.
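When scripting user creation, it can help to check the role name before passing it to the -r flag. The sketch below is a small illustration of that idea; the helper name is_default_role and the sample role are hypothetical, and the list of names simply mirrors the five predefined roles described above.

```shell
# Hypothetical helper: succeed only for the predefined role names listed above.
is_default_role() {
    case "$1" in
        Admin|User|Op|Viewer|Public) return 0 ;;
        *) return 1 ;;
    esac
}

role="Viewer"   # sample role chosen for illustration
if is_default_role "$role"; then
    echo "'$role' is one of the predefined roles"
else
    echo "'$role' is not a predefined role"
fi
```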
Basic Commands for Apache Airflow
List all the DAGs that Airflow provides by default:
airflow dags list
Check what tasks a DAG contains:
airflow tasks list example_xcom_args
Execute a data pipeline with a defined execution date:
airflow dags trigger -e 2022-02-02 example_xcom_args
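A triggered run only executes once the DAG is unpaused and the scheduler is running. The sketch below strings the steps together, assuming the default setting in which DAGs start out paused; it skips the Airflow commands when the airflow executable is not on the PATH. The variable name dag_id is just a convenience for this example.

```shell
dag_id="example_xcom_args"

if command -v airflow >/dev/null 2>&1; then
    # Assumption: DAGs are paused at creation, so unpause before triggering.
    airflow dags unpause "$dag_id"
    airflow dags trigger -e 2022-02-02 "$dag_id"
    # Show the runs recorded for this DAG, including their current state.
    airflow dags list-runs -d "$dag_id"
else
    echo "airflow not found on PATH; activate the virtual environment first"
fi
```

The state column of the list-runs output shows whether the scheduler has picked up the run.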
Conclusion
In this blog, we saw how to install Airflow locally on your system using the command-line interface. We also saw how to create the first user for the Airflow instance and what roles a user can have. Lastly, we went through some basic Airflow commands.