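Azkaban is a scheduling tool developed by LinkedIn for scheduling interdependent jobs on Hadoop. Jobs running on a Hadoop cluster often depend on one another, and some tasks must run in a fixed order; Azkaban handles this scenario well.
Azkaban consists of three key components:
Relational database (MySQL)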
AzkabanWebServer
AzkabanExecutorServer
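The three components relate as follows: AzkabanWebServer and AzkabanExecutorServer both connect to the MySQL database, and the web server submits flow executions to the executor server on its configured port.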
Preparation
- 1. A working Hadoop distributed cluster
- 2. The following Azkaban 2.5.0 packages:
azkaban-executor-server-2.5.0.tar.gz
azkaban-sql-script-2.5.0.tar.gz
azkaban-web-server-2.5.0.tar.gz
Installation
#mkdir -p /usr/local/azkaban
#tar -zxvf azkaban-executor-server-2.5.0.tar.gz -C /usr/local/azkaban
#tar -zxvf azkaban-sql-script-2.5.0.tar.gz -C /usr/local/azkaban
#tar -zxvf azkaban-web-server-2.5.0.tar.gz -C /usr/local/azkaban
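The three extraction commands above can also be wrapped in a small helper that stops early when a package is missing. This is only a sketch: the function name `extract_azkaban` and its two arguments (source directory, destination directory) are assumptions made for illustration and are not part of Azkaban.

```shell
# Hypothetical helper: extract the three Azkaban 2.5.0 tarballs found in
# src_dir into dest_dir, failing as soon as one of them is missing.
extract_azkaban() {
  src_dir=$1
  dest_dir=$2
  mkdir -p "$dest_dir" || return 1
  for pkg in executor-server sql-script web-server; do
    tarball="$src_dir/azkaban-$pkg-2.5.0.tar.gz"
    if [ ! -f "$tarball" ]; then
      echo "missing: $tarball" >&2
      return 1
    fi
    # Same tar flags as the manual commands, minus the verbose listing.
    tar -zxf "$tarball" -C "$dest_dir" || return 1
  done
}
```

For example, `extract_azkaban . /usr/local/azkaban` would reproduce the manual steps when the tarballs are in the current directory.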
Configuration
- 1. Create the azkaban database
[root@localhost ~]# mysql -u root -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 6165
Server version: 5.5.52-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database azkaban;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]>
- 2. Import the tables into the azkaban database
MariaDB [(none)]> use azkaban;
Database changed
MariaDB [azkaban]> source /usr/local/azkaban/azkaban-2.5.0/create-all-sql-2.5.0.sql
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
MariaDB [azkaban]> show tables;
+------------------------+
| Tables_in_azkaban |
+------------------------+
| active_executing_flows |
| active_sla |
| execution_flows |
| execution_jobs |
| execution_logs |
| project_events |
| project_files |
| project_flows |
| project_permissions |
| project_properties |
| project_versions |
| projects |
| properties |
| schedules |
| triggers |
+------------------------+
15 rows in set (0.00 sec)
MariaDB [azkaban]>
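The web server configuration later in this guide connects with a non-root account (`mysql.user=user`), but the steps above do not create one. A minimal sketch of how such an account might be created follows; the helper name `azkaban_user_sql`, the `'%'` host pattern, and the chosen privilege list are assumptions, so adjust them to your own security policy.

```shell
# Hypothetical helper: print SQL that creates a dedicated account and grants
# it read/write access to the tables in the azkaban database.
azkaban_user_sql() {
  db_user=$1
  db_pass=$2
  cat <<EOF
CREATE USER '$db_user'@'%' IDENTIFIED BY '$db_pass';
GRANT SELECT, INSERT, UPDATE, DELETE ON azkaban.* TO '$db_user'@'%';
EOF
}
```

The generated statements could then be piped into the client, for example `azkaban_user_sql user password | mysql -u root -p`.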
- 3. Create the SSL configuration (generate a security certificate)
[root@Master ~]# keytool -keystore keystore -alias jetty -genkey -keyalg RSA
Enter keystore password:
Re-enter new password:
What is your first and last name?
[Unknown]:
What is the name of your organizational unit?
[Unknown]:
What is the name of your organization?
[Unknown]:
What is the name of your City or Locality?
[Unknown]:
What is the name of your State or Province?
[Unknown]:
What is the two-letter country code for this unit?
[Unknown]: CN
Is CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=CN correct?
[no]: yes
Enter key password for <jetty>
(RETURN if same as keystore password):
Verify that the keystore file was generated (it is created in the directory where the command was run)
[root@Master ~]# ls
anaconda-ks.cfg keystore
[root@Master ~]#
- 4. Copy the certificate file (keystore) into the azkaban-web-2.5.0 root directory
[root@Master ~]# cp keystore /usr/local/azkaban/azkaban-web-2.5.0/
[root@Master ~]#
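The interactive keytool session above can probably be scripted by passing the answers as options. The sketch below assumes the same alias (`jetty`) and distinguished name as the session shown; the function name `make_keystore` is hypothetical. Note that passwords given on the command line may be visible to other users in the process list.

```shell
# Hypothetical helper: generate the Jetty keystore without prompts.
# $1 = path of the keystore file to create
# $2 = password used for both the keystore and the key (at least 6 characters)
make_keystore() {
  ks_path=$1
  ks_pass=$2
  keytool -genkey -alias jetty -keyalg RSA \
    -keystore "$ks_path" \
    -storepass "$ks_pass" -keypass "$ks_pass" \
    -dname "CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=CN"
}
```

For example, `make_keystore /usr/local/azkaban/azkaban-web-2.5.0/keystore 246437` would combine the generate and copy steps.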
- 5. Use tzselect to select the server's time zone (Asia/Shanghai in this setup)
- 6. Edit the configuration files
/azkaban-web-2.5.0/conf/azkaban.properties
# Azkaban Personalization Settings
# Server name shown in the UI
azkaban.name=Test
# Description
azkaban.label=My Local Azkaban
# UI color
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
# Default web root directory
web.resource.dir=web/
# Default time zone (changed to Asia/Shanghai)
default.timezone.id=Asia/Shanghai
# Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=conf/azkaban-users.xml
# Loader for projects
executor.global.properties=conf/global.properties
azkaban.project.dir=projects
# Database type
database.type=mysql
# Port
mysql.port=3306
# Database host
mysql.host=123.207.101.174
# Database name
mysql.database=azkaban
# Database user
mysql.user=user
# Database password
mysql.password=password
# Maximum number of connections
mysql.numconnections=100
# Velocity dev mode
velocity.dev.mode=false
# Azkaban Jetty server properties.
# Maximum number of threads
jetty.maxThreads=25
# Jetty SSL port
jetty.ssl.port=8443
# Jetty port
jetty.port=8081
# Keystore file name
jetty.keystore=keystore
# Keystore password
jetty.password=246437
# Key password (same as the keystore password)
jetty.keypassword=246437
# Truststore file name
jetty.truststore=keystore
# Truststore password
jetty.trustpassword=246437
# Azkaban Executor settings
# Executor server port
executor.port=12321
# mail settings
mail.sender=
mail.host=
job.failure.email=
job.success.email=
lockdown.create.projects=false
# Cache directory
cache.directory=cache
/azkaban-web-2.5.0/conf/azkaban-users.xml
<azkaban-users>
<user username="azkaban" password="azkaban" roles="admin" groups="azkaban" />
<user username="metrics" password="metrics" roles="metrics" />
<!-- Added entry -->
<user username="admin" password="admin" roles="admin,metrics" />
<role name="admin" permissions="ADMIN" />
<role name="metrics" permissions="METRICS" />
</azkaban-users>
/azkaban-executor-2.5.0/conf/azkaban.properties
# Azkaban
default.timezone.id=Asia/Shanghai
# Azkaban JobTypes Plugins
azkaban.jobtype.plugin.dir=plugins/jobtypes
# Loader for projects
executor.global.properties=conf/global.properties
azkaban.project.dir=projects
database.type=mysql
mysql.port=3306
mysql.host=looc
mysql.database=azkaban
mysql.user=root
mysql.password=246437
mysql.numconnections=100
# Azkaban Executor settings
executor.maxThreads=50
executor.port=12321
executor.flow.threads=30
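Because the web server and the executor server each carry their own azkaban.properties, it is easy for the shared settings to drift apart. The following sketch compares a handful of keys between the two files. The helper names `get_prop` and `compare_props` and the list of keys are assumptions; a reported mismatch is only a prompt to double-check (for instance, an IP address and a hostname may refer to the same database server).

```shell
# Print the value of the last "key=value" line for an exact key match.
# $1 = properties file, $2 = key
get_prop() {
  awk -F= -v k="$2" '$1 == k { sub(/^[^=]*=/, ""); v = $0 } END { print v }' "$1"
}

# Compare settings that both servers are expected to share.
# $1 = web server properties file, $2 = executor properties file
# Returns non-zero if any of the keys differ.
compare_props() {
  web_conf=$1
  exec_conf=$2
  status=0
  for key in database.type mysql.port mysql.host mysql.database executor.port; do
    a=$(get_prop "$web_conf" "$key")
    b=$(get_prop "$exec_conf" "$key")
    if [ "$a" != "$b" ]; then
      echo "mismatch: $key"
      status=1
    fi
  done
  return $status
}
```

A possible invocation from /usr/local/azkaban would be `compare_props azkaban-web-2.5.0/conf/azkaban.properties azkaban-executor-2.5.0/conf/azkaban.properties`.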
Verification
- 1. Start the services
Start AzkabanWebServer
[root@Master azkaban-web-2.5.0]# bin/azkaban-web-start.sh
Note: to stop it, run bin/azkaban-web-shutdown.sh
Start AzkabanExecutorServer
[root@Master azkaban-executor-2.5.0]# bin/azkaban-executor-start.sh
Note: to stop it, run bin/azkaban-executor-shutdown.sh
Check that both services started:
[root@Master ~]# jps
3047 ResourceManager
3415 Jps
3385 AzkabanExecutorServer
2891 SecondaryNameNode
2525 QuorumPeerMain
2701 NameNode
3357 AzkabanWebServer
[root@Master ~]#
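The jps check above can be automated with a small filter that looks for the two Azkaban process names. This is a sketch only; the function name `check_azkaban_procs` is hypothetical, and it simply reads jps-style output from standard input.

```shell
# Hypothetical helper: read `jps` output on stdin and report whether both
# Azkaban servers appear in it. Returns non-zero if either one is missing.
check_azkaban_procs() {
  out=$(cat)
  status=0
  for name in AzkabanWebServer AzkabanExecutorServer; do
    if printf '%s\n' "$out" | grep -qw "$name"; then
      echo "$name: running"
    else
      echo "$name: NOT running"
      status=1
    fi
  done
  return $status
}
```

It would typically be used as `jps | check_azkaban_procs`.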
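- 2. Open the Azkaban Web UI
Address: https://master:8443
Note: the address must use https. Log in with username admin and password admin (the account added to azkaban-users.xml).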
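The login can also be checked from the command line. The sketch below assumes that this Azkaban version accepts a form POST with `action=login`, `username`, and `password` at the web server's root URL, as described in Azkaban's AJAX API documentation; whether 2.5.0 behaves identically has not been verified here. The function name `azkaban_login` is hypothetical. The `-k` flag tells curl to accept the self-signed certificate generated earlier.

```shell
# Hypothetical helper: attempt a login against the Azkaban web server.
# $1 = base URL (for example https://master:8443), $2 = username, $3 = password
azkaban_login() {
  base_url=$1
  user=$2
  pass=$3
  # -k: skip certificate verification (self-signed keystore)
  # -s: suppress the progress meter
  curl -k -s -X POST \
    --data-urlencode "action=login" \
    --data-urlencode "username=$user" \
    --data-urlencode "password=$pass" \
    "$base_url"
}
```

For example, `azkaban_login https://master:8443 admin admin` should print the server's response, which makes it easier to tell a certificate or port problem apart from a credentials problem.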
Example