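Azkaban consists of three parts: the web server, the executor, and the SQL tables that hold its metadata. Installing the web server on one machine and executors on the others gives a multi-node cluster.
Official site: https://azkaban.github.io/downloads.html. Follow the instructions there to download it.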
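1. Deployment modes
- solo-server mode (metadata stored in the embedded H2 database);
- two-server mode (one web server and one exec server on the same host, metadata stored in MySQL);
- multiple-executor mode (one web server and several exec servers on different hosts, metadata stored in MySQL);
This guide uses the third mode: the web server runs on one machine and the exec servers run on several others. It is the most widely used of the three.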
2. Preparation
(1) Four servers (one web server, three exec servers)
master --azkaban-web-server
slave1 --azkaban-exec-server
slave2 --azkaban-exec-server
slave3 --azkaban-exec-server
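(2) Set up MySQL
This guide assumes a MySQL server is already running. Create the azkaban database and the azkaban user with a password, and allow that user to connect remotely. The password chosen here must match mysql.password in both azkaban.properties files.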
mysql> CREATE DATABASE azkaban;
mysql> CREATE USER 'azkaban'@'%' IDENTIFIED BY 'azkaban';
mysql> CREATE USER 'azkaban'@'localhost' IDENTIFIED BY 'azkaban';
mysql> GRANT ALL PRIVILEGES ON azkaban.* TO 'azkaban'@'%';
mysql> GRANT ALL PRIVILEGES ON azkaban.* TO 'azkaban'@'localhost';
mysql> flush privileges;
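The statements above can also be generated from a script, which makes the setup repeatable. The following is a minimal sketch, not part of Azkaban: the function name is hypothetical, and `CREATE USER IF NOT EXISTS` assumes MySQL 5.7.6 or later.

```shell
#!/usr/bin/env bash
# Sketch: print the bootstrap SQL for the Azkaban metadata database.
# azkaban_bootstrap_sql is an illustrative helper, not an Azkaban tool.
azkaban_bootstrap_sql() {
  local db="$1" user="$2" pass="$3" host
  printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$db"
  # One account for remote connections ('%') and one for local connections.
  for host in '%' 'localhost'; do
    printf "CREATE USER IF NOT EXISTS '%s'@'%s' IDENTIFIED BY '%s';\n" "$user" "$host" "$pass"
    printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%s';\n" "$db" "$user" "$host"
  done
  printf 'FLUSH PRIVILEGES;\n'
}

# Usage (assumes a local mysql client and root access):
#   azkaban_bootstrap_sql azkaban azkaban azkaban | mysql -uroot -p
```

Printing the SQL instead of executing it directly lets you review the statements before piping them into the mysql client.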
3. Deployment
(1) Extract azkaban-3.84.4.tar.gz
(2) Change into the extracted directory and build the sources:
./gradlew distTar
(3) Extract the following build archives into the directories shown:
azkaban-web-server-3.84.4.tar.gz | server |
azkaban-exec-server-3.84.4.tar.gz | executor |
(4) Import the table-creation script (adjust the path to match the version you built)
mysql -uroot -proot123
> USE azkaban;
> SOURCE /opt/azkaban-3.47.0/azkaban-db/build/distributions/azkaban-db-0.1.0-SNAPSHOT/create-all-sql-0.1.0-SNAPSHOT.sql;
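Because the script path embeds version numbers, it may be easier to search for the file than to type the path by hand. A minimal sketch, assuming the azkaban-db archive has already been extracted somewhere under the build tree; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: find the create-all-sql script produced by the build instead of
# hard-coding a version in the path. find_create_all_sql is an illustrative name.
find_create_all_sql() {
  local build_root="$1"
  # Take the first match in sorted order so the result is deterministic.
  find "$build_root" -type f -name 'create-all-sql-*.sql' | sort | head -n 1
}

# Usage (assumes the source tree lives under /opt and that the azkaban
# database and user already exist):
#   sql_file="$(find_create_all_sql /opt)"
#   mysql -uazkaban -p azkaban < "$sql_file"
```

Passing the database name on the mysql command line selects the azkaban database before the script runs.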
(5) Set up the Azkaban web server
- Configure SSL for Jetty. Remember the password you set; this guide uses 123456 throughout.
In the server directory, run: keytool -keystore keystore -alias jetty -genkey -keyalg RSA
Enter keystore password: (enter the password)
Re-enter new password: (enter it again)
What is your first and last name?
[Unknown]: (press Enter)
What is the name of your organizational unit?
[Unknown]: (press Enter)
What is the name of your organization?
[Unknown]: (press Enter)
What is the name of your City or Locality?
[Unknown]: (press Enter)
What is the name of your State or Province?
[Unknown]: (press Enter)
What is the two-letter country code for this unit?
[Unknown]: (press Enter)
Is CN=YY, OU=YY, O=YY, L=shanghai, ST=shanghai, C=CN correct?
[no]: y
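The interactive prompts can be skipped by passing the answers as options, which is convenient when the keystore is created from a script. A minimal sketch using standard keytool options; the function name and the distinguished-name values in the usage line are placeholders, not values required by Azkaban.

```shell
#!/usr/bin/env bash
# Sketch: create the Jetty keystore without interactive prompts.
# make_jetty_keystore is an illustrative helper name.
make_jetty_keystore() {
  local keystore="$1" password="$2" dname="$3"
  # The same password is used for the store and the key, matching
  # jetty.password and jetty.keypassword in azkaban.properties.
  keytool -genkeypair -alias jetty -keyalg RSA -keysize 2048 \
    -keystore "$keystore" -storepass "$password" -keypass "$password" \
    -dname "$dname" -validity 365
}

# Usage, run from the server directory (placeholder DN values):
#   make_jetty_keystore keystore 123456 "CN=master, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=CN"
```

The 365-day validity is an arbitrary choice for the sketch; pick a period that suits your environment.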
- Edit server/conf/azkaban.properties
# Azkaban Personalization Settings
azkaban.name=group_env
azkaban.label=My Local Azkaban
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
web.resource.dir=/home/public/azkaban/server/web/
default.timezone.id=Asia/Shanghai
# Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=/home/public/azkaban/server/conf/azkaban-users.xml
# Loader for projects
executor.global.properties=/home/public/azkaban/server/conf/global.properties
azkaban.project.dir=projects
# Velocity dev mode
velocity.dev.mode=false
# Azkaban Jetty server properties.
# With SSL enabled, the web UI must be accessed over https
jetty.use.ssl=true
jetty.maxThreads=25
jetty.ssl.port=8443
jetty.port=8082
# Use absolute paths
jetty.keystore=/home/public/azkaban/server/keystore
jetty.password=123456
jetty.keypassword=123456
jetty.truststore=/home/public/azkaban/server/keystore
jetty.trustpassword=123456
# Azkaban Executor settings
# mail settings
mail.sender=
mail.host=
# User facing web server configurations used to construct the user facing server URLs. They are useful when there is a reverse proxy between Azkaban web servers and users.
# enduser -> myazkabanhost:443 -> proxy -> localhost:8081
# when this parameters set then these parameters are used to generate email links.
# if these parameters are not set then jetty.hostname, and jetty.port(if ssl configured jetty.ssl.port) are used.
# azkaban.webserver.external_hostname=myazkabanhost.com
# azkaban.webserver.external_ssl_port=443
# azkaban.webserver.external_port=8081
job.failure.email=
job.success.email=
lockdown.create.projects=false
cache.directory=cache
# JMX stats
jetty.connector.stats=true
executor.connector.stats=true
# Azkaban mysql settings by default. Users should configure their own username and password.
database.type=mysql
mysql.port=3306
mysql.host=159.226.16.181
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban@icR2AB
mysql.numconnections=100
#Multiple Executor
azkaban.jobtype.plugin.dir=plugins/jobtypes
azkaban.use.multiple.executors=true
azkaban.executorselector.filters=StaticRemainingFlowSize,MinimumFreeMemory,CpuStatus
azkaban.executorselector.comparator.NumberOfAssignedFlowComparator=1
azkaban.executorselector.comparator.Memory=1
azkaban.executorselector.comparator.LastDispatched=1
azkaban.executorselector.comparator.CpuUsage=1
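With this many settings it is easy to leave one out. The sketch below reads values from a properties file and reports keys from this guide that are missing or empty. Both function names are hypothetical, and the parser only handles plain key=value lines (no line continuations, escapes, or spaces around the equals sign).

```shell
#!/usr/bin/env bash
# Sketch: read a key from a Java-style .properties file.
# prop_get and check_web_props are illustrative helpers, not Azkaban tools.
prop_get() {
  local file="$1" key="$2"
  # Skip comment lines; print the text after the first '=' of the last match.
  awk -v k="$key" '
    /^[ \t]*#/ { next }
    { i = index($0, "=") }
    i > 0 && substr($0, 1, i - 1) == k { v = substr($0, i + 1); found = 1 }
    END { if (found) print v; else exit 1 }
  ' "$file"
}

# Report keys used in this guide that are missing or empty in the web server config.
check_web_props() {
  local file="$1" key missing=0
  for key in jetty.use.ssl jetty.ssl.port jetty.keystore mysql.host mysql.database mysql.user; do
    if [ -z "$(prop_get "$file" "$key")" ]; then
      echo "missing or empty: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_web_props /home/public/azkaban/server/conf/azkaban.properties
```

The key list is a selection from the file above; extend it with whatever settings matter in your deployment.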
- Edit server/conf/azkaban-users.xml
<azkaban-users>
<user groups="azkaban" password="azkaban" roles="admin" username="azkaban"/>
<user password="metrics" roles="metrics" username="metrics"/>
<user username="admin" password="admin" roles="admin"/>
<role name="admin" permissions="ADMIN"/>
<role name="metrics" permissions="METRICS"/>
</azkaban-users>
- Replace the MySQL JDBC driver under /home/public/azkaban/server/lib with one that matches the version of the database you connect to
- Edit plugins/jobtypes/commonprivate.properties under the server directory
azkaban.native.lib=false
execute.as.user=false
(6) Set up the Azkaban exec server
- Edit executor/conf/azkaban.properties
# Azkaban Personalization Settings
azkaban.name=Test
azkaban.label=My Local Azkaban
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
web.resource.dir=web/
default.timezone.id=Asia/Shanghai
# Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=conf/azkaban-users.xml
# Loader for projects
executor.global.properties=conf/global.properties
azkaban.project.dir=projects
# Velocity dev mode
velocity.dev.mode=false
# Azkaban Jetty server properties.
jetty.use.ssl=false
jetty.maxThreads=25
jetty.port=8081
# Where the Azkaban web server is located
azkaban.webserver.url=http://localhost:8081
# mail settings
mail.sender=
mail.host=
# User facing web server configurations used to construct the user facing server URLs. They are useful when there is a reverse proxy between Azkaban web servers and users.
# enduser -> myazkabanhost:443 -> proxy -> localhost:8081
# when this parameters set then these parameters are used to generate email links.
# if these parameters are not set then jetty.hostname, and jetty.port(if ssl configured jetty.ssl.port) are used.
# azkaban.webserver.external_hostname=myazkabanhost.com
# azkaban.webserver.external_ssl_port=443
# azkaban.webserver.external_port=8081
job.failure.email=
job.success.email=
lockdown.create.projects=false
cache.directory=cache
# JMX stats
jetty.connector.stats=true
executor.connector.stats=true
# Azkaban plugin settings
azkaban.jobtype.plugin.dir=plugins/jobtypes
# Azkaban mysql settings by default. Users should configure their own username and password.
database.type=mysql
mysql.port=3306
mysql.host=159.226.16.181
mysql.database=azkaban
mysql.user=azkaban
mysql.password=azkaban@icR2AB
mysql.numconnections=100
# Azkaban Executor settings
executor.maxThreads=50
executor.flow.threads=30
# Port the executor listens on; the activation call in section 4 uses it
executor.port=12321
- Replace the MySQL JDBC driver under /home/public/azkaban/executor/lib with one that matches the version of the database you connect to
- Edit plugins/jobtypes/commonprivate.properties under the executor/ directory
azkaban.native.lib=false
execute.as.user=false
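The web server and every executor must point at the same metadata database, so it can help to compare the database settings of the two configuration files. A minimal sketch with hypothetical function names; it compares only the database.type and mysql.* lines.

```shell
#!/usr/bin/env bash
# Sketch: check that two azkaban.properties files use the same database settings.
# db_settings and same_db_settings are illustrative helper names.
db_settings() {
  # Keep only the database-related lines, sorted so that order does not matter.
  grep -E '^(database\.type|mysql\.[a-z]+)=' "$1" | sort
}

same_db_settings() {
  [ "$(db_settings "$1")" = "$(db_settings "$2")" ]
}

# Usage (paths follow the layout used in this guide):
#   same_db_settings /home/public/azkaban/server/conf/azkaban.properties \
#                    /home/public/azkaban/executor/conf/azkaban.properties \
#     && echo "database settings match"
```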
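4. Running Azkaban
(1) Start the executors first: run the following in the executor directory on each executor host. Startup logs are written to the logs directory.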
./bin/start-exec.sh
Then activate the executor, replacing ip with that executor's address:
curl "http://ip:12321/executor?action=activate"
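With several executors, the activation call can be looped over the host list. A minimal sketch, assuming the host names slave1 to slave3 from section 2 resolve from the machine running the script; the function names and the EXECUTOR_PORT and DRY_RUN variables are introduced here for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: activate every executor in a host list.
# EXECUTOR_PORT mirrors the 12321 used above; DRY_RUN=1 only prints the URLs.
EXECUTOR_PORT="${EXECUTOR_PORT:-12321}"

activate_url() {
  printf 'http://%s:%s/executor?action=activate\n' "$1" "$EXECUTOR_PORT"
}

activate_executors() {
  local host url
  for host in "$@"; do
    url="$(activate_url "$host")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$url"
    else
      # -f makes curl fail on HTTP errors; quoting keeps the shell from
      # treating the '?' in the URL as a glob character.
      curl -fsS "$url" || echo "activation failed for $host" >&2
    fi
  done
}

# Usage:
#   activate_executors slave1 slave2 slave3
```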
(2) Then start the Azkaban web server
./bin/start-web.sh
(3) Open https://server_ip:8443 in a browser to reach the web UI. The login credentials are defined in azkaban-users.xml and can be changed there.
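5. Activating executors
Activate the executors in the executors table of the azkaban database in MySQL: a row whose active column is 0 must be updated to 1 for that executor to count as active.
Note:
With jetty.use.ssl=true, the web UI must be accessed over https.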
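The database update described in this section can be written as two statements: one to inspect the table and one to flip inactive rows. A minimal sketch; the function name is hypothetical, and the id, host, port, and active columns are assumed to match the executors table created by the create-all-sql script.

```shell
#!/usr/bin/env bash
# Sketch: print SQL that lists the executors and activates the inactive ones.
# activate_executors_sql is an illustrative helper name.
activate_executors_sql() {
  cat <<'SQL'
SELECT id, host, port, active FROM executors;
UPDATE executors SET active = 1 WHERE active = 0;
SQL
}

# Usage (assumes the azkaban user created in section 2):
#   activate_executors_sql | mysql -uazkaban -p azkaban
```

The SELECT runs first so that the output shows the state of the table before the update.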
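6. Running a job
(1) Create the job script
On 192.168.1.11 (znzd002), create the script file test.sh in the home/mntc directory (the job file in the next step refers to it as /home/dmbigdata/mntc/test.sh) with the following content: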
#!/bin/bash
echo 'Hello World'
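(2) Create the Azkaban job file test.job and package it as test.zip.
In cluster mode the job must be pinned to an executor: use the ID of the Azkaban executor running on the machine that holds the script. Look the ID up in the executors table; here znzd002 corresponds to executor ID 10.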
type=command
command=sh /home/dmbigdata/mntc/test.sh
# Executor ID
useExecutor=10
retries=3
retry.backoff=30000
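The job file and archive can be produced from a script. A minimal sketch, assuming the zip utility is installed; the function names are hypothetical and the job contents are the ones shown above.

```shell
#!/usr/bin/env bash
# Sketch: write test.job with the settings shown above and package it.
# write_test_job and package_job are illustrative helper names.
write_test_job() {
  local dir="$1" executor_id="$2"
  cat > "$dir/test.job" <<EOF
type=command
command=sh /home/dmbigdata/mntc/test.sh
useExecutor=$executor_id
retries=3
retry.backoff=30000
EOF
}

package_job() {
  local dir="$1"
  # -j stores test.job at the root of the archive, without directory components.
  zip -j "$dir/test.zip" "$dir/test.job"
}

# Usage:
#   write_test_job . 10 && package_job .
```

Taking the executor ID as an argument makes it easy to regenerate the job after the ID in the executors table changes.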
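(3) Upload test.zip; the job details are then shown in the web UI
(4) Run test.job with Run Job
(5) To choose which Azkaban executor runs the flow, set the useExecutor parameter in the Flow parameters; its value is the executor ID from the executors table.
(6) Use Schedule to set when the job runs, for example once a day at 02:00
(7) View the job log
(8) Log details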
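For the daily schedule in step (6), Azkaban's schedule panel takes a Quartz-style cron expression. The sketch below builds one for a fixed time each day. The function name is hypothetical, and the six-field layout (seconds, minutes, hours, day of month, month, day of week) is the Quartz convention, so check it against the fields your Azkaban version displays.

```shell
#!/usr/bin/env bash
# Sketch: build a Quartz-style cron expression that fires once a day.
# quartz_daily is an illustrative helper name.
quartz_daily() {
  local hour="$1" minute="${2:-0}"
  # Fields: seconds minutes hours day-of-month month day-of-week.
  # '?' leaves the day of month unspecified; '*' matches every month and weekday.
  printf '0 %s %s ? * *\n' "$minute" "$hour"
}

# Usage: every day at 02:00
#   quartz_daily 2
```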