Building Azkaban
-
Prerequisites: JDK 1.8 or later, and git
-
Download: https://github.com/azkaban/azkaban/releases
-
Upload the archive to the server
-
Extract it
-
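Configure the Gradle wrapper
Edit the distribution path (distributionUrl) with vi gradle-wrapper.properties, and place gradle-4.6-all.zip in the same directory; otherwise the wrapper downloads the distribution itself, which takes a long time.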
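The edit can also be scripted instead of done by hand in vi. The following is a minimal sketch, not part of the Azkaban build: the function name use_local_gradle_dist is made up for illustration, and it assumes that the wrapper resolves a distributionUrl without a scheme relative to the directory that holds gradle-wrapper.properties. If that does not hold for your wrapper version, a file:// URL with an absolute path is the alternative.

```shell
#!/bin/sh
# Point the Gradle wrapper at a local distribution archive instead of the
# remote download URL. Hypothetical helper; both arguments are illustrative.
#   $1: path to gradle-wrapper.properties
#   $2: file name of the archive placed next to it (e.g. gradle-4.6-all.zip)
use_local_gradle_dist() {
  props_file="$1"
  zip_name="$2"
  [ -f "$props_file" ] || { echo "not found: $props_file" >&2; return 1; }
  # Keep a backup, then rewrite only the distributionUrl line.
  cp "$props_file" "$props_file.bak" || return 1
  sed "s|^distributionUrl=.*|distributionUrl=${zip_name}|" \
    "$props_file.bak" > "$props_file"
}

# Example, run from the Azkaban source root (path is an assumption):
#   use_local_gradle_dist gradle/wrapper/gradle-wrapper.properties gradle-4.6-all.zip
```

The original file is kept as gradle-wrapper.properties.bak, so the change is easy to undo.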
-
Go to the source directory and run the build: ./gradlew build installDist -x test
Installing azkaban-solo-server
-
cd azkaban-3.81.0/azkaban-solo-server/build/distributions
-rw-rw-r-- 1 hadoop hadoop 36311594 Nov 21 20:21 azkaban-solo-server-0.1.0-SNAPSHOT.tar.gz
-rw-rw-r-- 1 hadoop hadoop 36450077 Nov 21 20:21 azkaban-solo-server-0.1.0-SNAPSHOT.zip
-
Extract either archive into the target directory: tar -zxvf azkaban-solo-server-0.1.0-SNAPSHOT.tar.gz -C ~/app/azkaban/azkaban-solo-server
-
cd ~/app/azkaban/azkaban-solo-server
-
Start the server from this directory with bin/start-solo.sh (it does not work if you cd into bin and run ./start-solo.sh). The default web port is 8081.
-
Edit the configuration file
vi conf/azkaban.properties
    azkaban.name=flyingjim
    azkaban.label=Azkaban
    # set the time zone
    default.timezone.id=Asia/Shanghai
-
Change the user names and login passwords
vi conf/azkaban-users.xml
    <azkaban-users>
      <user groups="azkaban" password="azkaban" roles="admin" username="azkaban"/>
      <user password="metrics" roles="metrics" username="metrics"/>
      <user password="feidata" roles="admin" username="feidata"/>
      <role name="admin" permissions="ADMIN"/>
      <role name="metrics" permissions="METRICS"/>
    </azkaban-users>
-
Restart Azkaban
Configuring dependencies
-
Command jobs
    config:
      job_time: 2020-02-24
    nodes:
      - name: statReport
        type: command
        config:
          command: sh stat.sh ${job_time}
        dependsOn:
          - etlReport
      - name: etlReport
        type: command
        config:
          command: sh etl.sh ${job_time}
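To turn the flow definition above into something that can be uploaded, it has to sit in a project directory next to a Flow 2.0 project file. The sketch below writes both files. The helper name write_flow_project and the file names report.flow and flow20.project are illustrative choices; the azkaban-flow-version: 2.0 marker follows the Flow 2.0 documentation linked in these notes.

```shell
#!/bin/sh
# Write a minimal Flow 2.0 project into a directory. Hypothetical helper;
# the file names below are illustrative.
#   $1: target directory (created if missing)
write_flow_project() {
  dir="$1"
  mkdir -p "$dir" || return 1
  # Marks the project as a Flow 2.0 project.
  printf 'azkaban-flow-version: 2.0\n' > "$dir/flow20.project"
  # Quoted heredoc delimiter: ${job_time} must reach Azkaban unexpanded,
  # so the shell is not allowed to substitute it here.
  cat > "$dir/report.flow" <<'EOF'
config:
  job_time: 2020-02-24
nodes:
  - name: statReport
    type: command
    config:
      command: sh stat.sh ${job_time}
    dependsOn:
      - etlReport
  - name: etlReport
    type: command
    config:
      command: sh etl.sh ${job_time}
EOF
}

# Example:
#   write_flow_project ~/flows/report
```

After that, stat.sh and etl.sh go into the same directory, and the directory contents are zipped and uploaded through the web UI.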
The global setting job_time does not have to be hard-coded; it can also be passed in dynamically.
Create Flows
Reference: https://azkaban.readthedocs.io/en/latest/createFlows.html#flow-2-0-basics
Installing and deploying the Multi Executor Server
https://azkaban.readthedocs.io/en/latest/getStarted.html#getting-started-with-the-multi-executor-server
-
Run the following commands:
    tar -zxvf azkaban-exec-server/build/distributions/azkaban-exec-server-0.1.0-SNAPSHOT.tar.gz -C ~/app/azkaban/
    tar -zxvf azkaban-web-server/build/distributions/azkaban-web-server-0.1.0-SNAPSHOT.tar.gz
-
Edit the azkaban-exec-server configuration file
vi conf/azkaban.properties
    azkaban.name=flyingjim
    azkaban.label=Azkaban
    # set the time zone
    default.timezone.id=Asia/Shanghai
    # use MySQL as the storage database
    database.type=mysql
    mysql.port=6619
    mysql.host=localhost
    mysql.database=azkaban
    mysql.user=root
    mysql.password=mysqladminroot
    mysql.numconnections=100
    # Azkaban Executor settings
    executor.maxThreads=50
    executor.flow.threads=30
-
Edit the azkaban-web-server configuration file
vi conf/azkaban.properties
    azkaban.name=flyingjim
    azkaban.label=Azkaban
    # set the time zone
    default.timezone.id=Asia/Shanghai
    # use MySQL as the storage database
    database.type=mysql
    mysql.port=6619
    mysql.host=localhost
    mysql.database=azkaban
    mysql.user=root
    mysql.password=mysqladminroot
    mysql.numconnections=100
    # Azkaban Executor settings
    executor.maxThreads=50
    executor.flow.threads=30
-
Load /home/hadoop/software/azkaban-3.81.0/azkaban-db/build/sql/create-all-sql-0.1.0-SNAPSHOT.sql into the azkaban database in MySQL; it creates 35 tables in total.
On MySQL 5.6, remember to change the following settings first:
    mysql> set global innodb_file_format = BARRACUDA;
    mysql> set global innodb_large_prefix = ON;
    mysql> show variables like 'innodb_large_prefix';
    mysql> show variables like 'innodb_file_format';
Only then do the statements below, which create two of the tables, succeed:
    CREATE TABLE execution_jobs (
      exec_id INT NOT NULL,
      project_id INT NOT NULL,
      version INT NOT NULL,
      flow_id VARCHAR(128) NOT NULL,
      job_id VARCHAR(512) NOT NULL,
      attempt INT,
      start_time BIGINT,
      end_time BIGINT,
      status TINYINT,
      input_params LONGBLOB,
      output_params LONGBLOB,
      attachments LONGBLOB,
      PRIMARY KEY (exec_id, job_id, flow_id, attempt)
    ) row_format=DYNAMIC;
    CREATE TABLE execution_logs (
      exec_id INT NOT NULL,
      name VARCHAR(640),
      attempt INT,
      enc_type TINYINT,
      start_byte INT,
      end_byte INT,
      log LONGBLOB,
      upload_time BIGINT,
      PRIMARY KEY (exec_id, name, attempt, start_byte)
    ) row_format=DYNAMIC;
Pitfall: starting the web server at this point fails with an error.
The official documentation contains this note:
After that, remember to activate the executor by calling:
cd azkaban-exec-server/build/install/azkaban-exec-server
curl -G "localhost:$(<./executor.port)/executor?action=activate" && echo
curl "http://hadoop02:53502/executor?action=activate"
The port number in the second command has to be looked up in the executors table.
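The activation call can be wrapped in a small helper so that the port is never typed by hand. This is only a sketch: it reads the port from the executor.port file used in the documentation snippet rather than from the executors table, and the function name executor_activation_url is made up for illustration.

```shell
#!/bin/sh
# Build the executor activation URL from a host name and the port file.
# Hypothetical helper.
#   $1: executor host
#   $2: path to the executor.port file
executor_activation_url() {
  host="$1"
  port_file="$2"
  [ -r "$port_file" ] || { echo "cannot read $port_file" >&2; return 1; }
  # Strip the trailing newline and any other whitespace.
  port="$(tr -d '[:space:]' < "$port_file")"
  # Refuse anything that is not a plain number.
  case "$port" in
    ''|*[!0-9]*) echo "unexpected port value: $port" >&2; return 1 ;;
  esac
  printf 'http://%s:%s/executor?action=activate\n' "$host" "$port"
}

# Example, run from the executor installation directory:
#   curl -G "$(executor_activation_url localhost ./executor.port)" && echo
```

Because the port is read from the file each time, the helper can be called again after the executor restarts, when the port may differ.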
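Configuring the global variable hadoop.home
When Azkaban schedules MapReduce jobs, the job command uses the full path to the hadoop executable. The command should instead be submitted as ${hadoop.home}/bin/hadoop. How, then, must hadoop.home be configured for this to take effect?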
-
Remove the existing jobtypes
    cd /home/hadoop/app/azkaban/azkaban-solo-server/plugins/jobtypes
    rm -rf *
-
Copy commonprivate.properties and common.properties from the jobtypes directory of the source tree into the current directory
    cp /home/hadoop/software/azkaban-3.81.0/az-hadoop-jobtype-plugin/src/jobtypes/commonprivate.properties /home/hadoop/app/azkaban/azkaban-solo-server/plugins/jobtypes
    cp /home/hadoop/software/azkaban-3.81.0/az-hadoop-jobtype-plugin/src/jobtypes/common.properties /home/hadoop/app/azkaban/azkaban-solo-server/plugins/jobtypes
Then edit the files and set hadoop.home:
    vi commonprivate.properties
    vi common.properties
    # hadoop
    hadoop.home=/home/hadoop/app/hadoop
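If the same setting has to be applied on several machines, the manual vi step can be replaced by a small script. The sketch below is an assumption-laden illustration, not an Azkaban tool: set_property is a hypothetical helper, and writing hadoop.home into both files is a guess, since these notes do not say whether one file would be enough.

```shell
#!/bin/sh
# Set key=value in a Java properties file: drop any existing line for the
# key, then append the new one. Hypothetical helper.
#   $1: properties file (created if missing)
#   $2: key
#   $3: value
set_property() {
  file="$1"
  key="$2"
  value="$3"
  touch "$file" || return 1
  # Escape regex metacharacters so "hadoop.home" matches only itself.
  key_re="$(printf '%s' "$key" | sed 's/[][\.*^$]/\\&/g')"
  # grep -v exits non-zero when nothing is left over; that is not an error.
  grep -v "^${key_re}=" "$file" > "$file.tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$file.tmp"
  mv "$file.tmp" "$file"
}

# Example, run in plugins/jobtypes (both files are an assumption):
#   set_property commonprivate.properties hadoop.home /home/hadoop/app/hadoop
#   set_property common.properties hadoop.home /home/hadoop/app/hadoop
```

Appending instead of editing in place moves the key to the end of the file, which is harmless for a properties file and avoids quoting problems with characters in the value.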