Azkaban Examples

This article walks through running different kinds of workflows with Azkaban: a single job, multiple jobs with dependencies, an HDFS task, and a MapReduce task. In each case we create .job files, package them into a zip archive, and upload and run them through the Azkaban web UI, illustrating how Azkaban schedules and manages tasks.

1. Command type: single-job workflow example
Locally, create a text file named command.job (the .job extension is required) with the following content:

#command.job
type=command
command=echo 'hello'

Then package the file into a zip archive.
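The two steps above can be sketched in the shell (assuming the `zip` tool is installed):

```shell
# Write the single-job description file
cat > command.job <<'EOF'
#command.job
type=command
command=echo 'hello'
EOF

# Package it into a zip archive ready for upload to the Azkaban web UI
zip -q command.zip command.job
```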
Log in to https://hdp-3:8443 and use the Azkaban web UI to create a project and upload the job zip.
Job 1 done!
2. Command type: multi-job workflow example
Create descriptions for multiple jobs with dependencies.
First, create a text file foo.job:

type=command
command=echo foo

Then create another text file, bar.job, which depends on foo.job:

type=command
dependencies=foo
command=echo bar

Package the two files into a single .zip archive.
Following the same steps as above, create a project in the Azkaban web UI and upload the job zip.
Result:
3. HDFS task
Create the job description file fs.job:

type=command
command=hadoop fs -mkdir /azkaban

Create a project in the Azkaban web UI, upload the job zip, and start the job.
Check the result:
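Note that a plain `hadoop fs -mkdir` fails if the directory already exists, so re-running this flow would fail. A slightly more robust variant of fs.job (a sketch) adds the `-p` flag to make the mkdir idempotent:

```properties
# fs.job — hypothetical variant: -p succeeds even when /azkaban already exists
type=command
command=hadoop fs -mkdir -p /azkaban
```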
4. MapReduce task
MR jobs can also be run with the command job type.
Preparation: create an input directory under /azkaban on HDFS and upload the file azkabanmrwc.data into it.
On hdp-3, create azkabanmrwc.data in root's home directory:

[root@hdp-3 ~]# vi azkabanmrwc.data

Content:

1 blue 20
2 yellow 25
3 red 18
4 blacke 10
5 orange 15
6 white 23
7 green 9

Method 1: run the commands directly on Linux.
Create the input directory on HDFS:

hadoop fs -mkdir /azkaban/input

Upload azkabanmrwc.data into input:

[root@hdp-3 ~]# hadoop fs -put /root/azkabanmrwc.data /azkaban/input/

Method 2: do the same with Azkaban jobs.
Create shangchuan1.job (creates the input directory):

type=command
command=hadoop fs -mkdir /azkaban/input

新建shuangchuan2.job(功能:上传文件azkabanmrwc.data )

type=command
dependencies=shangchuan1
command=hadoop fs -put /root/azkabanmrwc.data /azkaban/input/

Package them into a zip archive, create a project in the Azkaban web UI, upload the job zip, and run it.

Create the job description file plus the MR program jar (this example uses the examples jar that ships with Hadoop).
mrwc.job:

type=command
command=hadoop jar hadoop-mapreduce-examples-2.8.1.jar wordcount /azkaban/input/azkabanmrwc.data /azkaban/output

Package mrwc.job together with hadoop-mapreduce-examples-2.8.1.jar into one zip archive.
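One caveat: the MR job will fail on a re-run if /azkaban/output already exists, because FileOutputFormat refuses to overwrite its output directory. A hypothetical clean-up job (a sketch; the name qingli is invented) could be added to the flow and listed in mrwc.job's dependencies:

```properties
# qingli.job — hypothetical clean-up job; -f suppresses the error when the path is absent
type=command
command=hadoop fs -rm -r -f /azkaban/output
```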
Run it on Azkaban. While it runs, the job log shows the MapReduce progress:
Successful result:

05-12-2019 14:13:26 CST mrwc INFO - Starting job mrwc at 1575526406250
05-12-2019 14:13:26 CST mrwc INFO - Building command job executor. 
05-12-2019 14:13:26 CST mrwc INFO - 1 commands to execute.
05-12-2019 14:13:26 CST mrwc INFO - Command: hadoop jar hadoop-mapreduce-examples-2.8.1.jar wordcount /azkaban/input/azkabanmrwc.data /azkaban/output
05-12-2019 14:13:26 CST mrwc INFO - Environment variables: {JOB_OUTPUT_PROP_FILE=/root/apps/azkaban/azkaban-executor-2.5.0/executions/9/mrwc_output_4723153487441072185_tmp, JOB_PROP_FILE=/root/apps/azkaban/azkaban-executor-2.5.0/executions/9/mrwc_props_3535293413267978868_tmp, JOB_NAME=mrwc}
05-12-2019 14:13:26 CST mrwc INFO - Working directory: /root/apps/azkaban/azkaban-executor-2.5.0/executions/9
05-12-2019 14:13:28 CST mrwc ERROR - 19/12/05 14:13:28 INFO client.RMProxy: Connecting to ResourceManager at hdp-1/192.168.150.151:8032
05-12-2019 14:13:29 CST mrwc ERROR - 19/12/05 14:13:29 INFO input.FileInputFormat: Total input files to process : 1
05-12-2019 14:13:29 CST mrwc ERROR - 19/12/05 14:13:29 INFO mapreduce.JobSubmitter: number of splits:1
05-12-2019 14:13:30 CST mrwc ERROR - 19/12/05 14:13:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1575510666483_0005
05-12-2019 14:13:30 CST mrwc ERROR - 19/12/05 14:13:30 INFO impl.YarnClientImpl: Submitted application application_1575510666483_0005
05-12-2019 14:13:30 CST mrwc ERROR - 19/12/05 14:13:30 INFO mapreduce.Job: The url to track the job: http://hdp-1:8088/proxy/application_1575510666483_0005/
05-12-2019 14:13:30 CST mrwc ERROR - 19/12/05 14:13:30 INFO mapreduce.Job: Running job: job_1575510666483_0005
05-12-2019 14:13:37 CST mrwc ERROR - 19/12/05 14:13:37 INFO mapreduce.Job: Job job_1575510666483_0005 running in uber mode : false
05-12-2019 14:13:37 CST mrwc ERROR - 19/12/05 14:13:37 INFO mapreduce.Job:  map 0% reduce 0%
05-12-2019 14:13:45 CST mrwc ERROR - 19/12/05 14:13:45 INFO mapreduce.Job:  map 100% reduce 0%
05-12-2019 14:13:51 CST mrwc ERROR - 19/12/05 14:13:51 INFO mapreduce.Job:  map 100% reduce 100%
05-12-2019 14:13:51 CST mrwc ERROR - 19/12/05 14:13:51 INFO mapreduce.Job: Job job_1575510666483_0005 completed successfully
05-12-2019 14:13:51 CST mrwc ERROR - 19/12/05 14:13:51 INFO mapreduce.Job: Counters: 49
05-12-2019 14:13:52 CST mrwc ERROR - 	File System Counters
05-12-2019 14:13:52 CST mrwc ERROR - 		FILE: Number of bytes read=208
05-12-2019 14:13:52 CST mrwc ERROR - 		FILE: Number of bytes written=272765
05-12-2019 14:13:52 CST mrwc ERROR - 		FILE: Number of read operations=0
05-12-2019 14:13:52 CST mrwc ERROR - 		FILE: Number of large read operations=0
05-12-2019 14:13:52 CST mrwc ERROR - 		FILE: Number of write operations=0
05-12-2019 14:13:52 CST mrwc ERROR - 		HDFS: Number of bytes read=189
05-12-2019 14:13:52 CST mrwc ERROR - 		HDFS: Number of bytes written=118
05-12-2019 14:13:52 CST mrwc ERROR - 		HDFS: Number of read operations=6
05-12-2019 14:13:52 CST mrwc ERROR - 		HDFS: Number of large read operations=0
05-12-2019 14:13:52 CST mrwc ERROR - 		HDFS: Number of write operations=2
05-12-2019 14:13:52 CST mrwc ERROR - 	Job Counters 
05-12-2019 14:13:52 CST mrwc ERROR - 		Launched map tasks=1
05-12-2019 14:13:52 CST mrwc ERROR - 		Launched reduce tasks=1
05-12-2019 14:13:52 CST mrwc ERROR - 		Rack-local map tasks=1
05-12-2019 14:13:52 CST mrwc ERROR - 		Total time spent by all maps in occupied slots (ms)=4526
05-12-2019 14:13:52 CST mrwc ERROR - 		Total time spent by all reduces in occupied slots (ms)=3723
05-12-2019 14:13:52 CST mrwc ERROR - 		Total time spent by all map tasks (ms)=4526
05-12-2019 14:13:52 CST mrwc ERROR - 		Total time spent by all reduce tasks (ms)=3723
05-12-2019 14:13:52 CST mrwc ERROR - 		Total vcore-milliseconds taken by all map tasks=4526
05-12-2019 14:13:52 CST mrwc ERROR - 		Total vcore-milliseconds taken by all reduce tasks=3723
05-12-2019 14:13:52 CST mrwc ERROR - 		Total megabyte-milliseconds taken by all map tasks=4634624
05-12-2019 14:13:52 CST mrwc ERROR - 		Total megabyte-milliseconds taken by all reduce tasks=3812352
05-12-2019 14:13:52 CST mrwc ERROR - 	Map-Reduce Framework
05-12-2019 14:13:52 CST mrwc ERROR - 		Map input records=7
05-12-2019 14:13:52 CST mrwc ERROR - 		Map output records=21
05-12-2019 14:13:52 CST mrwc ERROR - 		Map output bytes=160
05-12-2019 14:13:52 CST mrwc ERROR - 		Map output materialized bytes=208
05-12-2019 14:13:52 CST mrwc ERROR - 		Input split bytes=113
05-12-2019 14:13:52 CST mrwc ERROR - 		Combine input records=21
05-12-2019 14:13:52 CST mrwc ERROR - 		Combine output records=21
05-12-2019 14:13:52 CST mrwc ERROR - 		Reduce input groups=21
05-12-2019 14:13:52 CST mrwc ERROR - 		Reduce shuffle bytes=208
05-12-2019 14:13:52 CST mrwc ERROR - 		Reduce input records=21
05-12-2019 14:13:52 CST mrwc ERROR - 		Reduce output records=21
05-12-2019 14:13:52 CST mrwc ERROR - 		Spilled Records=42
05-12-2019 14:13:52 CST mrwc ERROR - 		Shuffled Maps =1
05-12-2019 14:13:52 CST mrwc ERROR - 		Failed Shuffles=0
05-12-2019 14:13:52 CST mrwc ERROR - 		Merged Map outputs=1
05-12-2019 14:13:52 CST mrwc ERROR - 		GC time elapsed (ms)=134
05-12-2019 14:13:52 CST mrwc ERROR - 		CPU time spent (ms)=1130
05-12-2019 14:13:52 CST mrwc ERROR - 		Physical memory (bytes) snapshot=291848192
05-12-2019 14:13:52 CST mrwc ERROR - 		Virtual memory (bytes) snapshot=4161122304
05-12-2019 14:13:52 CST mrwc ERROR - 		Total committed heap usage (bytes)=139329536
05-12-2019 14:13:52 CST mrwc ERROR - 	Shuffle Errors
05-12-2019 14:13:52 CST mrwc ERROR - 		BAD_ID=0
05-12-2019 14:13:52 CST mrwc ERROR - 		CONNECTION=0
05-12-2019 14:13:52 CST mrwc ERROR - 		IO_ERROR=0
05-12-2019 14:13:52 CST mrwc ERROR - 		WRONG_LENGTH=0
05-12-2019 14:13:52 CST mrwc ERROR - 		WRONG_MAP=0
05-12-2019 14:13:52 CST mrwc ERROR - 		WRONG_REDUCE=0
05-12-2019 14:13:52 CST mrwc ERROR - 	File Input Format Counters 
05-12-2019 14:13:52 CST mrwc ERROR - 		Bytes Read=76
05-12-2019 14:13:52 CST mrwc ERROR - 	File Output Format Counters 
05-12-2019 14:13:52 CST mrwc ERROR - 		Bytes Written=118
05-12-2019 14:13:52 CST mrwc INFO - Process completed successfully in 26 seconds.
05-12-2019 14:13:52 CST mrwc INFO - Finishing job mrwc at 1575526432377 with status SUCCEEDED

Now the /azkaban directory on HDFS contains an output directory, among other things.
Back on hdp-3, check the output:

[root@hdp-3 ~]# hadoop fs -cat /azkaban/output/part-r-00000
1       1
10      1
15      1
18      1
2       1
20      1
23      1
25      1
3       1
4       1
5       1
6       1
7       1
9       1
blacke  1
blue    1
green   1
orange  1
red     1
white   1
yellow  1
[root@hdp-3 ~]#

As you can see, the input has been split on whitespace and each token counted.
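The same counts can be reproduced locally without MapReduce (a sketch using standard Unix tools on the sample data):

```shell
# Recreate the sample input locally
cat > azkabanmrwc.data <<'EOF'
1 blue 20
2 yellow 25
3 red 18
4 blacke 10
5 orange 15
6 white 23
7 green 9
EOF

# Split on whitespace, then count each distinct token, just as wordcount does
tr -s ' ' '\n' < azkabanmrwc.data | sort | uniq -c | sort -k2
```

There are 7 lines of 3 tokens each, so 21 tokens in total, which matches the "Map output records=21" counter in the Azkaban log above.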
