[Big Data: Hadoop] Part 24: The Yarn Tool Interface

Goal: run a self-written program on the cluster and pass parameters to it dynamically.

hadoop jar wc.jar com.study.mapreduce.wordcount.WordCountDriver -Dmapreduce.job.queuename=root.test /input /output

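The argument array passed to the program has three elements: -Dmapreduce.job.queuename=root.test, /input, and /output. The program, however, takes the first and second elements of that array as its input and output paths, so the -D option would be read as the input path. To change parameters dynamically, the program has to implement the Tool interface.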
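To make the index problem concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The class `ArgsProblemDemo` and its two helpers are hypothetical; they only imitate a driver that reads its paths from fixed positions in the argument array.

```java
// Hypothetical stand-in for a driver that reads its paths from fixed positions.
class ArgsProblemDemo {
    // Imitates `new Path(args[0])` in a traditional driver.
    static String inputPath(String[] args) {
        return args[0];
    }

    // Imitates `new Path(args[1])` in a traditional driver.
    static String outputPath(String[] args) {
        return args[1];
    }

    public static void main(String[] ignored) {
        // The three arguments from the command above, in the order main() receives them.
        String[] args = {"-Dmapreduce.job.queuename=root.test", "/input", "/output"};
        System.out.println("input  = " + inputPath(args));
        System.out.println("output = " + outputPath(args));
    }
}
```

In this sketch the -D option lands in the input-path slot and /input in the output-path slot, which is the mismatch the Tool interface is meant to solve.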

Steps:
(1) Create a Maven project named YarnDemo and edit pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.study.hadoop</groupId>
    <artifactId>yarn_tool_test</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.1.3</version>
        </dependency>
    </dependencies>
</project>

(2) Create the package com.study.yarn.

(3) Create the class WordCount and implement the Tool interface:

package com.study.yarn;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;

import java.io.IOException;

public class WordCount implements Tool {
    // Contains the mapper and the reducer; the traditional driver logic goes into run()
    private Configuration conf;

    @Override
    public int run(String[] args) throws Exception {

        // Build the job from the Configuration injected through setConf()
        Job job = Job.getInstance(conf);

        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // args holds only the arguments handed to run(): input path, then output path
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    @Override
    public Configuration getConf() {
        return conf;
    }

    // mapper
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private Text outK = new Text();
        private IntWritable outV = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {

            String line = value.toString();
            // Split on a single space; split("") would break the line into single characters
            String[] words = line.split(" ");

            for (String word : words) {
                outK.set(word);

                // Emit (word, 1) for every word in the line
                context.write(outK, outV);
            }
        }
    }
    
    
    // reducer
    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable outV = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {

            // Sum all the counts emitted for this word
            int sum = 0;

            for (IntWritable value : values) {
                sum += value.get();
            }
            outV.set(sum);

            context.write(key, outV);
        }
    }
}
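The map and reduce logic can be tried locally without a cluster. The following is only a sketch in plain Java, assuming words are separated by single spaces; the class `LocalWordCount` and its `count` method are hypothetical and use a `TreeMap` in place of Hadoop's shuffle and reduce phases.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local imitation of the word count; not part of the Hadoop job.
class LocalWordCount {
    static Map<String, Integer> count(String[] lines) {
        // TreeMap keeps keys sorted, similar to how reducer input arrives sorted by key.
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                // "Map" step: each word contributes 1; "reduce" step: sum per word.
                totals.merge(word, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] ignored) {
        System.out.println(count(new String[]{"a b a", "b c"}));
    }
}
```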

(4) Create WordCountDriver:

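package com.study.yarn;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import java.util.Arrays;

public class WordCountDriver {
    // Driver for the Tool implementations
    private static Tool tool;

    public static void main(String[] args) throws Exception {
        // 1. Create the configuration
        Configuration conf = new Configuration();

        // 2. Check which Tool implementation was requested
        switch (args[0]) {
            case "wordcount":
                tool = new WordCount();
                break;
            default:
                throw new RuntimeException("No such tool: " + args[0]);
        }
        // 3. Run the program through ToolRunner
        // Arrays.copyOfRange copies part of the old array into a new one. Drop only args[0]
        // (the tool name) so that ToolRunner still sees generic options such as -D; it applies
        // them to conf and passes the remaining arguments (input and output paths) to run().
        // Copying only the last two elements would discard the -D option.
        int run = ToolRunner.run(conf, tool, Arrays.copyOfRange(args, 1, args.length));

        System.exit(run);
    }
}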
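The effect of `Arrays.copyOfRange` on the argument array can be checked in isolation. This sketch compares two possible slices; the class `SliceDemo` and its method names are hypothetical, and the sample arguments are the ones used in the commands of this post.

```java
import java.util.Arrays;

// Hypothetical helper that compares two ways of slicing the driver's arguments.
class SliceDemo {
    // Drops only the leading tool name; generic options such as -D are kept.
    static String[] dropToolName(String[] args) {
        return Arrays.copyOfRange(args, 1, args.length);
    }

    // Keeps only the final two elements (the input and output paths).
    static String[] lastTwo(String[] args) {
        return Arrays.copyOfRange(args, args.length - 2, args.length);
    }

    public static void main(String[] ignored) {
        String[] args = {"wordcount", "-Dmapreduce.job.queuename=root.test", "/input", "/output1"};
        System.out.println(Arrays.toString(dropToolName(args)));
        System.out.println(Arrays.toString(lastTwo(args)));
    }
}
```

Only the first slice still contains the -D option, so only that slice gives `ToolRunner.run` a chance to apply it to the `Configuration`.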

(5) Package the project as a jar and copy it to the cluster.

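(6) Go to the directory that holds the jar and submit it to the cluster. There are three arguments here: the first selects which Tool to create, and the second and third are the input and output directories. The job runs normally.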

yarn jar YarnDemo.jar com.study.yarn.WordCountDriver wordcount /input /output

(7) Add an option after wordcount, which makes four arguments in total:

yarn jar YarnDemo.jar com.study.yarn.WordCountDriver wordcount -Dmapreduce.job.queuename=root.test /input /output1
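
For the -D option to take effect, it has to reach `ToolRunner.run`, which separates generic options from the tool's own arguments before calling `run()`. The sketch below imitates that separation in plain Java for the `-Dkey=value` form only. `MiniOptionsParser` is a hypothetical, heavily simplified class and not Hadoop's actual parser, which supports more options and forms.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical imitation of generic-option handling; NOT Hadoop's parser.
class MiniOptionsParser {
    // Properties collected from -Dkey=value arguments.
    final Map<String, String> properties = new LinkedHashMap<>();
    // Everything else, in the original order (what a tool's run() would receive).
    final List<String> remaining = new ArrayList<>();

    MiniOptionsParser(String[] args) {
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                // Text between "-D" and "=" is the key; text after "=" is the value.
                properties.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                remaining.add(arg);
            }
        }
    }

    public static void main(String[] ignored) {
        // Arguments after the tool name "wordcount" has been removed by the driver.
        String[] toolArgs = {"-Dmapreduce.job.queuename=root.test", "/input", "/output1"};
        MiniOptionsParser parsed = new MiniOptionsParser(toolArgs);
        System.out.println(parsed.properties);
        System.out.println(parsed.remaining);
    }
}
```

In this sketch the queue name ends up in `properties`, and the two paths are left at positions 0 and 1 of `remaining`, which matches how `run()` reads `args[0]` and `args[1]`.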