Setting Up a MapReduce Development Environment
1. Download hadoop-3.1.4.tar.gz from the official Apache Hadoop site and extract it.
Download hadoop.dll and winutils.exe from:
https://github.com/ordinaryload/Hadoop-tools
Copy winutils.exe into hadoop-3.1.4/bin, and copy hadoop.dll into the Windows/System32 directory.
2. Write the program: create the Mapper, Reducer, and Driver classes as shown in the Bilibili video.
Alternatively, follow the sample source code in this blog post:
https://www.cnblogs.com/xingluo/p/9512961.html
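The Mapper/Reducer pair in this example implements classic word count. Its core logic can be sketched in plain Java without the Hadoop runtime (the class and method names below are illustrative, not Hadoop APIs): the map phase emits a (word, 1) pair per token, and the reduce phase groups by word and sums the counts.

```java
import java.util.*;

public class WordCountSketch {
    // Map phase: emit a (word, 1) pair for each whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tok = new StringTokenizer(line);
        while (tok.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tok.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle + reduce phase: group pairs by key and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {hello=2, world=1}
        System.out.println(reduce(map("hello world hello")));
    }
}
```

In the real job, Hadoop performs the grouping between the map and reduce phases; the Text/IntWritable types in the Driver below correspond to the String/Integer pairs here.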
3. Add a NativeIO class. Because the bundled version does not work on Windows 10, you need to create your own copy in your project.
See this blog post for the steps: https://blog.csdn.net/weixin_42229056/article/details/82686172
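As an illustration of what the linked post does: the usual fix is to copy Hadoop's NativeIO source into your own project under the same package and change the Windows access check so it always passes. A minimal stand-in showing the shape of that change (hypothetical class name; this is not the Hadoop API itself):

```java
// Illustrative sketch only. The real fix is to copy Hadoop's
// org.apache.hadoop.io.nativeio.NativeIO source into your project under the
// same package name and modify Windows.access(...) so the permission probe
// that fails on Windows 10 always succeeds, as the linked post describes.
public class NativeIOAccessSketch {
    // Stand-in for NativeIO.Windows.access(String path, AccessRight desiredAccess)
    public static boolean access(String path, Object desiredAccess) {
        return true; // bypass the native access check
    }

    public static void main(String[] args) {
        System.out.println(access("C:\\tmp", null)); // prints "true"
    }
}
```

Because your project's copy sits earlier on the classpath than the version inside the Hadoop jars, the JVM loads the patched class instead.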
4. Run the Driver class's main method directly.
Source code
Driver:
package com.weitao.mr.wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class WordCountDriver {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        System.setProperty("hadoop.home.dir", "C:\\Users\\asus\\Desktop\\hadoop-3.1.4\\hadoop-3.1.4");
        // Get the Job object
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        // Set the jar location
        job.setJarByClass(WordCountDriver.class);
        // Wire up the Mapper and Reducer
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        // Set the key/value types of the map output
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // Set the key/value types of the final output
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Set the input and output paths
        FileInputFormat.setInputPaths(job,