Hadoop Development and Debugging (Windows 10)

How to develop and debug Hadoop locally with IntelliJ IDEA

Software versions used

IntelliJ IDEA version: 2016.3.2

Hadoop version: 2.6.5

Dataset: **Hadoop: The Definitive Guide** (the 1901 and 1902 weather records from the book's max-temperature sample code)

The source code:

hadooptestMapper

    package cn.zcs.zzuli;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    import java.io.IOException;

    /**
     * Created by 张超帅 on 2018/6/23.
     */
    public class hadooptestMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final int MISSING = 9999;

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String year = line.substring(15, 19);
            int airTemperature;
            if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
                airTemperature = Integer.parseInt(line.substring(88, 92));
            } else {
                airTemperature = Integer.parseInt(line.substring(87, 92));
            }
            String quality = line.substring(92, 93);
            if (airTemperature != MISSING && quality.matches("[01459]")) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }
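
To sanity-check the mapper without running a whole job, a small unit test in the style of the book's MRUnit examples can be used. This is a sketch, assuming MRUnit (org.apache.hadoop.mrunit) and JUnit are on the classpath; the record is a synthetic line built so that the year lands in columns 15-18, the sign in column 87, the temperature in columns 88-91, and the quality code in column 92:

    package cn.zcs.zzuli;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Test;

    public class hadooptestMapperTest {
        @Test
        public void parsesValidRecord() throws Exception {
            String record =
                    "006701199099999"                      // columns 0-14: filler
                    + "1901"                               // columns 15-18: year
                    + "9999999999" + "9999999999"          // columns 19-86:
                    + "9999999999" + "9999999999"          // 68 filler characters
                    + "9999999999" + "9999999999"
                    + "99999999"
                    + "+0056"                              // columns 87-91: sign + temperature (5.6 C in tenths)
                    + "1";                                 // column 92: quality code
            new MapDriver<LongWritable, Text, Text, IntWritable>()
                    .withMapper(new hadooptestMapper())
                    .withInput(new LongWritable(0), new Text(record))
                    .withOutput(new Text("1901"), new IntWritable(56))
                    .runTest();
        }
    }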

hadooptestReducer

    package cn.zcs.zzuli;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    import java.io.IOException;

    /**
     * Created by 张超帅 on 2018/6/23.
     */
    public class hadooptestReducer extends Reducer<Text, IntWritable, Text, IntWritable>{
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int  maxValue = Integer.MIN_VALUE;
            for(IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
            }
            context.write(key, new IntWritable(maxValue));
        }
    }
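
Because taking a maximum is commutative and associative, the same reducer class can also serve as a combiner, which cuts down the data shuffled from the map tasks. The one-line driver addition below is a suggestion, not part of the original setup:

    // In the driver, alongside setMapperClass/setReducerClass:
    job.setCombinerClass(hadooptestReducer.class);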

hadooptest

    package cn.zcs.zzuli;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    import java.io.File;

    /**
     * Created by 张超帅 on 2018/6/23.
     */
    public class hadooptest {
        public static void main(String[] args) throws Exception {
            if (args.length != 2) {
                System.err.println("Usage: MaxTemperature <input path> <output path>");
                System.exit(-1);
            }
            Job job = Job.getInstance(new Configuration());
            job.setJarByClass(hadooptest.class);
            job.setJobName("Max temperature");

        File inputdir = new File(args[0]);
        File[] files = inputdir.listFiles();
        // Register every file in the local input directory as an input path
        for (File file : files) {
            FileInputFormat.addInputPath(job, new Path(file.toString()));
        }


            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            job.setMapperClass(hadooptestMapper.class);
            job.setReducerClass(hadooptestReducer.class);

        job.setOutputKeyClass(Text.class);              // output key/value types for the job (also used for map output unless set separately)
        job.setOutputValueClass(IntWritable.class);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
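
One note on the driver: the java.io.File loop only works for local paths, and it is likely unnecessary, because FileInputFormat treats a directory given as an input path as meaning all of the files inside it. The loop could then be replaced with a single call:

    FileInputFormat.addInputPath(job, new Path(args[0]));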

Create the project (File -> New -> Project)

  • Create a package and add the three classes above to it

  • The project structure is shown in the screenshot.

Hadoop development configuration

  • File -> Project Structure -> Modules
  • Click + -> "JARs or directories" and add the Hadoop jars as module dependencies

  • In Project Structure, open the Artifacts page.
  • Click + to add a JAR artifact

  • Click + -> Module Output -> OK

  • Done

  • Edit Configurations

  • In the Edit Configurations dialog, click + -> Application

  • Fill in each field:

- (1) For Main class, enter org.apache.hadoop.util.RunJar
- (2) For Program arguments, fill in the values as shown in the screenshot (see the example after this list):
    - The first argument is the path of the JAR file configured earlier in Project Structure, the second is the input directory, and the third is the output path

- (3) Create an input directory inside the project and put the input files into it (the output directory does not need to be created; Hadoop creates it itself)
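
For reference, the Program arguments field might look like the line below. The artifact path and directory names are illustrative placeholders, not values taken from the original setup:

    out/artifacts/hadooptest_jar/hadooptest.jar input output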

FAQ

Class not found

  • It turns out the Program arguments above were missing one argument: the fully qualified name of the class containing main has to be added:
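
With the main class supplied, the arguments might look like this (paths again illustrative). org.apache.hadoop.util.RunJar treats the argument after the JAR path as the class to run whenever the JAR's manifest has no Main-Class entry:

    out/artifacts/hadooptest_jar/hadooptest.jar cn.zcs.zzuli.hadooptest input output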

Running a Hadoop 2.x MapReduce program in Eclipse fails with the error: (null) entry in command string: null chmod 0700

    Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:609)
        at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
        at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
        at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
        at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:125)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
        at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
        at com.jack.hadoop.temperature.MinTemperature.main(MinTemperature.java:37)

This is a problem with the input directory (a permissions issue): on Windows the local job runner relies on Hadoop's native helpers (winutils.exe and hadoop.dll) to check directory permissions, so the UnsatisfiedLinkError above typically means those binaries are missing or do not match the Hadoop version.
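
A common workaround is to tell Hadoop where a local installation lives before the job starts. The sketch below is a suggestion, assuming a hypothetical D:\hadoop-2.6.5 directory whose bin folder contains winutils.exe; a hadoop.dll matching the Hadoop version must also be reachable (on the PATH or in C:\Windows\System32) for the native access0 call to link:

    // Add at the very top of main(), before Job.getInstance().
    // "D:\\hadoop-2.6.5" is an illustrative path; its bin directory must
    // contain winutils.exe, and a matching hadoop.dll must be on the PATH
    // so that NativeIO$Windows.access0 can be resolved.
    System.setProperty("hadoop.home.dir", "D:\\hadoop-2.6.5");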

Hadoop error: Failed to locate the winutils binary in the hadoop binary path

  • This is the same file/directory problem: Hadoop cannot find winutils.exe. Point HADOOP_HOME (or the hadoop.home.dir property, as in the sketch above) at a directory whose bin folder contains winutils.exe.