Setting up a Hadoop and Eclipse development environment on Windows

chongdutuo9831

Original article: https://my.oschina.net/u/3055303/blog/878979

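I. Downloading Hadoop

For downloading, installing, and configuring Hadoop, see the previous article: "hadoop2.7.3 installation and configuration".

Setup on Windows is much the same as on Linux, and the configuration files can be reused.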

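II. The Hadoop plugin for Eclipse

You can either build the plugin yourself or download it (appendix: download link for the hadoop2.7.3 plugin for Windows). I use hadoop2.7.3 locally. If you download it, note that two sets of files are needed:

(1) hadoop-eclipse-plugin-2.7.3.jar

(2) The bin environment for Windows; the files are shown in the figure below

[Figure: Hadoop bin files for Windows]
Once you have confirmed that both sets of files are present, carry out the following three steps:

(1) Copy hadoop-eclipse-plugin-2.7.3.jar into the plugins directory of Eclipse

(2) Copy hadoop.dll into C:\Windows\System32

(3) Copy all the files under bin in the figure above into the bin directory of the Hadoop installation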
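
The three copy steps above can be double-checked with a short program. The following is only a minimal sketch: the class name HadoopWinFiles and the two command-line arguments are made up for illustration, and winutils.exe is an assumption about what the Windows bin bundle contains (the post itself only names hadoop.dll).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HadoopWinFiles {

    /** Returns the names from 'required' that are not regular files inside 'dir'. */
    public static List<String> missing(Path dir, List<String> required) {
        List<String> result = new ArrayList<>();
        for (String name : required) {
            if (!Files.isRegularFile(dir.resolve(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical arguments: the Hadoop and Eclipse installation directories.
        if (args.length != 2) {
            System.err.println("Usage: HadoopWinFiles <hadoop-home> <eclipse-home>");
            System.exit(2);
        }
        Path hadoopBin = Paths.get(args[0], "bin");
        Path eclipsePlugins = Paths.get(args[1], "plugins");
        Path system32 = Paths.get("C:\\Windows\\System32");

        // Step (1): the plugin jar in the Eclipse plugins directory.
        System.out.println("Missing in " + eclipsePlugins + ": "
                + missing(eclipsePlugins, Arrays.asList("hadoop-eclipse-plugin-2.7.3.jar")));
        // Step (2): hadoop.dll in System32.
        System.out.println("Missing in " + system32 + ": "
                + missing(system32, Arrays.asList("hadoop.dll")));
        // Step (3): Windows binaries in the Hadoop bin directory.
        // Assumption: winutils.exe is part of the bundle; adjust the list to your download.
        System.out.println("Missing in " + hadoopBin + ": "
                + missing(hadoopBin, Arrays.asList("hadoop.dll", "winutils.exe")));
    }
}
```

An empty list on every line means each file was found where the steps above put it.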

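III. Configuring the Hadoop plugin in Eclipse

(1) Restart Eclipse. A Hadoop Map/Reduce entry now appears under Window > Preferences; select it and set the Hadoop directory on Windows

[Figure: setting the Hadoop location]

(2) Use Show View to add the Map/Reduce view to the toolbar

[Figure: showing the MapReduce view]

(3) MapReduce configuration

Click the black jagged icon at the end to open the following dialog, and fill it in as shown:

[Figure: MapReduce configuration]

Configure Map/Reduce Master and DFS Master, setting Host and Port to the same values as in core-site.xml.
With a configuration like mine, the Windows hosts file (in C:\Windows\System32\drivers\etc) needs a localhost entry: 127.0.0.1 localhost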
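
As a rough illustration of where Host and Port come from: in Hadoop 2.x the file system address in core-site.xml is normally given by the fs.defaultFS property as a URI. The sketch below splits such a value into the two fields. The class name DfsMasterAddress and the sample value hdfs://localhost:9000 are assumptions for illustration, not values taken from this article; use whatever your own core-site.xml contains.

```java
import java.net.URI;

public class DfsMasterAddress {

    /** Splits an fs.defaultFS-style URI into {host, port}. */
    public static String[] hostAndPort(String defaultFs) {
        URI uri = URI.create(defaultFs);
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException(
                    "Expected scheme://host:port but got: " + defaultFs);
        }
        return new String[] { uri.getHost(), String.valueOf(uri.getPort()) };
    }

    public static void main(String[] args) {
        // Assumption: a placeholder value; pass your real fs.defaultFS as the first argument.
        String value = args.length > 0 ? args[0] : "hdfs://localhost:9000";
        String[] parts = hostAndPort(value);
        System.out.println("DFS Master Host: " + parts[0]);
        System.out.println("DFS Master Port: " + parts[1]);
    }
}
```

If the host printed here is a name rather than an IP address, that name is what has to resolve through the hosts file.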

(4) Advanced Parameters configuration

To avoid problems, change the following three parameters:

hadoop.tmp.dir must match the hadoop.tmp.dir value in core-site.xml
dfs.replication must match the dfs.replication value in hdfs-site.xml
dfs.permissions.enabled must be set to false
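
To look up the values that have to be copied into Advanced Parameters, the *-site.xml files can be read with the XML parser that ships with the JDK. This is a minimal sketch under assumptions: the class name SiteXmlReader is hypothetical, and the sample XML in main holds placeholder values, not the author's settings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SiteXmlReader {

    /** Parses the <property><name/><value/></property> entries of a Hadoop site file. */
    public static Map<String, String> readProperties(String xmlText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlText.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element property = (Element) props.item(i);
                String name = childText(property, "name");
                if (name != null) {
                    result.put(name, childText(property, "value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse site XML", e);
        }
    }

    /** Returns the trimmed text of the first child element with this tag, or null. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Placeholder content; in practice read the text of core-site.xml or hdfs-site.xml.
        String sample = "<configuration>"
                + "<property><name>hadoop.tmp.dir</name><value>/example/tmp</value></property>"
                + "<property><name>dfs.replication</name><value>1</value></property>"
                + "</configuration>";
        Map<String, String> props = readProperties(sample);
        System.out.println("hadoop.tmp.dir  = " + props.get("hadoop.tmp.dir"));
        System.out.println("dfs.replication = " + props.get("dfs.replication"));
    }
}
```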

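IV. Writing a simple WordCount test

(1) Start Hadoop and verify that it is running

jps

You can also open the following two addresses in a browser (in Hadoop 2.x these are the YARN ResourceManager and HDFS NameNode web UIs):
http://localhost:8088
http://localhost:50070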
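
The same check can be scripted instead of opening a browser. Below is a minimal sketch, assuming the two addresses listed above; the class name WebUiCheck is hypothetical, and any HTTP response at all is treated as "responding".

```java
import java.net.HttpURLConnection;
import java.net.URI;

public class WebUiCheck {

    /** Returns true if an HTTP response of any status comes back from the address. */
    public static boolean responds(String address) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            conn.setRequestMethod("GET");
            int status = conn.getResponseCode();
            conn.disconnect();
            return status > 0;
        } catch (Exception e) {
            // Connection refused, timeout, or malformed address: treat as not running.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] addresses = { "http://localhost:8088", "http://localhost:50070" };
        for (String address : addresses) {
            System.out.println(address + " -> "
                    + (responds(address) ? "responding" : "not reachable"));
        }
    }
}
```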

(2) Create a new MapReduce project in Eclipse

[Screenshots: steps for creating the MapReduce project]

Set the run arguments
Right-click the class and choose Run As > Run Configurations to set them

[Screenshots: run configuration with the program arguments]

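Upload the document to be analyzed to HDFS:

[Screenshot: uploading the file to HDFS]

Note that the path of the uploaded file must match the arguments set above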

Run the program and check the result:

[Screenshots: program output and result files]

The WordCount code is as follows:

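import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: splits each input line on whitespace and emits (word, 1) per token.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sums the counts received for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options; what remains must be <in> and <out>.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        // Job.getInstance replaces the deprecated new Job(conf, name) constructor.
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}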
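
To see what the job computes without a running cluster, the map and reduce steps can be imitated in plain Java. This is only an illustrative sketch, not part of the original project: the class LocalWordCount and the sample lines are made up, and a TreeMap stands in for the sorted grouping by key that Hadoop performs between map and reduce.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {

    /** Tokenizes every line like TokenizerMapper and sums per word like IntSumReducer. */
    public static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                // Equivalent to emitting (word, 1) and adding it to the running sum.
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input for illustration.
        Map<String, Integer> counts = count(Arrays.asList("hello hadoop", "hello eclipse"));
        // Prints one "word<TAB>count" line per word: eclipse 1, hadoop 1, hello 2.
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Comparing this output with the result file of the real job on a small input is a quick way to confirm that the Eclipse setup ran the job correctly.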
