Getting Started with Hadoop: Installing the Hadoop Plugin for Eclipse on Windows

This post covers installing the Hadoop plugin in Eclipse on Windows, connecting to a remote Hadoop environment on CentOS, and running a WordCount test. The Hadoop version is 2.8.4.

1. Download the Hadoop plugin hadoop2x-eclipse-plugin-master

Download it from https://github.com/winghc/hadoop2x-eclipse-plugin. After downloading, unpack the archive, copy release/hadoop-eclipse-plugin-2.6.0.jar into Eclipse's plugins directory, and start Eclipse.

2. Configure a local Hadoop environment

1) Download the Hadoop package hadoop-2.8.4.tar.gz, unpack it, and configure the environment variables. To avoid problems, my local Hadoop matches the remote CentOS Hadoop in both version and configuration.

2) Configure the local environment variables as follows (the original screenshots are omitted):

Set HADOOP_HOME=H:\Hadoop\hadoop-2.8.4 and add HADOOP_HOME to the Path variable.

3. Start Eclipse

(1) Open Window -> Show View and open the MapReduce tools (the Map/Reduce Locations view).

(2) Select the local Hadoop directory.

(3) Configure a Hadoop location.

Once the location is configured, DFS Locations appears.

The files under output and test_input are the results of commands I ran earlier on the remote server; after a refresh they are pulled down and displayed here.
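The screenshots of the location dialog are missing, so as a rough guide (an assumption on my part, not from the original post): the DFS Master host and port entered in the dialog must match the fs.defaultFS value in core-site.xml on the CentOS server. For the cluster used later in this post, that would be:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.16.247.129:9000</value>
</property>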

4. Write the program

(1) Create a new MapReduce project named hadoop01.

(2) Create a WordCount test class.

Program function: count the number of occurrences of each word in the specified files.

package com.hadoop.test;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCountTest {

    // Mapper: splits each input line into tokens and emits (word, 1) for every token.
    public static class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer token = new StringTokenizer(line);
            while (token.hasMoreTokens()) {
                word.set(token.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts for each word and emits (word, total).
    public static class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJarByClass(WordCountTest.class);
        job.setJobName("wordcount");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        // Input and output paths on the remote HDFS; the output directory must not already exist.
        FileInputFormat.addInputPath(job, new Path("hdfs://172.16.247.129:9000/user/root/test_input"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://172.16.247.129:9000/user/root/output/result"));
        job.waitForCompletion(true);
    }
}

(3) Under DFS Locations, create the test file test3.txt under test_input; if you run into permission problems, see section 5.

(4) Run: Run As -> Run on Hadoop.

5. Problems encountered

Several problems came up while writing and running the program; here is a brief summary:

(1) Error message:

log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.RuntimeException: java.io.FileNotFoundException: Could not locate Hadoop executable: H:\Hadoop\hadoop-2.8.4\bin\winutils.exe -see https://wiki.apache.org/hadoop/WindowsProblems
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:716)
    at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:250)
    at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:267)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:771)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:515)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:555)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:533)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:320)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:133)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:146)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
    at com.hadoop.test.WordCountTest.main(WordCountTest.java:59)
Caused by: java.io.FileNotFoundException: Could not locate Hadoop executable: H:\Hadoop\hadoop-2.8.4\bin\winutils.exe -see https://wiki.apache.org/hadoop/WindowsProblems
    at org.apache.hadoop.util.Shell.getQualifiedBinInner(Shell.java:598)
    at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:572)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:669)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1555)
    at org.apache.hadoop.security.SecurityUtil.getLogSlowLookupsEnabled(SecurityUtil.java:497)
    at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:90)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:289)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:277)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:833)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:803)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:676)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:72)
    at org.apache.hadoop.mapreduce.Job.<init>(Job.java:142)
    at org.apache.hadoop.mapreduce.Job.<init>(Job.java:129)
    at com.hadoop.test.WordCountTest.main(WordCountTest.java:48)

winutils.exe is a debugging tool that Hadoop needs on Windows; it contains the basic utilities required to debug Hadoop and Spark on Windows. It is also needed when debugging Hadoop programs from Eclipse.

Solution:

Download the tools from https://github.com/steveloughran/winutils.

Unpack winutils-master, pick the closest matching version, and copy winutils.exe and hadoop.dll from H:\Hadoop\winutils-master\hadoop-2.8.3\bin into the local Hadoop installation directory H:\Hadoop\hadoop-2.8.4\bin and into C:\Windows\System32, respectively.
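If editing the Windows environment variables is inconvenient, a commonly used alternative (my addition, not from the original post) is to point Hadoop at the installation directory from code:

// Alternative to setting HADOOP_HOME: tell Hadoop where winutils.exe lives.
// This must execute before any Hadoop class initializes, e.g. as the first line of main().
System.setProperty("hadoop.home.dir", "H:\\Hadoop\\hadoop-2.8.4");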

(2) Error message:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:606)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:958)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:100)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:65)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:314)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:377)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:151)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:132)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:116)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:125)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:171)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:758)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:242)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
    at com.hadoop.test.WordCountTest.main(WordCountTest.java:59)

Solution:

Download the Hadoop source package and unpack it. Find NativeIO.java at hadoop-2.8.4-src\hadoop-2.8.4-src\hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\nativeio\NativeIO.java, copy it into the src folder of the hadoop01 project, and modify it as shown below.
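The original screenshot of the modification is missing. The change usually applied for this error (an assumption on my part; this is the common community workaround, not necessarily the author's exact edit) is to short-circuit the access check inside the NativeIO.Windows inner class so it never calls the native access0 method:

// In the copied NativeIO.java, inside the Windows inner class,
// change access() so it no longer delegates to the native access0():
public static boolean access(String path, AccessRight desiredAccess)
        throws IOException {
    return true; // original body: return access0(path, desiredAccess.accessRight());
}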

 

(3) Permission problems

If you run into file permission problems, e.g. Permission denied: user=root, access=WRITE:

Solution 1: edit hadoop-2.8.4/etc/hadoop/hdfs-site.xml on the server and add the following property (this turns off HDFS permission checking; restart the NameNode for it to take effect):

<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>

Solution 2: fix the permissions instead, e.g. create the user directory as the hdfs user: sudo -u hdfs hadoop fs -mkdir /user/root (untested).
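A third, client-side option (my addition, also untested here) is to tell the Hadoop client which user to act as, so the job writes to HDFS as root without loosening server-side permissions:

// Run the job as the HDFS user "root"; set this before creating Configuration/Job in main().
System.setProperty("HADOOP_USER_NAME", "root");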

6. Test

Create test3.txt and upload it: right-click test_input, choose Upload files to DFS..., and select test3.txt.

After the upload, my test_input contains three files; test1 and test2 are from my earlier tests.

Run: Run As -> Run on Hadoop.

When the run completes, refresh DFS Locations; the result counts the words in all three files under test_input.
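To double-check the result without the plugin, a small client like the one below can read the output file directly over HDFS (a minimal sketch; it assumes the job wrote the usual part-r-00000 file under the output path used in WordCountTest):

import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ResultReader {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Open the reducer output on the remote HDFS and print it line by line.
        Path result = new Path("hdfs://172.16.247.129:9000/user/root/output/result/part-r-00000");
        try (FileSystem fs = FileSystem.get(result.toUri(), conf);
             BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(result)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // each line: word <TAB> count
            }
        }
    }
}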

 

 

Reposted from: https://www.cnblogs.com/quyanhui/p/9323769.html
