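Environment: Windows 7 + jdk1.8.0_162 + Eclipse 4.7.2 + Hadoop 2.7.7
1. Install Eclipse (search online for instructions)
2. Install the JDK (search online for instructions)
3. Download Maven and configure the Maven environment in Eclipse (covered in a later article)
4. Create a new Maven project in Eclipse
5. Add the required dependencies to pom.xml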
<dependencies>
<!-- Note: include this jdk.tools dependency only if your environment requires it -->
<dependency>
<groupId>jdk.tools</groupId>
<artifactId>jdk.tools</artifactId>
<version>1.8</version>
<scope>system</scope>
<systemPath>D:/Java/jdk1.8.0_162/lib/tools.jar</systemPath>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>RELEASE</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.8.2</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.7.7</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.7.7</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.7.7</version>
</dependency>
</dependencies>
6. In the project's src/main/resources directory, create a file named "log4j.properties" with the following content:
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=target/spring.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n
7. Write the WordCount program (WordCount example download)
(1) Write the Mapper class
(2) Write the Reducer class
(3) Write the Driver class
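The three classes above split the work as follows: the Mapper emits a (word, 1) pair for every word, the framework groups the pairs by word, the Reducer sums each group, and the Driver wires the steps together. The sketch below reproduces that logic with the JDK only, so it can be run without a cluster. It is not the Hadoop API: in the real project the classes would extend org.apache.hadoop.mapreduce.Mapper and Reducer, and the Driver would configure a Job. The class name WordCountSketch and its method names are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-JDK sketch of the word-count logic; hypothetical names, not the Hadoop API.
final class WordCountSketch {

    // Mapper step: split one input line on whitespace and emit (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Reducer step: sum all counts that were grouped under one word.
    static int reduce(String word, Iterable<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver step: run map on every line, group by key (the "shuffle"), then reduce.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two sample lines stand in for the input file.
        System.out.println(run(Arrays.asList("hello hadoop", "hello world")));
    }
}
```

The TreeMap keeps the words sorted, which mirrors the way reducer output is ordered by key.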
8. An error is reported at run time
ERROR [org.apache.hadoop.util.Shell] - Failed to locate the winutils binary in the hadoop binary pat…
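9. Install Hadoop on Windows
(1) Download Hadoop (official download page: https://hadoop.apache.org/releases.html);
(2) Download the hadoop-2.7.7.tar.gz archive and extract it (to a directory whose path contains no Chinese characters)
10. Configure environment variables
(1) Add HADOOP_HOME, pointing to the extracted Hadoop directory
(2) Add Hadoop's bin directory (%HADOOP_HOME%\bin) to Path
11. If the bin directory of the extracted Hadoop does not contain winutils.exe, add one to it (hadoop-common-2.7.1-bin-master-master.zip download):
extract the archive into Hadoop's bin directory;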
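Before rerunning the job, it can help to confirm that the setup from steps 10 and 11 is actually visible to the JVM. The following is a minimal pre-flight sketch using only the JDK. The class name WinutilsCheck is hypothetical, and it assumes Hadoop resolves its home directory from the hadoop.home.dir system property first and the HADOOP_HOME environment variable second. Note that Eclipse usually needs a restart to pick up newly added environment variables.

```java
import java.io.File;

// Hypothetical helper: checks whether winutils.exe is where Hadoop is expected to look for it.
final class WinutilsCheck {

    // Expected location of winutils.exe under a given Hadoop home directory.
    static File winutilsPath(String hadoopHome) {
        return new File(new File(hadoopHome, "bin"), "winutils.exe");
    }

    // True only when a home directory is given and bin/winutils.exe exists as a file.
    static boolean isReady(String hadoopHome) {
        return hadoopHome != null
                && !hadoopHome.isEmpty()
                && winutilsPath(hadoopHome).isFile();
    }

    public static void main(String[] args) {
        // Assumption: the system property takes precedence over the environment variable.
        String home = System.getProperty("hadoop.home.dir");
        if (home == null) {
            home = System.getenv("HADOOP_HOME");
        }
        if (isReady(home)) {
            System.out.println("Found " + winutilsPath(home));
        } else {
            System.out.println("winutils.exe not found; Hadoop home resolved to: " + home);
        }
    }
}
```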
12. Run the WordCount program
13. Files generated after the program runs successfully (the last file contains the word counts)
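The result file can also be read back programmatically. The sketch below assumes the job used the default text output, where each line is a word and its count separated by a tab, and that the result file is named part-r-00000 inside an output directory called "output". Both names are assumptions about a typical run, not something the steps above specify. The class name WordCountResultReader is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: parses "word<TAB>count" lines from a WordCount result file.
final class WordCountResultReader {

    // Parse the lines, keeping them in file order; blank or malformed lines are skipped.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab <= 0) {
                continue;
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    // Read and parse one result file, assuming UTF-8 text.
    static Map<String, Integer> read(Path resultFile) throws IOException {
        return parse(Files.readAllLines(resultFile, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "output/part-r-00000");
        for (Map.Entry<String, Integer> e : read(file).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```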