Implementing MapReduce WordCount on a Cluster
How do you write Java code in Eclipse that connects to a local virtual machine cluster and runs the classic WordCount example?
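Before setting up the project, it may help to see what WordCount computes. The following is a minimal sketch in plain Java, with no Hadoop API involved, that imitates the three stages on an in-memory list of lines: map each line to (word, 1) pairs, group the pairs by word, then sum each group. The class name `WordCountSketch`, its method names, and the sample input are all assumptions made for illustration; they are not the cluster code this post builds.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class: plain Java only, not the Hadoop Mapper/Reducer API.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every whitespace-separated word in one line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all the counts collected for a single word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Runs map, then groups pairs by word (the "shuffle"), then reduces each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input is made up for this sketch.
        System.out.println(run(Arrays.asList("hello world", "hello hadoop")));
        // prints {hadoop=1, hello=2, world=1}
    }
}
```

On a real cluster the framework performs the grouping step itself and distributes the map and reduce calls across nodes; the sketch only shows the data flow in a single process.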
1. Create a Maven project, then add the required dependencies to pom.xml
<repositories>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
</repositories>
<dependencies>
<!-- JARs required by MapReduce -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.6.0-mr1-cdh5.14.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>