Using the jar package on the cluster
- First, package the earlier `FileExist` program to obtain a `.jar` file:
- Copy it to the cluster and run it with the `hadoop jar` command:
WordCount
Adding dependencies
- First, create a new `WordCount` project and add the Hadoop jar dependencies to it:
  - `/usr/local/hadoop/share/hadoop/common`: `hadoop-common-xxx.jar` and `hadoop-nfs-xxx.jar`
  - `/usr/local/hadoop/share/hadoop/common/lib`: all jars in this directory
  - `/usr/local/hadoop/share/hadoop/mapreduce`: all jars in this directory
  - `/usr/local/hadoop/share/hadoop/mapreduce/lib`: all jars in this directory
![53932706913](http://wx4.sinaimg.cn/mw690/0060lm7Tly1fw5gvha378j30pg0acabt.jpg)
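When compiling outside an IDE, the same dependency list can be turned into a classpath string. The following is only a minimal sketch, assuming the directory layout listed above; `HadoopClasspath`, `jarsIn`, and `build` are hypothetical names introduced for illustration and are not part of Hadoop.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Hadoop): collects the dependency jars
// listed above into a single classpath string.
class HadoopClasspath {

    // Returns the jars in dir whose file names match the glob, sorted by path.
    // A missing directory yields an empty list instead of an exception.
    static List<String> jarsIn(Path dir, String glob) throws IOException {
        List<String> jars = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return jars;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path jar : stream) {
                jars.add(jar.toString());
            }
        }
        Collections.sort(jars);
        return jars;
    }

    // Builds the classpath for a Hadoop home, assuming the share/hadoop layout.
    static String build(Path hadoopHome) throws IOException {
        Path share = hadoopHome.resolve("share").resolve("hadoop");
        Path common = share.resolve("common");
        Path mapreduce = share.resolve("mapreduce");
        List<String> entries = new ArrayList<>();
        // common: only the hadoop-common and hadoop-nfs jars.
        // Note: this glob also matches e.g. a hadoop-common-xxx-tests.jar if present.
        entries.addAll(jarsIn(common, "hadoop-common-*.jar"));
        entries.addAll(jarsIn(common, "hadoop-nfs-*.jar"));
        // The remaining three directories contribute every jar they contain.
        entries.addAll(jarsIn(common.resolve("lib"), "*.jar"));
        entries.addAll(jarsIn(mapreduce, "*.jar"));
        entries.addAll(jarsIn(mapreduce.resolve("lib"), "*.jar"));
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) throws IOException {
        // Prints an empty line if Hadoop is not installed at this path.
        System.out.println(build(Paths.get("/usr/local/hadoop")));
    }
}
```

The printed string could then be passed to `javac -cp` when compiling `WordCount` by hand. Inside an IDE, adding the jars through the project's build path, as in the screenshot, achieves the same thing.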
Writing the program
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;