Compiling MapReduce Programs from the Java Command Line

Instead of relying on tools such as Eclipse, you can compile MapReduce programs directly on the command line with the javac and java commands.


First, the CLASSPATH environment variable needs to be set.



Add /root/hadoop/hadoop-0.20.1/hadoop-0.20.1/hadoop-0.20.1-core.jar and commons-cli-1.2.jar (under lib) to the variable:

```
export CLASSPATH=.:$CLASSPATH:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/hadoop-0.20.1-core.jar:/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/lib/commons-cli-1.2.jar
```

Each entry must be the path to the .jar file itself, not to its parent directory; otherwise javac will not be able to find the required classes.
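To see the difference concretely, here is a quick sketch (the WRONG/RIGHT pair is my illustration, not from the original post). A plain directory entry on the classpath only matches loose .class files under that directory; it never looks inside the jars sitting there:

```
# WRONG: a directory entry does not pick up the jars inside it
export CLASSPATH=/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/lib

# RIGHT: each jar file is listed explicitly
export CLASSPATH=/root/hadoop/hadoop-0.20.1/hadoop-0.20.1/lib/commons-cli-1.2.jar
```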

Two suggestions for managing this environment variable:

(1) Do not write `.` (the current directory) into your permanent environment variable;

(2) Set the environment variable from a script instead. A variable exported in a script is only in effect for the commands that follow it within that script.

For example, if the script debug.sh contains export CLASSPATH=*****, the new CLASSPATH applies to the commands that follow it in the script; once the script finishes, CLASSPATH reverts to its original value.
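A minimal demonstration of this scoping behavior (the echo lines are my addition for illustration):

```
# debug.sh
export CLASSPATH=.:$HADOOP_HOME/hadoop-0.20.1-core.jar
echo "inside the script: $CLASSPATH"
```

```
$ bash debug.sh          # the export only affects this child shell
inside the script: ...
$ echo $CLASSPATH        # back in the parent shell: still the original value
```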


Example script
Purpose: compile and run a Hadoop application
Script contents:

```
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/hadoop-0.20.1-core.jar:$HADOOP_HOME/lib/commons-cli-1.2.jar:$HADOOP_HOME/lib/commons-logging-1.0.4.jar

javac WordCountAnalyzer.java   # the jars are needed at compile time
java  WordCountAnalyzer        # ...and at run time as well
```



Note: the jars are needed both at compile time and at run time. If you move the `java WordCountAnalyzer` command outside the script, CLASSPATH will already have reverted to its original value, and the run will fail with an error. Inside the script, the exported CLASSPATH value is in scope only for the commands that come after the export.
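If you do want to run `java WordCountAnalyzer` interactively afterwards, one workaround (my suggestion, not part of the original script) is to source the script so the export happens in the current shell:

```
source debug.sh          # or: . debug.sh
java WordCountAnalyzer   # works: CLASSPATH is still in effect, because the
                         # script ran in this shell rather than a child process
```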


Finally, when writing Java programs it is best to follow the standard package naming convention. The packages inside commons-cli-1.2.jar, for example, are named org.apache.commons.cli, org.apache.commons.cli.*, and so on, which makes it easy to work out which jar a given class lives in.
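The jar tool makes this easy to check; for instance (assuming the lib path used earlier):

```
# list the jar's contents and filter for the class in question
jar tf $HADOOP_HOME/lib/commons-cli-1.2.jar | grep Options
# expected to print entries such as org/apache/commons/cli/Options.class
```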
Assignment: a complete WordCount example

For reference, here is a full WordCount MapReduce program together with the commands to compile, package, and submit it from the command line. (Note that this example uses the newer org.apache.hadoop.mapreduce API and the yarn command, so it assumes a Hadoop 2.x-style installation with YARN; on Hadoop 0.20.1 you would submit with `hadoop jar` instead.)

```
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every token in the input line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sums the counts for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

To submit the MapReduce job from the command line:

1. Save the code above as WordCount.java.

2. Compile it (`yarn classpath` prints the jars of the local Hadoop installation):

```
javac -classpath `yarn classpath` WordCount.java
```

3. Package the compiled classes:

```
jar -cvf WordCount.jar *.class
```

4. Submit the job:

```
yarn jar WordCount.jar WordCount /input /output
```

Here /input is the path of the input files and /output is the path for the results. Once the job completes, the results can be found in the output directory.
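To inspect the output, something along these lines should work (assuming the job wrote to HDFS; the exact part-file name may vary):

```
hdfs dfs -ls /output
hdfs dfs -cat /output/part-r-00000
```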