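Version: Hadoop 2.6.5
This was my first time writing a Hadoop MapReduce program, working from someone else's write-up. Debugging took two days, which is on the slow side, but I got it running in the end and learned quite a bit from going over it repeatedly.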
[hadoop@master ~]$ cd file
[hadoop@master file]$ ls
file1.txt file2.txt
[hadoop@master ~]$ hadoop fs -mkdir /user
[hadoop@master ~]$ hadoop fs -mkdir /user/hadoop
[hadoop@master ~]$ hadoop fs -mkdir /user/hadoop/wc_input
[hadoop@master ~]$ hadoop fs -put /home/hadoop/file/ /user/hadoop/wc_input
[hadoop@master ~]$ hadoop fs -ls hdfs://master:9000/user/hadoop/wc_input
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2016-12-31 12:15 hdfs://master:9000/user/hadoop/wc_input/file
[hadoop@master ~]$ hadoop fs -ls hdfs://master:9000/user/hadoop/wc_input/file
Found 2 items
-rw-r--r-- 2 hadoop supergroup 18 2016-12-31 12:15 hdfs://master:9000/user/hadoop/wc_input/file/file1.txt
-rw-r--r-- 2 hadoop supergroup 17 2016-12-31 12:15 hdfs://master:9000/user/hadoop/wc_input/file/file2.txt
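-- The class name should not be given here: with it, the program's arguments shift by one and the input path ends up being used as the output directory. It took me a long time to work this out.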
[hadoop@master ~]$ hadoop jar hadoop.jar com.yu.hadoop.WordCount wc_input/file wc_output
16/12/31 12:17:47 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://master:9000/user/hadoop/wc_input/file already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:267)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:140)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1315)
at com.yu.hadoop.WordCount.main(WordCount.java:65)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
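The trace shows the job using wc_input/file as its output directory, which is what happens when a driver reads its input from args[0] and its output from args[1] and an extra leading argument is present. The WordCount source is not shown in this post, so that reading is an assumption. Under it, a small guard at the top of main would turn the confusing FileAlreadyExistsException into a plain usage error. The class WordCountArgs and its parse method below are hypothetical names for illustration, not part of Hadoop or of the original program.

```java
public class WordCountArgs {
    /** Input and output paths taken from the command line. */
    public static final class Paths {
        public final String input;
        public final String output;

        Paths(String input, String output) {
            this.input = input;
            this.output = output;
        }
    }

    /** Accepts exactly two arguments: the input path, then the output path. */
    public static Paths parse(String[] args) {
        if (args.length != 2) {
            // Fail early instead of silently treating the wrong argument as the output path.
            throw new IllegalArgumentException(
                "Usage: hadoop jar hadoop.jar <input path> <output path> (got "
                    + args.length + " arguments)");
        }
        return new Paths(args[0], args[1]);
    }

    public static void main(String[] args) {
        Paths p = parse(args);
        System.out.println("input=" + p.input + " output=" + p.output);
    }
}
```

With this check, the three-argument invocation above would be rejected before the job is submitted, and the message would point at the argument count rather than at an existing directory.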
-- With the class name removed, the job runs normally,
because the jar's MANIFEST.MF already declares the main class:
Manifest-Version: 1.0
Main-Class: com.yu.hadoop.WordCount
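Why the manifest matters can be modelled in a few lines. As far as I understand it, hadoop jar takes the main class from the manifest's Main-Class entry when there is one, and only otherwise treats the first word after the jar name as the class name. Everything left over is passed to the program. The sketch below is a simplified model of that rule, not Hadoop's actual code; RunJarArgsSketch and programArgs are made-up names.

```java
import java.util.Arrays;

public class RunJarArgsSketch {
    /**
     * Returns the arguments that would reach the main class.
     *
     * @param manifestMainClass Main-Class value from MANIFEST.MF, or null if the manifest has none
     * @param argsAfterJar      everything typed after the jar file name
     */
    public static String[] programArgs(String manifestMainClass, String[] argsAfterJar) {
        int first = 0;
        if (manifestMainClass == null && argsAfterJar.length > 0) {
            // No Main-Class in the manifest: the first word is consumed as the class name.
            first = 1;
        }
        return Arrays.copyOfRange(argsAfterJar, first, argsAfterJar.length);
    }

    public static void main(String[] unused) {
        String[] typed = {"com.yu.hadoop.WordCount", "wc_input/file", "wc_output"};

        // Manifest declares the main class: the class name typed on the command line
        // is passed through as the program's first argument.
        System.out.println(Arrays.toString(programArgs("com.yu.hadoop.WordCount", typed)));
        // prints [com.yu.hadoop.WordCount, wc_input/file, wc_output]

        // No Main-Class in the manifest: the class name is consumed, two paths remain.
        System.out.println(Arrays.toString(programArgs(null, typed)));
        // prints [wc_input/file, wc_output]
    }
}
```

In the first case the second program argument is wc_input/file, which matches the output directory named in the exception.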
[hadoop@master ~]$ hadoop jar hadoop.jar wc_input/file wc_output1
16/12/31 12:18:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032