Chapter 6  Introduction to MapReduce

6.2 Understanding WordCount

The WordCount program is the "Hello World" of MapReduce. By analyzing it, we can understand the basic structure and execution flow of a MapReduce program.

6.2.1 Design of WordCount

WordCount is a good illustration of the MapReduce programming model.
Generally, a text file serves as the MapReduce input. The framework splits the text into lines and hands each line to the map method as a key-value pair, with the position of the line within the file as the key and the line content as the value. The map method processes the line and emits intermediate results of the form <word,1>. By default, MapReduce groups these pairs by key and passes them to the reduce method, where the counting is completed and the final result is output as <word,count>.
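For example, suppose a (hypothetical) two-line input file contains "Hello Hadoop" and "Hello World". The map phase emits <Hello,1>, <Hadoop,1>, <Hello,1>, <World,1>; after the pairs are grouped by key, the reduce phase receives <Hello,[1,1]>, <Hadoop,[1]>, <World,[1]> and outputs <Hello,2>, <Hadoop,1>, <World,1>.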


6.2.2 Ways to Run a MapReduce Program

A MapReduce program can be run in two ways: locally or on the server side.
Local execution usually means a local Windows environment, which is convenient for development and debugging.
Server-side execution is what is normally used in a real production environment.

6.2.3 Writing the Code

(1) Create a Java project


(2) Modify the Hadoop source code
Note: when running a MapReduce program locally on Windows, the Hadoop source code needs to be modified. When running on a Linux server, no modification is required.

The modification is simply a small change to the source of Hadoop's NativeIO class (specifically, the access method of its Windows inner class). Returning true bypasses the call to the native access0 method, which is what fails on Windows when the native libraries are not available.

Download the matching Hadoop source package, hadoop-2.7.3-src.tar.gz, and unpack it. Copy NativeIO.java from hadoop-2.7.3-src\hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\nativeio into the corresponding package of your Eclipse project.
Then modify the code as follows:


     
     
public static boolean access(String path, AccessRight desiredAccess)
        throws IOException {
    return true;
    //return access0(path, desiredAccess.accessRight());
}

If the NativeIO class is not modified, running the MapReduce program locally on Windows produces the following exception:


     
     
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:609)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:125)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at cn.hadron.mr.RunJob.main(RunJob.java:33)

(3) Define the Mapper class


     
     
package cn.hadron.mr;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.util.StringUtils;
//4 generic parameters: the first two are the key and value types of the map input, the last two are the key and value types of the map output
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
    //This method is called once per line read from the file split; the position of the line is the key and the line content is the value
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] words = StringUtils.split(value.toString(), ' ');
        for(String w : words){
            context.write(new Text(w), new IntWritable(1));
        }
    }
}

Code notes:

  • The Mapper class reads the input data and runs the map method. To write one, extend org.apache.hadoop.mapreduce.Mapper and implement the map method for the problem at hand.
  • The 4 generic parameters of Mapper: the first two specify the key and value types of the map input, and the last two specify the key and value types of the map output.
  • The MapReduce framework passes each input key-value pair to the map method. The method has 3 parameters: the first (usually LongWritable) is the position of the line in the file, the second (usually Text) is the content of the line, and the third is a Context object representing the task context.
  • The full name of the Context class is org.apache.hadoop.mapreduce.Mapper.Context; that is, Context is a nested class of Mapper, so it can be used directly inside a Mapper subclass.
  • In the map method, StringUtils.split breaks the input line into words on spaces, and each word is written out as an intermediate result through Context's write method, as illustrated by the sketch below.
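To make this concrete, here is a minimal plain-Java sketch (not part of the Hadoop job; it uses String.split instead of Hadoop's StringUtils, and the input line is made up) of the pairs the map method emits for one line:

public class MapSketch {
    public static void main(String[] args) {
        //One hypothetical input line; the mapper ignores the byte-offset key
        String line = "Hello Hadoop Hello World";
        //Split on spaces and print <word,1>, mirroring context.write(new Text(w), new IntWritable(1))
        for (String w : line.split(" ")) {
            System.out.println("<" + w + ", 1>");
        }
        //Prints: <Hello, 1> <Hadoop, 1> <Hello, 1> <World, 1> (one per line)
    }
}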

(4) Define the Reducer class


     
     
package cn.hadron.mr;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable>{
    /**
     * In the map output <key,values>, key is a single word and values is the list of counts for that word.
     * The output of Map is the input of Reduce. The reduce method is called once per group; within a group the key is the same and there may be several values.
     * So reduce only needs to iterate over values and sum them to obtain the total count of a word.
     */
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for(IntWritable i : values){
            sum = sum + i.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

Code notes:

  • The Reducer class takes the intermediate results emitted by the Mapper as its input and runs the reduce method.
  • The 4 generic parameters of Reducer: the first two are the key and value types of the reduce input (matching the map output types), and the last two are the key and value types of the reduce output.
  • reduce method parameters: key is a single word, values is the list of counts for that word, and Context (of type org.apache.hadoop.mapreduce.Reducer.Context) is the reducer's context. The sketch below traces the summation for one key.
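As a companion sketch, the following plain Java (again not part of the job; the key and counts are made up, and an int array stands in for Hadoop's Iterable<IntWritable>) traces the summation that reduce performs for a single key:

public class ReduceSketch {
    public static void main(String[] args) {
        String key = "Hello";      //one hypothetical key
        int[] values = {1, 1};     //the counts emitted by the mappers for this key
        int sum = 0;
        for (int v : values) {
            sum += v;              //same summation as in WordCountReducer.reduce
        }
        System.out.println("<" + key + ", " + sum + ">");  //prints <Hello, 2>
    }
}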

(5) Define the main method (driver class)


     
     
package cn.hadron.mr;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class RunJob {
    public static void main(String[] args) {
        //Set HADOOP_USER_NAME to root (as a JVM system property)
        System.setProperty("HADOOP_USER_NAME", "root");
        //The Configuration class holds the Hadoop configuration
        Configuration config = new Configuration();
        //Set fs.defaultFS
        config.set("fs.defaultFS", "hdfs://192.168.80.131:9000");
        //Set the yarn.resourcemanager node
        config.set("yarn.resourcemanager.hostname", "node1");
        try {
            FileSystem fs = FileSystem.get(config);
            Job job = Job.getInstance(config);
            job.setJarByClass(RunJob.class);
            job.setJobName("wc");
            //Set the Mapper class
            job.setMapperClass(WordCountMapper.class);
            //Set the Reducer class
            job.setReducerClass(WordCountReducer.class);
            //Set the key type of the reduce output
            job.setOutputKeyClass(Text.class);
            //Set the value type of the reduce output
            job.setOutputValueClass(IntWritable.class);
            //Specify the input path
            FileInputFormat.addInputPath(job, new Path("/user/root/input/"));
            //Specify the output path (created automatically)
            Path outpath = new Path("/user/root/output/");
            //The output path is created by MapReduce itself; if it already exists, it must be deleted first
            if(fs.exists(outpath)){
                fs.delete(outpath, true);
            }
            FileOutputFormat.setOutputPath(job, outpath);
            //Submit the job and wait for it to finish
            boolean f = job.waitForCompletion(true);
            if(f){
                System.out.println("Job completed successfully");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
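Before running the job, the input directory referenced above must already exist in HDFS and contain at least one text file. The original post does not show how the input was uploaded; the following is a hypothetical helper sketch that stages a local sample file with the same FileSystem API (the local path D:/data/words.txt is an assumption, as is reusing the cluster address from RunJob):

package cn.hadron.mr;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class UploadInput {
    public static void main(String[] args) throws Exception {
        System.setProperty("HADOOP_USER_NAME", "root");
        Configuration config = new Configuration();
        config.set("fs.defaultFS", "hdfs://192.168.80.131:9000");
        FileSystem fs = FileSystem.get(config);
        //Create the input directory if it does not exist yet
        fs.mkdirs(new Path("/user/root/input/"));
        //Copy a local sample file (hypothetical path) into the input directory
        fs.copyFromLocalFile(new Path("D:/data/words.txt"), new Path("/user/root/input/"));
        fs.close();
    }
}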

(6) Run locally

Execution result:


     
     
[root@node1 ~]# hdfs dfs -ls /user/root/output
Found 2 items
-rw-r--r--   3 root supergroup          0 2017-05-28 09:01 /user/root/output/_SUCCESS
-rw-r--r--   3 root supergroup         46 2017-05-28 09:01 /user/root/output/part-r-00000
[root@node1 ~]# hdfs dfs -cat /user/root/output/part-r-00000
Hadoop  2
Hello   2
Hi      1
Java    2
World   1
world   1
[root@node1 ~]#

6.2.4 Running on the Server Side

(1) Modify the source code

The main method above was written for local execution; for server-side execution it can be simplified.
See the official example:
http://hadoop.apache.org/docs/r2.7.3/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html

Here the Mapper and Reducer classes are written as static nested classes of the driver class:


     
     
package cn.hadron.mr;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class WordCount {
    //The 4 generic parameters specify the map input key type, input value type, output key type and output value type
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();
        //In the map method, value holds one line of the text file (terminated by a newline) and key is the offset of the first character of that line from the beginning of the file
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            //StringTokenizer splits the line into individual words; each <word,1> pair is emitted as the map output
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();
        //In the map output <key,values>, key is a single word and values is the list of counts for that word; the map output is the reduce input,
        //so reduce only needs to iterate over values and sum them to obtain the total count of a word.
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }
    //Run the MapReduce job
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "wordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        //The first command-line argument is the input path, the second is the output path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

(2) Export the jar package

(3) Upload to the server and run
As before, use Xftp to upload the wordcount.jar that was just exported to the desktop to the node1 node.


     
     
[root@node1 ~]# hadoop jar wordcount.jar cn.hadron.mr.WordCount input output
17/05/28 10:41:41 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.80.131:8032
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://node1:9000/user/root/output already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at cn.hadron.mr.WordCount.main(WordCount.java:59)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

This happens because the output directory already exists; simply delete it:


     
     
[root@node1 ~]# hdfs dfs -rmr /user/root/output
rmr: DEPRECATED: Please use 'rm -r' instead.
17/05/28 10:42:01 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /user/root/output

Run it again:


     
     
[root@node1 ~]# hadoop jar wordcount.jar cn.hadron.mr.WordCount input output
17/05/28 10:43:12 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.80.131:8032
17/05/28 10:43:14 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/05/28 10:43:15 INFO input.FileInputFormat: Total input paths to process : 2
17/05/28 10:43:15 INFO mapreduce.JobSubmitter: number of splits:2
17/05/28 10:43:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1495804618534_0001
17/05/28 10:43:17 INFO impl.YarnClientImpl: Submitted application application_1495804618534_0001
17/05/28 10:43:17 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1495804618534_0001/
17/05/28 10:43:17 INFO mapreduce.Job: Running job: job_1495804618534_0001
17/05/28 10:43:43 INFO mapreduce.Job: Job job_1495804618534_0001 running in uber mode : false
17/05/28 10:43:43 INFO mapreduce.Job:  map 0% reduce 0%
17/05/28 10:44:19 INFO mapreduce.Job:  map 100% reduce 0%
17/05/28 10:44:33 INFO mapreduce.Job:  map 100% reduce 100%
17/05/28 10:44:35 INFO mapreduce.Job: Job job_1495804618534_0001 completed successfully
17/05/28 10:44:36 INFO mapreduce.Job: Counters: 50
    File System Counters
        FILE: Number of bytes read=89
        FILE: Number of bytes written=355368
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=301
        HDFS: Number of bytes written=46
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Killed map tasks=1
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=62884
        Total time spent by all reduces in occupied slots (ms)=12445
        Total time spent by all map tasks (ms)=62884
        Total time spent by all reduce tasks (ms)=12445
        Total vcore-milliseconds taken by all map tasks=62884
        Total vcore-milliseconds taken by all reduce tasks=12445
        Total megabyte-milliseconds taken by all map tasks=64393216
        Total megabyte-milliseconds taken by all reduce tasks=12743680
    Map-Reduce Framework
        Map input records=6
        Map output records=14
        Map output bytes=140
        Map output materialized bytes=95
        Input split bytes=216
        Combine input records=14
        Combine output records=7
        Reduce input groups=6
        Reduce shuffle bytes=95
        Reduce input records=7
        Reduce output records=6
        Spilled Records=14
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=860
        CPU time spent (ms)=10230
        Physical memory (bytes) snapshot=503312384
        Virtual memory (bytes) snapshot=6236766208
        Total committed heap usage (bytes)=301146112
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=85
    File Output Format Counters
        Bytes Written=46
[root@node1 ~]#

Check the result:


     
     
[root@node1 ~]# hdfs dfs -ls /user/root/output
Found 2 items
-rw-r--r--   3 root supergroup          0 2017-05-28 10:44 /user/root/output/_SUCCESS
-rw-r--r--   3 root supergroup         46 2017-05-28 10:44 /user/root/output/part-r-00000
[root@node1 ~]# hdfs dfs -cat /user/root/output/part-r-00000
Hadoop  2
Hello   2
Hi      1
Java    2
World   1
world   1

Supplementary note

2017-06-24
Running the MapReduce program written earlier again today produced the following error:

(null) entry in command string: null chmod 0700

Solution:
(1) Download hadoop-2.7.3.tar.gz and unpack it, for example to drive D, so that the Hadoop home directory is D:\hadoop-2.7.3.
(2) Copy the debug utility (winutils.exe) into HADOOP_HOME/bin.
(3) Set the HADOOP_HOME environment variable (see the sketch after this list for a programmatic alternative).
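If you would rather not change the system environment variables, a commonly used programmatic alternative (an assumption, not something from the original post) is to set the hadoop.home.dir system property at the very beginning of the driver's main method:

//Hypothetical alternative to setting the HADOOP_HOME environment variable:
//point Hadoop at the unpacked installation that contains bin\winutils.exe.
//This must run before any Hadoop classes are used.
System.setProperty("hadoop.home.dir", "D:\\hadoop-2.7.3");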

Many thanks to the original author for their hard work. This post is reproduced only to guard against the original blog being deleted and for study purposes; it is not used for any commercial purpose. Source: https://blog.csdn.net/chengyuqiang/article/details/72794026