Hadoop Study Notes (2): Setting Up a Single-Node Cluster


This post describes how to set up a single-node Hadoop installation so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).

Official reference: Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.

Hadoop version: Apache Hadoop 2.5.1

OS: CentOS 6.5, kernel (uname -r): 2.6.32-431.el6.x86_64

Prerequisites

Supported Platforms

GNU/Linux is supported as a development and production platform, no question there. Windows is also a supported platform, but the steps below apply to Linux only.

Required Software

The software packages required on a Linux system:

1. Java (JDK) must be installed. For recommended versions, see Hadoop JAVA Version; I installed 1.7 here.

2. ssh must be installed and sshd must be running in order to use the Hadoop scripts that manage remote Hadoop daemons.

Installing the Required Software

If your system does not already have the required software, you need to install it.

For example, on Ubuntu Linux use the following commands:

  $ sudo apt-get install ssh
  $ sudo apt-get install rsync

CentOS should come with ssh (Secure Shell) even in a minimal install. At first I confused it with the Java SSH stack (Spring + Struts + Hibernate); embarrassing!
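The prerequisite check can be scripted so a missing tool is caught up front. A minimal sketch (the `need` helper is my own invention, not part of Hadoop):

```shell
# Hypothetical helper: report the first required command missing from PATH.
need() {
  for c in "$@"; do
    command -v "$c" >/dev/null 2>&1 || { echo "missing: $c"; return 1; }
  done
  echo "all present: $*"
}

need ssh rsync || echo "install the missing packages before continuing"
```

`command -v` is the POSIX way to test for a command without running it, so this works the same on CentOS and Ubuntu.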

To install the JDK, see: Installing JDK 7 on CentOS.

Download

Not much to say here; it was covered in the previous post: Hadoop Study Notes (1): Downloading the Installation Package from the Official Site.

Preparing to Start the Hadoop Cluster

Extract hadoop-2.5.1.tar.gz by running: tar xvf hadoop-2.5.1.tar.gz. This unpacks the files into the hadoop-2.5.1 directory;

Change into the configuration directory: cd hadoop-2.5.1/etc/hadoop/

Edit the hadoop-env.sh file and add the definitions shown below;

vi hadoop-env.sh

Personally I think it is a good habit to back up a file before editing it (cp hadoop-env.sh hadoop-env.sh.bak);
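That habit is easy to script. A minimal sketch (the `bak` helper name is my own invention), demonstrated on a throwaway file rather than the real hadoop-env.sh:

```shell
# Hypothetical helper: copy a file to <name>.bak.<timestamp> before editing it.
bak() {
  cp -p "$1" "$1.bak.$(date +%Y%m%d%H%M%S)" && echo "backed up $1"
}

printf 'demo\n' > /tmp/hadoop-env-demo.sh   # stand-in file for the demo
bak /tmp/hadoop-env-demo.sh
```

The timestamp suffix means repeated edits never overwrite an earlier backup, unlike a fixed .bak name.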

Find the following lines:

# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}

Change them to:

# The java implementation to use.
export JAVA_HOME=/usr/java/latest

Then add another line below:

# Assuming your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop

Save and quit (press ESC, then type :wq).

Change directory (cd ../..) back to /opt/hadoop-2.5.1;

Try running the following command:

 ./bin/hadoop

This displays the usage documentation for the hadoop script; the output looks like this:

Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.
You are now ready to start your Hadoop cluster in one of the three supported modes:
  • Local (standalone) mode
  • Pseudo-distributed mode
  • Fully distributed mode

Local (Standalone) Mode Operation

By default, Hadoop is configured to run in non-distributed mode, as a single Java process. This is useful for debugging.
The following example copies the unpacked conf directory to use as input, then finds and displays every match of the given regular expression. Output is written to the given output directory.

  $ mkdir input
  $ cp etc/hadoop/*.xml input
  $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'
  $ cat output/*
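What the example job computes can be approximated with ordinary Unix tools: the "map" step emits every match of the regular expression, and the "reduce" step groups and counts identical matches. A rough local sketch (using a throwaway /tmp/grep-demo directory and a made-up sample file, not the actual Hadoop code):

```shell
# Throwaway demo directory with one made-up input file.
mkdir -p /tmp/grep-demo
printf '<name>dfs.replication</name>\n<name>dfs.replication</name>\n' > /tmp/grep-demo/sample.xml

# "map": emit each regex match; "shuffle+reduce": group and count identical matches.
grep -ohE 'dfs[a-z.]+' /tmp/grep-demo/*.xml | sort | uniq -c
```

Hadoop's grep example does the same thing at scale: a map phase that extracts matches and a reduce phase that sums the count per distinct match.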

However, when running "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'",

I got an error: Error: Could not find or load main class org.apache.hadoop.util.RunJar

The only place I saw this problem mentioned was on Stack Overflow:

What does “Error: Could not find or load main class org.apache.hadoop.util.RunJar”?

But it did not offer a solution either, so I had to work it out myself.

Troubleshooting steps:

The hadoop-env.sh backup made earlier now comes in handy: restore it.

Running "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'" again

produced:

./bin/hadoop: line 133: /usr/java/jdk1.7.0/bin/java: No such file or directory
./bin/hadoop: line 133: exec: /usr/java/jdk1.7.0/bin/java: cannot execute: No such file or directory
The message suggests the problem is still with the Java (JDK) installation. When installing the JDK, I had only run
rpm -ivh /<directory>/jdk-7-linux-x64.rpm
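Since the script died on a hard-coded /usr/java/jdk1.7.0/bin/java path, it helps to check whether a candidate JAVA_HOME actually contains an executable java before wiring it into hadoop-env.sh. A sketch (the `check_jdk` helper is my own invention, not part of Hadoop):

```shell
# Hypothetical helper: does $1 look like a usable JDK home?
check_jdk() {
  if [ -x "$1/bin/java" ]; then echo "ok: $1"; else echo "bad: $1"; fi
}

check_jdk /usr/java/latest      # the path used in hadoop-env.sh above
check_jdk /usr/java/jdk1.7.0    # the path the error message complained about
```

Any path reported "bad" will reproduce the "No such file or directory" failure if used as JAVA_HOME.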

and nothing else. After completing the remaining installation steps, I ran "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'" again,

which printed:

14/10/07 03:35:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/10/07 03:35:58 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
14/10/07 03:35:58 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/10/07 03:35:59 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
14/10/07 03:35:59 INFO input.FileInputFormat: Total input paths to process : 6
14/10/07 03:35:59 INFO mapreduce.JobSubmitter: number of splits:6
14/10/07 03:36:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1185570365_0001
14/10/07 03:36:00 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root1185570365/.staging/job_local1185570365_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/10/07 03:36:01 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root1185570365/.staging/job_local1185570365_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/10/07 03:36:01 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1185570365_0001/job_local1185570365_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/10/07 03:36:01 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local1185570365_0001/job_local1185570365_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/10/07 03:36:01 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
14/10/07 03:36:01 INFO mapreduce.Job: Running job: job_local1185570365_0001
14/10/07 03:36:01 INFO mapred.LocalJobRunner: OutputCommitter set in config null
14/10/07 03:36:01 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
14/10/07 03:36:02 INFO mapred.LocalJobRunner: Waiting for map tasks
14/10/07 03:36:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000000_0
14/10/07 03:36:02 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/07 03:36:02 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hadoop-policy.xml:0+9201
14/10/07 03:36:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/07 03:36:02 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/07 03:36:02 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/07 03:36:02 INFO mapred.MapTask: soft limit at 83886080
14/10/07 03:36:02 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/07 03:36:02 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/07 03:36:02 INFO mapred.LocalJobRunner: 
14/10/07 03:36:02 INFO mapred.MapTask: Starting flush of map output
14/10/07 03:36:02 INFO mapred.MapTask: Spilling map output
14/10/07 03:36:02 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600
14/10/07 03:36:02 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
14/10/07 03:36:02 INFO mapreduce.Job: Job job_local1185570365_0001 running in uber mode : false
14/10/07 03:36:02 INFO mapred.MapTask: Finished spill 0
14/10/07 03:36:02 INFO mapreduce.Job:  map 0% reduce 0%
14/10/07 03:36:02 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000000_0 is done. And is in the process of committing
14/10/07 03:36:02 INFO mapred.LocalJobRunner: map
14/10/07 03:36:02 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000000_0' done.
14/10/07 03:36:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000000_0
14/10/07 03:36:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000001_0
14/10/07 03:36:02 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/07 03:36:02 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/capacity-scheduler.xml:0+3589
14/10/07 03:36:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/07 03:36:02 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/07 03:36:02 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/07 03:36:02 INFO mapred.MapTask: soft limit at 83886080
14/10/07 03:36:02 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/07 03:36:02 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/07 03:36:02 INFO mapred.LocalJobRunner: 
14/10/07 03:36:02 INFO mapred.MapTask: Starting flush of map output
14/10/07 03:36:02 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000001_0 is done. And is in the process of committing
14/10/07 03:36:02 INFO mapred.LocalJobRunner: map
14/10/07 03:36:02 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000001_0' done.
14/10/07 03:36:02 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000001_0
14/10/07 03:36:02 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000002_0
14/10/07 03:36:02 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/07 03:36:02 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hdfs-site.xml:0+775
14/10/07 03:36:02 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080
14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/07 03:36:03 INFO mapred.LocalJobRunner: 
14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output
14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000002_0 is done. And is in the process of committing
14/10/07 03:36:03 INFO mapred.LocalJobRunner: map
14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000002_0' done.
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000002_0
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000003_0
14/10/07 03:36:03 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/07 03:36:03 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/core-site.xml:0+774
14/10/07 03:36:03 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080
14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/07 03:36:03 INFO mapred.LocalJobRunner: 
14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output
14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000003_0 is done. And is in the process of committing
14/10/07 03:36:03 INFO mapred.LocalJobRunner: map
14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000003_0' done.
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000003_0
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000004_0
14/10/07 03:36:03 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/07 03:36:03 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/yarn-site.xml:0+690
14/10/07 03:36:03 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080
14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/07 03:36:03 INFO mapred.LocalJobRunner: 
14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output
14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000004_0 is done. And is in the process of committing
14/10/07 03:36:03 INFO mapred.LocalJobRunner: map
14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000004_0' done.
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000004_0
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_m_000005_0
14/10/07 03:36:03 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/07 03:36:03 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/httpfs-site.xml:0+620
14/10/07 03:36:03 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/07 03:36:03 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/07 03:36:03 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/07 03:36:03 INFO mapred.MapTask: soft limit at 83886080
14/10/07 03:36:03 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/07 03:36:03 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/07 03:36:03 INFO mapred.LocalJobRunner: 
14/10/07 03:36:03 INFO mapred.MapTask: Starting flush of map output
14/10/07 03:36:03 INFO mapred.Task: Task:attempt_local1185570365_0001_m_000005_0 is done. And is in the process of committing
14/10/07 03:36:03 INFO mapred.LocalJobRunner: map
14/10/07 03:36:03 INFO mapred.Task: Task 'attempt_local1185570365_0001_m_000005_0' done.
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_m_000005_0
14/10/07 03:36:03 INFO mapred.LocalJobRunner: map task executor complete.
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Waiting for reduce tasks
14/10/07 03:36:03 INFO mapred.LocalJobRunner: Starting task: attempt_local1185570365_0001_r_000000_0
14/10/07 03:36:03 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/07 03:36:03 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@57931be2
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
14/10/07 03:36:03 INFO reduce.EventFetcher: attempt_local1185570365_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000001_0 decomp: 2 len: 6 to MEMORY
14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000001_0
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->2
14/10/07 03:36:03 INFO mapreduce.Job:  map 100% reduce 0%
14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000004_0 decomp: 2 len: 6 to MEMORY
14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000004_0
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 2, commitMemory -> 2, usedMemory ->4
14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000005_0 decomp: 2 len: 6 to MEMORY
14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000005_0
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 3, commitMemory -> 4, usedMemory ->6
14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000002_0 decomp: 2 len: 6 to MEMORY
14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000002_0
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 4, commitMemory -> 6, usedMemory ->8
14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000003_0 decomp: 2 len: 6 to MEMORY
14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1185570365_0001_m_000003_0
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 5, commitMemory -> 8, usedMemory ->10
14/10/07 03:36:03 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1185570365_0001_m_000000_0 decomp: 21 len: 25 to MEMORY
14/10/07 03:36:03 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local1185570365_0001_m_000000_0
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 6, commitMemory -> 10, usedMemory ->31
14/10/07 03:36:03 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
14/10/07 03:36:03 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: finalMerge called with 6 in-memory map-outputs and 0 on-disk map-outputs
14/10/07 03:36:03 INFO mapred.Merger: Merging 6 sorted segments
14/10/07 03:36:03 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: Merged 6 segments, 31 bytes to disk to satisfy reduce memory limit
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
14/10/07 03:36:03 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
14/10/07 03:36:03 INFO mapred.Merger: Merging 1 sorted segments
14/10/07 03:36:03 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
14/10/07 03:36:03 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/07 03:36:04 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
14/10/07 03:36:04 INFO mapred.Task: Task:attempt_local1185570365_0001_r_000000_0 is done. And is in the process of committing
14/10/07 03:36:04 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/07 03:36:04 INFO mapred.Task: Task attempt_local1185570365_0001_r_000000_0 is allowed to commit now
14/10/07 03:36:04 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1185570365_0001_r_000000_0' to file:/opt/hadoop-2.5.1/grep-temp-767563685/_temporary/0/task_local1185570365_0001_r_000000
14/10/07 03:36:04 INFO mapred.LocalJobRunner: reduce > reduce
14/10/07 03:36:04 INFO mapred.Task: Task 'attempt_local1185570365_0001_r_000000_0' done.
14/10/07 03:36:04 INFO mapred.LocalJobRunner: Finishing task: attempt_local1185570365_0001_r_000000_0
14/10/07 03:36:04 INFO mapred.LocalJobRunner: reduce task executor complete.
14/10/07 03:36:04 INFO mapreduce.Job:  map 100% reduce 100%
14/10/07 03:36:04 INFO mapreduce.Job: Job job_local1185570365_0001 completed successfully
14/10/07 03:36:04 INFO mapreduce.Job: Counters: 33
 File System Counters
  FILE: Number of bytes read=114663
  FILE: Number of bytes written=1613316
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
 Map-Reduce Framework
  Map input records=405
  Map output records=1
  Map output bytes=17
  Map output materialized bytes=55
  Input split bytes=657
  Combine input records=1
  Combine output records=1
  Reduce input groups=1
  Reduce shuffle bytes=55
  Reduce input records=1
  Reduce output records=1
  Spilled Records=2
  Shuffled Maps =6
  Failed Shuffles=0
  Merged Map outputs=6
  GC time elapsed (ms)=225
  CPU time spent (ms)=0
  Physical memory (bytes) snapshot=0
  Virtual memory (bytes) snapshot=0
  Total committed heap usage (bytes)=1106100224
 Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
 File Input Format Counters 
  Bytes Read=15649
 File Output Format Counters 
  Bytes Written=123
14/10/07 03:36:04 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/opt/hadoop-2.5.1/output already exists
	at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
	at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
	at org.apache.hadoop.examples.Grep.run(Grep.java:92)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.examples.Grep.main(Grep.java:101)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Output directory file:/opt/hadoop-2.5.1/output already exists. Ah, the cause is that the output directory already exists (I had created it earlier while troubleshooting);

Delete the output directory (rm -rf output);
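The example jobs refuse to overwrite an existing output directory, so a guard that clears any stale directory before each re-run avoids this FileAlreadyExistsException. A minimal sketch, demonstrated on a throwaway /tmp/output directory rather than the real Hadoop tree:

```shell
cd /tmp
mkdir -p output                        # simulate a stale output directory
[ -d output ] && rm -rf output         # guard: remove it if present
[ -d output ] && echo "still there" || echo "output cleared"
```

In the real run, the guard line would go immediately before the bin/hadoop jar ... command.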

Run the "bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'" command once more; the output is as follows:

14/10/08 05:57:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/10/08 05:57:35 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
14/10/08 05:57:35 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/10/08 05:57:36 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
14/10/08 05:57:36 INFO input.FileInputFormat: Total input paths to process : 6
14/10/08 05:57:36 INFO mapreduce.JobSubmitter: number of splits:6
14/10/08 05:57:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local380762736_0001
14/10/08 05:57:37 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root380762736/.staging/job_local380762736_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/10/08 05:57:37 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root380762736/.staging/job_local380762736_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/10/08 05:57:38 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local380762736_0001/job_local380762736_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/10/08 05:57:38 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local380762736_0001/job_local380762736_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/10/08 05:57:38 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
14/10/08 05:57:38 INFO mapreduce.Job: Running job: job_local380762736_0001
14/10/08 05:57:38 INFO mapred.LocalJobRunner: OutputCommitter set in config null
14/10/08 05:57:38 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
14/10/08 05:57:38 INFO mapred.LocalJobRunner: Waiting for map tasks
14/10/08 05:57:38 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000000_0
14/10/08 05:57:39 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:39 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hadoop-policy.xml:0+9201
14/10/08 05:57:39 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/08 05:57:39 INFO mapreduce.Job: Job job_local380762736_0001 running in uber mode : false
14/10/08 05:57:39 INFO mapreduce.Job:  map 0% reduce 0%
14/10/08 05:57:43 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/08 05:57:43 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/08 05:57:43 INFO mapred.MapTask: soft limit at 83886080
14/10/08 05:57:43 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/08 05:57:43 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/08 05:57:44 INFO mapred.LocalJobRunner: 
14/10/08 05:57:44 INFO mapred.MapTask: Starting flush of map output
14/10/08 05:57:44 INFO mapred.MapTask: Spilling map output
14/10/08 05:57:44 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600
14/10/08 05:57:44 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
14/10/08 05:57:44 INFO mapred.MapTask: Finished spill 0
14/10/08 05:57:44 INFO mapred.Task: Task:attempt_local380762736_0001_m_000000_0 is done. And is in the process of committing
14/10/08 05:57:45 INFO mapred.LocalJobRunner: map
14/10/08 05:57:45 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000000_0' done.
14/10/08 05:57:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000000_0
14/10/08 05:57:45 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000001_0
14/10/08 05:57:45 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:45 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/capacity-scheduler.xml:0+3589
14/10/08 05:57:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/08 05:57:45 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/08 05:57:45 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/08 05:57:45 INFO mapred.MapTask: soft limit at 83886080
14/10/08 05:57:45 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/08 05:57:45 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/08 05:57:45 INFO mapred.LocalJobRunner: 
14/10/08 05:57:45 INFO mapred.MapTask: Starting flush of map output
14/10/08 05:57:45 INFO mapred.Task: Task:attempt_local380762736_0001_m_000001_0 is done. And is in the process of committing
14/10/08 05:57:45 INFO mapred.LocalJobRunner: map
14/10/08 05:57:45 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000001_0' done.
14/10/08 05:57:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000001_0
14/10/08 05:57:45 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000002_0
14/10/08 05:57:45 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:45 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/hdfs-site.xml:0+775
14/10/08 05:57:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/08 05:57:46 INFO mapreduce.Job:  map 100% reduce 0%
14/10/08 05:57:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/08 05:57:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/08 05:57:46 INFO mapred.MapTask: soft limit at 83886080
14/10/08 05:57:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/08 05:57:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/08 05:57:46 INFO mapred.LocalJobRunner: 
14/10/08 05:57:46 INFO mapred.MapTask: Starting flush of map output
14/10/08 05:57:46 INFO mapred.Task: Task:attempt_local380762736_0001_m_000002_0 is done. And is in the process of committing
14/10/08 05:57:46 INFO mapred.LocalJobRunner: map
14/10/08 05:57:46 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000002_0' done.
14/10/08 05:57:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000002_0
14/10/08 05:57:46 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000003_0
14/10/08 05:57:46 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:46 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/core-site.xml:0+774
14/10/08 05:57:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/08 05:57:47 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/08 05:57:47 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/08 05:57:47 INFO mapred.MapTask: soft limit at 83886080
14/10/08 05:57:47 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/08 05:57:47 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/08 05:57:47 INFO mapred.LocalJobRunner: 
14/10/08 05:57:47 INFO mapred.MapTask: Starting flush of map output
14/10/08 05:57:47 INFO mapred.Task: Task:attempt_local380762736_0001_m_000003_0 is done. And is in the process of committing
14/10/08 05:57:47 INFO mapred.LocalJobRunner: map
14/10/08 05:57:47 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000003_0' done.
14/10/08 05:57:47 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000003_0
14/10/08 05:57:47 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000004_0
14/10/08 05:57:47 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:47 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/yarn-site.xml:0+690
14/10/08 05:57:47 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/08 05:57:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/08 05:57:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/08 05:57:49 INFO mapred.MapTask: soft limit at 83886080
14/10/08 05:57:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/08 05:57:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/08 05:57:49 INFO mapred.LocalJobRunner: 
14/10/08 05:57:49 INFO mapred.MapTask: Starting flush of map output
14/10/08 05:57:49 INFO mapred.Task: Task:attempt_local380762736_0001_m_000004_0 is done. And is in the process of committing
14/10/08 05:57:49 INFO mapred.LocalJobRunner: map
14/10/08 05:57:49 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000004_0' done.
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000004_0
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_m_000005_0
14/10/08 05:57:49 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:49 INFO mapred.MapTask: Processing split: file:/opt/hadoop-2.5.1/input/httpfs-site.xml:0+620
14/10/08 05:57:49 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/10/08 05:57:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/10/08 05:57:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/10/08 05:57:49 INFO mapred.MapTask: soft limit at 83886080
14/10/08 05:57:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/10/08 05:57:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/10/08 05:57:49 INFO mapred.LocalJobRunner: 
14/10/08 05:57:49 INFO mapred.MapTask: Starting flush of map output
14/10/08 05:57:49 INFO mapred.Task: Task:attempt_local380762736_0001_m_000005_0 is done. And is in the process of committing
14/10/08 05:57:49 INFO mapred.LocalJobRunner: map
14/10/08 05:57:49 INFO mapred.Task: Task 'attempt_local380762736_0001_m_000005_0' done.
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_m_000005_0
14/10/08 05:57:49 INFO mapred.LocalJobRunner: map task executor complete.
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Waiting for reduce tasks
14/10/08 05:57:49 INFO mapred.LocalJobRunner: Starting task: attempt_local380762736_0001_r_000000_0
14/10/08 05:57:49 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/10/08 05:57:49 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6d36df08
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
14/10/08 05:57:50 INFO reduce.EventFetcher: attempt_local380762736_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000000_0 decomp: 21 len: 25 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local380762736_0001_m_000000_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->21
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000004_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000004_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 2, commitMemory -> 21, usedMemory ->23
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000003_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000003_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 3, commitMemory -> 23, usedMemory ->25
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000005_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000005_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 4, commitMemory -> 25, usedMemory ->27
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000001_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000001_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 5, commitMemory -> 27, usedMemory ->29
14/10/08 05:57:50 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local380762736_0001_m_000002_0 decomp: 2 len: 6 to MEMORY
14/10/08 05:57:50 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local380762736_0001_m_000002_0
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 6, commitMemory -> 29, usedMemory ->31
14/10/08 05:57:50 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
14/10/08 05:57:50 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: finalMerge called with 6 in-memory map-outputs and 0 on-disk map-outputs
14/10/08 05:57:50 INFO mapred.Merger: Merging 6 sorted segments
14/10/08 05:57:50 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: Merged 6 segments, 31 bytes to disk to satisfy reduce memory limit
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
14/10/08 05:57:50 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
14/10/08 05:57:50 INFO mapred.Merger: Merging 1 sorted segments
14/10/08 05:57:50 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
14/10/08 05:57:50 INFO mapred.LocalJobRunner: 6 / 6 copied.
14/10/08 05:57:50 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
14/10/08 05:57:50 INFO mapred.Task: Task:attempt_local380762736_0001_r_000000_0 is done. 
And is in the process of committing14/10/08 05:57:50 INFO mapred.LocalJobRunner: 6 / 6 copied.14/10/08 05:57:50 INFO mapred.Task: Task attempt_local380762736_0001_r_000000_0 is allowed to commit now14/10/08 05:57:50 INFO output.FileOutputCommitter: Saved output of task 'attempt_local380762736_0001_r_000000_0' to file:/opt/hadoop-2.5.1/grep-temp-913340630/_temporary/0/task_local380762736_0001_r_00000014/10/08 05:57:50 INFO mapred.LocalJobRunner: reduce > reduce14/10/08 05:57:50 INFO mapred.Task: Task 'attempt_local380762736_0001_r_000000_0' done.14/10/08 05:57:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local380762736_0001_r_000000_014/10/08 05:57:50 INFO mapred.LocalJobRunner: reduce task executor complete.14/10/08 05:57:51 INFO mapreduce.Job:  map 100% reduce 100%14/10/08 05:57:51 INFO mapreduce.Job: Job job_local380762736_0001 completed successfully14/10/08 05:57:51 INFO mapreduce.Job: Counters: 33 File System Counters  FILE: Number of bytes read=114663  FILE: Number of bytes written=1604636  FILE: Number of read operations=0  FILE: Number of large read operations=0  FILE: Number of write operations=0 Map-Reduce Framework  Map input records=405  Map output records=1  Map output bytes=17  Map output materialized bytes=55  Input split bytes=657  Combine input records=1  Combine output records=1  Reduce input groups=1  Reduce shuffle bytes=55  Reduce input records=1  Reduce output records=1  Spilled Records=2  Shuffled Maps =6  Failed Shuffles=0  Merged Map outputs=6  GC time elapsed (ms)=2359  CPU time spent (ms)=0  Physical memory (bytes) snapshot=0  Virtual memory (bytes) snapshot=0  Total committed heap usage (bytes)=1106096128 Shuffle Errors  BAD_ID=0  CONNECTION=0  IO_ERROR=0  WRONG_LENGTH=0  WRONG_MAP=0  WRONG_REDUCE=0 File Input Format Counters   Bytes Read=15649 File Output Format Counters   Bytes Written=12314/10/08 05:57:51 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized14/10/08 
05:57:51 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).14/10/08 05:57:51 INFO input.FileInputFormat: Total input paths to process : 114/10/08 05:57:51 INFO mapreduce.JobSubmitter: number of splits:114/10/08 05:57:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local571678604_000214/10/08 05:57:51 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root571678604/.staging/job_local571678604_0002/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.14/10/08 05:57:51 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/staging/root571678604/.staging/job_local571678604_0002/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.14/10/08 05:57:52 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local571678604_0002/job_local571678604_0002.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.14/10/08 05:57:52 WARN conf.Configuration: file:/tmp/hadoop-root/mapred/local/localRunner/root/job_local571678604_0002/job_local571678604_0002.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.14/10/08 05:57:52 INFO mapreduce.Job: The url to track the job: http://localhost:8080/14/10/08 05:57:52 INFO mapreduce.Job: Running job: job_local571678604_000214/10/08 05:57:52 INFO mapred.LocalJobRunner: OutputCommitter set in config null14/10/08 05:57:52 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter14/10/08 05:57:52 INFO mapred.LocalJobRunner: Waiting for map tasks14/10/08 05:57:52 INFO mapred.LocalJobRunner: Starting task: attempt_local571678604_0002_m_000000_014/10/08 05:57:52 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]14/10/08 05:57:52 INFO mapred.MapTask: Processing split: 
file:/opt/hadoop-2.5.1/grep-temp-913340630/part-r-00000:0+11114/10/08 05:57:52 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer14/10/08 05:57:52 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)14/10/08 05:57:52 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 10014/10/08 05:57:52 INFO mapred.MapTask: soft limit at 8388608014/10/08 05:57:52 INFO mapred.MapTask: bufstart = 0; bufvoid = 10485760014/10/08 05:57:52 INFO mapred.MapTask: kvstart = 26214396; length = 655360014/10/08 05:57:52 INFO mapred.LocalJobRunner: 14/10/08 05:57:52 INFO mapred.MapTask: Starting flush of map output14/10/08 05:57:52 INFO mapred.MapTask: Spilling map output14/10/08 05:57:52 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 10485760014/10/08 05:57:52 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/655360014/10/08 05:57:52 INFO mapred.MapTask: Finished spill 014/10/08 05:57:52 INFO mapred.Task: Task:attempt_local571678604_0002_m_000000_0 is done. 
And is in the process of committing14/10/08 05:57:52 INFO mapred.LocalJobRunner: map14/10/08 05:57:52 INFO mapred.Task: Task 'attempt_local571678604_0002_m_000000_0' done.14/10/08 05:57:52 INFO mapred.LocalJobRunner: Finishing task: attempt_local571678604_0002_m_000000_014/10/08 05:57:52 INFO mapred.LocalJobRunner: map task executor complete.14/10/08 05:57:52 INFO mapred.LocalJobRunner: Waiting for reduce tasks14/10/08 05:57:52 INFO mapred.LocalJobRunner: Starting task: attempt_local571678604_0002_r_000000_014/10/08 05:57:52 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]14/10/08 05:57:52 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@63ae8b5c14/10/08 05:57:52 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=1014/10/08 05:57:52 INFO reduce.EventFetcher: attempt_local571678604_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events14/10/08 05:57:52 INFO reduce.LocalFetcher: localfetcher#2 about to shuffle output of map attempt_local571678604_0002_m_000000_0 decomp: 21 len: 25 to MEMORY14/10/08 05:57:52 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local571678604_0002_m_000000_014/10/08 05:57:52 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->2114/10/08 05:57:52 INFO reduce.EventFetcher: EventFetcher is interrupted.. 
Returning14/10/08 05:57:52 INFO mapred.LocalJobRunner: 1 / 1 copied.14/10/08 05:57:52 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs14/10/08 05:57:52 INFO mapred.Merger: Merging 1 sorted segments14/10/08 05:57:52 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes14/10/08 05:57:52 INFO reduce.MergeManagerImpl: Merged 1 segments, 21 bytes to disk to satisfy reduce memory limit14/10/08 05:57:52 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk14/10/08 05:57:52 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce14/10/08 05:57:52 INFO mapred.Merger: Merging 1 sorted segments14/10/08 05:57:52 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes14/10/08 05:57:52 INFO mapred.LocalJobRunner: 1 / 1 copied.14/10/08 05:57:52 INFO mapred.Task: Task:attempt_local571678604_0002_r_000000_0 is done. And is in the process of committing14/10/08 05:57:52 INFO mapred.LocalJobRunner: 1 / 1 copied.14/10/08 05:57:52 INFO mapred.Task: Task attempt_local571678604_0002_r_000000_0 is allowed to commit now14/10/08 05:57:52 INFO output.FileOutputCommitter: Saved output of task 'attempt_local571678604_0002_r_000000_0' to file:/opt/hadoop-2.5.1/output/_temporary/0/task_local571678604_0002_r_00000014/10/08 05:57:52 INFO mapred.LocalJobRunner: reduce > reduce14/10/08 05:57:52 INFO mapred.Task: Task 'attempt_local571678604_0002_r_000000_0' done.14/10/08 05:57:52 INFO mapred.LocalJobRunner: Finishing task: attempt_local571678604_0002_r_000000_014/10/08 05:57:52 INFO mapred.LocalJobRunner: reduce task executor complete.14/10/08 05:57:53 INFO mapreduce.Job: Job job_local571678604_0002 running in uber mode : false14/10/08 05:57:53 INFO mapreduce.Job:  map 100% reduce 100%14/10/08 05:57:53 INFO mapreduce.Job: Job job_local571678604_0002 completed successfully14/10/08 05:57:53 INFO mapreduce.Job: Counters: 33 File System 
Counters  FILE: Number of bytes read=39892  FILE: Number of bytes written=913502  FILE: Number of read operations=0  FILE: Number of large read operations=0  FILE: Number of write operations=0 Map-Reduce Framework  Map input records=1  Map output records=1  Map output bytes=17  Map output materialized bytes=25  Input split bytes=120  Combine input records=0  Combine output records=0  Reduce input groups=1  Reduce shuffle bytes=25  Reduce input records=1  Reduce output records=1  Spilled Records=2  Shuffled Maps =1  Failed Shuffles=0  Merged Map outputs=1  GC time elapsed (ms)=37  CPU time spent (ms)=0  Physical memory (bytes) snapshot=0  Virtual memory (bytes) snapshot=0  Total committed heap usage (bytes)=250560512 Shuffle Errors  BAD_ID=0  CONNECTION=0  IO_ERROR=0  WRONG_LENGTH=0  WRONG_MAP=0  WRONG_REDUCE=0 File Input Format Counters   Bytes Read=123 File Output Format Counters   Bytes Written=23
OK, it finally worked. Both local jobs completed successfully, and the log shows the final result was committed under file:/opt/hadoop-2.5.1/output.
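For readers wondering what these two jobs actually computed: the bundled grep example runs one MapReduce job that counts every distinct match of a regex across the input files, followed by a second job that sorts the results by count. A rough local sketch of that logic in Python (not Hadoop code; the regex `dfs[a-z.]+` is the one used in the official single-node tutorial, and the sample lines here are made up for illustration):

```python
import re
from collections import Counter

def grep_job(lines, pattern):
    """Mimic Hadoop's grep example: count matches, then sort by count descending."""
    counts = Counter()
    for line in lines:
        # "Map" phase: emit every regex match in the line; Counter does the "reduce" sum.
        counts.update(re.findall(pattern, line))
    # Second job: sort (match, count) pairs by count, descending.
    return sorted(counts.items(), key=lambda kv: -kv[1])

sample = [
    "<name>dfs.replication</name>",
    "the dfsadmin command",
]
print(grep_job(sample, r"dfs[a-z.]+"))
# [('dfs.replication', 1), ('dfsadmin', 1)]
```

On the real cluster the equivalent check is simply `cat output/*` after the job finishes, which prints each match and its count.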

