网上有很多mapreduce的源码解析以及执行过程,这里从相关的日志角度看mapreduce的执行过程,以便于在通过日志分析定位分析问题
先简单说一下mapreduce的执行过程,如下图:
1、应用程序客户端client准备作业(数据split划分,作业配置等),并上传值HDFS中;并提交作业至JobTracker。
2、JobTracker根据job的配置信息(jobconf,split等),创建Tasks,并放到任务队列中,等待调度,JobTracker会启动JobInProcess跟踪job的执行过程。
3、TaskTracker通过心跳到JobTracker获取相关的task,TaskTracker首先进行JOB_SETUP会从HDFS中获取执行的jar,
然后对于获取的每一个task开启一个taskinprocess跟踪task的执行过程,TaskTracker会启动一个JVM执行map、reduce的相关逻辑。
下面结合JobTracker和TaskTracker的日志进行分析关键执行路径
hadoop-root-jobtracker-xxxxxx.log
#表示这个job包含三个map和1个reduce任务,jobid为job_201212191009_0012
2012-12-19 13:46:40,478 INFO org.apache.hadoop.mapred.JobInProgress: job_201212191009_0012: nMaps=3 nReduces=1 max=-1
2012-12-19 13:46:40,479 INFO org.apache.hadoop.mapred.JobTracker: Job job_201212191009_0012 added successfully for user 'root' to queue 'default'
2012-12-19 13:46:40,479 INFO org.apache.hadoop.mapred.AuditLogger: USER=root IP=10.1.1.28 OPERATION=SUBMIT_JOB
TARGET=job_201212191009_0012 RESULT=SUCCESS
#job的初始化,根据job的相关的配置信息(根据件job.jar,job.split,和job.xml,生成三个map task,一个reduce task,放到调度队列中)
2012-12-19 13:46:40,479 INFO org.apache.hadoop.mapred.JobTracker: Initializing job_201212191009_0012
2012-12-19 13:46:40,479 INFO org.apache.hadoop.mapred.JobInProgress: Initializing job_201212191009_0012
2012-12-19 13:46:40,525 INFO org.apache.hadoop.mapred.JobInProgress: jobToken generated and stored with users keys
in /var/hadoop2/hadoop-tmp/mapred/system/job_201212191009_0012/jobToken
2012-12-19 13:46:40,529 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_201212191009_0012 = 525769586. Number of splits = 3
2012-12-19 13:46:40,529 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201212191009_0012_m_000000has split on node:/default-rack/wxlab29
2012-12-19 13:46:40,529 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201212191009_0012_m_000001has split on node:/default-rack/wxlab29
2012-12-19 13:46:40,530 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201212191009_0012_m_000002has split on node:/default-rack/wxlab29
2012-12-19 13:46:40,530 INFO org.apache.hadoop.mapred.JobInProgress: job_201212191009_0012 LOCALITY_WAIT_FACTOR=1.0
2012-12-19 13:46:40,530 INFO org.apache.hadoop.mapred.JobInProgress: Job job_201212191009_0012 initialized successfully with 3 map tasks and 1 reduce tasks.
#增加一个JOB_SETUP的task给tasktracker节点做job的设置,比如从HDFS中获取执行的jar文件等,每个tasktracker在执行一个job开始时都必须进行JOB_SETUP
2012-12-19 13:46:40,831 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201212191009_0012_m_000004_0'
to tiptask_201212191009_0012_m_000004,
for tracker 'tracker_wxlab29:localhost.localdomain/127.0.0.1:28696'
2012-12-19 13:46:49,838 INFO org.apache.hadoop.mapred.JobInProgress:
Task 'attempt_201212191009_0012_m_000004_0' has completed task_201212191009_0012_m_000004 successfully.
#tasktraker心跳到jobtracker后,获取分配的task;hadoop默认情况下,每个tasktracker只有2个slots,所以只能同时分配2个map task给该taskNode执行
2012-12-19 13:46:49,840 INFO org.apache.hadoop.mapred.JobTracker: Adding task (MAP)
'attempt_201212191009_0012_m_000000_0' to tip task_201212191009_0012_m_000000, for tracker 'tracker_wxlab29:localhost.localdomain/127.0.0.1:28696'
2012-12-19 13:46:49,840 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_201212191009_0012_m_000000
2012-12-19 13:46:49,841 INFO org.apache.hadoop.mapred.JobTracker: Adding task (MAP)
'attempt_201212191009_0012_m_000001_0' to tiptask_201212191009_0012_m_000001, for tracker 'tracker_wxlab29:localhost.localdomain/127.0.0.1:28696'
2012-12-19 13:46:49,841 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_201212191009_0012_m_000001
#执行完成一个task后,空余一个slot,分配下一map task给该节点执行
2012-12-19 13:47:04,850 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201212191009_0012_m_000001_0'
has completedtask_201212191009_0012_m_000001 successfully.
2012-12-19 13:47:04,852 INFO org.apache.hadoop.mapred.JobTracker: Adding task (MAP) 'attempt_201212191009_0012_m_000002_0'
to tiptask_201212191009_0012_m_000002, for tracker 'tracker_wxlab29:localhost.localdomain/127.0.0.1:28696'
2012-12-19 13:47:04,852 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_201212191009_0012_m_000002
#taskNode map任务执行完成后,执行reduce task
2012-12-19 13:47:04,852 INFO org.apache.hadoop.mapred.JobTracker: Adding task (REDUCE) 'attempt_201212191009_0012_r_000000_0'
to tiptask_201212191009_0012_r_000000, for tracker 'tracker_wxlab29:localhost.localdomain/127.0.0.1:28696'
2012-12-19 13:47:07,856 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201212191009_0012_m_000000_0'
has completedtask_201212191009_0012_m_000000 successfully.
2012-12-19 13:47:10,860 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201212191009_0012_m_000002_0'
has completedtask_201212191009_0012_m_000002 successfully.
2012-12-19 13:47:22,867 INFO org.apache.hadoop.mapred.JobInProgress: Tas