在使用hadoop2.2.0 的 fairscheduler的时候,出现了下面的一个问题:
当多个客户端提交任务的时候,发现生成的appatempt 没有进入fairscheduler的 eventQueue,导致fairscheduler没有对该任务进行调度,而当am向scheduler请求这个作业的信息时,出现下面的问题,而且是打了很多这样的log:
2013-11-27 14:27:02,258 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1122_000001
2013-11-27 14:27:02,258 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Request for appInfo of unknown attemptappattempt_1384743376038_1121_000001
仔细查找log中的蛛丝马迹,发现没有出现调度器调度的log:
正常作业调度的log记录:
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 1120 submitted by user root
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1384743376038_1120
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root IP=192.168.24.101 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1384743376038_1120
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1384743376038_1120 State change from NEW to NEW_SAVING
2013-11-27 14:25:36,515 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1384743376038_1120
2013-11-27 14:25:36,516 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.R