Hadoop TaskTracker Self-Check Mechanism
Mechanism: at startup the TaskTracker schedules a periodic health check; by design, the result of each check is derived from the exceptions caught while running the script.
|-->TaskTracker.run()
    |-->initialize();
        |-->if (shouldStartHealthMonitor(this.fConf))
            |-->startHealthMonitor(this.fConf);
                |-->healthChecker = new NodeHealthCheckerService(conf);
                    |-->initialize(conf);
                        |-->this.nodeHealthScript = conf.get(HEALTH_CHECK_SCRIPT_PROPERTY);    |mapred.healthChecker.script.path
                        |-->timer = new NodeHealthMonitorExecutor(args);    |initialize the executor
                |-->healthChecker.start();
                    |-->if (!shouldRun(conf))    |check whether a health check is configured
                    |-->nodeHealthScriptScheduler = new Timer("NodeHealthMonitor-Timer", true);
                    |-->nodeHealthScriptScheduler.scheduleAtFixedRate(timer, 0, intervalTime);    |runs the executor described below
|-->NodeHealthMonitorExecutor
    |-->construct()    |constructor
        |-->execScript.add(nodeHealthScript);
        |-->shexec = new ShellCommandExecutor((String[]) execScript.toArray(new String[execScript.size()]), null, null, scriptTimeout);
    |-->run()    |executed on the executor (timer) thread
        |-->HealthCheckerExitStatus status = HealthCheckerExitStatus.SUCCESS;    |status stays SUCCESS if no exception is thrown
        |-->shexec.execute();    |the timer runs the health-check script
        |-->Exception    |by design, the exception handlers determine the check status
        |-->reportHealthStatus(status);
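
The flow traced above can be approximated outside Hadoop with a small sketch. It is not the Hadoop implementation: HealthCheckSketch, Status, NonZeroExitException, execute, runOnce and start are hypothetical names, and ProcessBuilder stands in for ShellCommandExecutor. It assumes Java 9 or later and a POSIX `sh` on the PATH. What it tries to show is the pattern itself: a daemon Timer fires a task at a fixed rate, the status starts as SUCCESS, and only the exception handlers change it before it is reported. The sketch does not inspect the script's output.

```java
import java.io.IOException;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

// Hypothetical stand-in for NodeHealthCheckerService; not a Hadoop class.
class HealthCheckSketch {
    // Illustrative statuses, loosely modelled on HealthCheckerExitStatus.
    enum Status { SUCCESS, FAILED_WITH_EXIT_CODE, FAILED_WITH_EXCEPTION, TIMED_OUT }

    // Hypothetical exception signalling a non-zero exit code.
    static class NonZeroExitException extends IOException {
        NonZeroExitException(int code) { super("exit code " + code); }
    }

    // Runs the command and throws if it cannot start, times out, or exits non-zero.
    static void execute(String[] command, long timeoutMillis)
            throws IOException, InterruptedException, TimeoutException {
        Process p = new ProcessBuilder(command)
                .redirectErrorStream(true)                        // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // output is ignored in this sketch
                .start();
        if (!p.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
            p.destroyForcibly();
            throw new TimeoutException("script timed out");
        }
        if (p.exitValue() != 0) {
            throw new NonZeroExitException(p.exitValue());
        }
    }

    // One check: the status is SUCCESS unless an exception handler overrides it.
    static Status runOnce(String[] command, long timeoutMillis) {
        Status status = Status.SUCCESS;
        try {
            execute(command, timeoutMillis);
        } catch (NonZeroExitException e) {
            status = Status.FAILED_WITH_EXIT_CODE;
        } catch (TimeoutException e) {
            status = Status.TIMED_OUT;
        } catch (Exception e) {
            status = Status.FAILED_WITH_EXCEPTION;
        }
        return status;
    }

    // Schedules the check immediately and then at a fixed rate on a daemon timer thread.
    static Timer start(String[] command, long intervalMillis, long timeoutMillis,
                       Consumer<Status> reporter) {
        Timer scheduler = new Timer("NodeHealthMonitor-Timer", true);
        scheduler.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                reporter.accept(runOnce(command, timeoutMillis));
            }
        }, 0, intervalMillis);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        // Report the status of a trivially healthy "script" about once per second.
        Timer t = start(new String[] {"sh", "-c", "exit 0"}, 1000, 500,
                status -> System.out.println("health: " + status));
        Thread.sleep(2500);
        t.cancel();
    }
}
```

Because the timer thread is created as a daemon, it does not keep the JVM alive on its own, which is why the usage example sleeps before cancelling the timer.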