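1. Introduction
ContainerExecutor is the abstract class that models the mechanism used to launch containers on the underlying operating system. Every executor must extend ContainerExecutor.
ContainerExecutor interacts with the underlying operating system: it places the files and directories a container needs in a secure location, and then starts and cleans up the container's processes in a secure way. YARN currently provides two implementations, DefaultContainerExecutor and LinuxContainerExecutor. DefaultContainerExecutor is the default; it provides no security measures and starts and stops containers as the user who launched the NodeManager. LinuxContainerExecutor starts and stops containers as the application owner, which is more secure, and it also lets users isolate CPU resources through Cgroups.
2. Fields
// Wildcard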
protected static final String WILDCARD = "*";
/**
* The permissions to use when creating the launch script: 0700.
*/
public static final FsPermission TASK_LAUNCH_SCRIPT_PERMISSION = FsPermission.createImmutable((short)0700);
/**
* The relative path to which debug information will be written.
*
* @see ShellScriptBuilder#listDebugInformation
*/
public static final String DIRECTORY_CONTENTS = "directory.info";
// Configuration
private Configuration conf;
// Maps each ContainerId to the path of its pid file
private final ConcurrentMap<ContainerId, Path> pidFiles = new ConcurrentHashMap<>();
// Reentrant read/write lock
private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
private final ReadLock readLock = lock.readLock();
private final WriteLock writeLock = lock.writeLock();
// Whitelisted variables: environment variables that containers may override; when the user sets one, the NodeManager's default value is not used.
private String[] whitelistVars;
3. ExitCode
An enum of container exit codes.
Status | Code | Description |
---|---|---|
SUCCESS | 0 | Success |
FORCE_KILLED | 137 | Forcibly killed |
TERMINATED | 143 | Terminated |
LOST | 154 | Lost |
4. Signal
An enum of signal constants.
Name | Value | Signal string | Description |
---|---|---|---|
NULL | 0 | NULL | No operation |
QUIT | 3 | SIGQUIT | Quit |
KILL | 9 | SIGKILL | Kill |
TERM | 15 | SIGTERM | Terminate |
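The ExitCode and Signal values are related by a common shell convention: a process ended by signal N is reported with exit status 128 + N, which is why FORCE_KILLED is 137 (128 + 9) and TERMINATED is 143 (128 + 15). The sketch below only illustrates that arithmetic; `SignalSketch` and `shellExitStatus` are hypothetical names, not part of the Hadoop enum.

```java
// Hypothetical illustration of the Signal table; not the Hadoop Signal enum itself.
enum SignalSketch {
  NULL(0, "NULL"),
  QUIT(3, "SIGQUIT"),
  KILL(9, "SIGKILL"),
  TERM(15, "SIGTERM");

  private final int value;
  private final String str;

  SignalSketch(int value, String str) {
    this.value = value;
    this.str = str;
  }

  int getValue() { return value; }

  String getStr() { return str; }

  // Shell convention: a process ended by signal N reports exit status 128 + N.
  // Only meaningful for real signals (value > 0).
  int shellExitStatus() { return 128 + value; }
}

class SignalSketchDemo {
  public static void main(String[] args) {
    for (SignalSketch s : SignalSketch.values()) {
      if (s.getValue() > 0) {
        System.out.println(s.getStr() + " -> exit status " + s.shellExitStatus());
      }
    }
  }
}
```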
5. DelayedProcessKiller
Extends Thread. This class sends a signal to the target container after a specified delay.
5.1. Fields
// The container
private final Container container;
// The user
private final String user;
// The process ID
private final String pid;
// Delay in milliseconds
private final long delay;
// The signal to send
private final Signal signal;
// The container's executor
private final ContainerExecutor containerExecutor;
5.2. Constructor
public DelayedProcessKiller(Container container, String user, String pid,
long delayMS, Signal signal, ContainerExecutor containerExecutor) {
this.container = container;
this.user = user;
this.pid = pid;
this.delay = delayMS;
this.signal = signal;
this.containerExecutor = containerExecutor;
setName("Task killer for " + pid);
setDaemon(false);
}
5.3. The run method
@Override
public void run() {
try {
// Sleep for the configured delay (milliseconds)
Thread.sleep(delay);
// Send the signal
containerExecutor.signalContainer(new ContainerSignalContext.Builder()
.setContainer(container)
.setUser(user)
.setPid(pid)
.setSignal(signal)
.build());
} catch (InterruptedException e) {
interrupt();
} catch (IOException e) {
String message = "Exception when user " + user + " killing task " + pid
+ " in DelayedProcessKiller: " + StringUtils.stringifyException(e);
LOG.warn(message);
// Report the failure to the container as a diagnostics update event
container.handle(new ContainerDiagnosticsUpdateEvent(
container.getContainerId(), message));
}
}
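The sleep-then-act pattern used by DelayedProcessKiller can be tried out without any Hadoop classes. The following is a minimal sketch under stated assumptions: `DelayedActionSketch` and `runAndWait` are hypothetical names, and a plain `Runnable` stands in for the `signalContainer` call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for DelayedProcessKiller: waits, then runs an action.
class DelayedActionSketch extends Thread {
  private final long delayMs;
  private final Runnable action;

  DelayedActionSketch(String target, long delayMs, Runnable action) {
    this.delayMs = delayMs;
    this.action = action;
    setName("Task killer for " + target);
    setDaemon(false);
  }

  @Override
  public void run() {
    try {
      // Wait for the configured delay, then perform the action.
      Thread.sleep(delayMs);
      action.run();
    } catch (InterruptedException e) {
      // Restore the interrupt flag; the action is skipped.
      interrupt();
    }
  }

  // Starts a delayed action, waits for the thread, and reports whether it fired.
  static boolean runAndWait(long delayMs) {
    AtomicBoolean fired = new AtomicBoolean(false);
    DelayedActionSketch t =
        new DelayedActionSketch("12345", delayMs, () -> fired.set(true));
    t.start();
    try {
      t.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return fired.get();
  }

  public static void main(String[] args) {
    System.out.println("action fired: " + runAndWait(50));
  }
}
```

Interrupting the thread before the delay elapses skips the action, which gives callers a simple way to cancel a pending kill.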
6. Methods
6.1. setConf
- Stores the conf configuration.
- Loads the whitelist:
JAVA_HOME , HADOOP_COMMON_HOME , HADOOP_HDFS_HOME , HADOOP_CONF_DIR , CLASSPATH_PREPEND_DISTCACHE , HADOOP_YARN_HOME
@Override
public void setConf(Configuration conf) {
this.conf = conf;
if (conf != null) {
// Environment variables that containers may override rather than use NodeManager's default
// yarn.nodemanager.env-whitelist
// JAVA_HOME , HADOOP_COMMON_HOME , HADOOP_HDFS_HOME , HADOOP_CONF_DIR , CLASSPATH_PREPEND_DISTCACHE , HADOOP_YARN_HOME
whitelistVars = conf.get(YarnConfiguration.NM_ENV_WHITELIST,
YarnConfiguration.DEFAULT_NM_ENV_WHITELIST).split(",");
}
}
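setConf splits the whitelist with a plain `split(",")`, and `String.split` does not trim whitespace around the separators. The sketch below shows what that means for a value written with spaces; `WhitelistSketch`, `splitRaw`, and `splitTrimmed` are hypothetical helpers for illustration, and the trimmed variant is not something the source code does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WhitelistSketch {
  // Same call shape as in setConf: entries keep any surrounding whitespace.
  static String[] splitRaw(String csv) {
    return csv.split(",");
  }

  // Hypothetical variant: trims each entry and drops empty ones.
  static List<String> splitTrimmed(String csv) {
    List<String> out = new ArrayList<>();
    for (String s : csv.split(",")) {
      String t = s.trim();
      if (!t.isEmpty()) {
        out.add(t);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    String csv = "JAVA_HOME, HADOOP_CONF_DIR ,PATH";
    System.out.println(Arrays.toString(splitRaw(csv)));
    System.out.println(splitTrimmed(csv));
  }
}
```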
6.2. init
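Runs the executor's initialization steps.
Verifies that the necessary configuration and permissions are in place.
6.3. localizeClasspathJar ?
Localizes a JAR file on demand.
On Windows, ContainerLaunch creates a temporary special JAR manifest of other JARs to work around the classpath length limit.
In a secure cluster, this JAR must be localized so that the container can access it.
The default implementation returns the classpath passed to it, which is expected to have been created in the NodeManager's fprivate folder; this does not work for secure Windows clusters.
6.4. startLocalizer
Prepares the execution environment for the containers of this application.
6.5. prepareContainer
Prepares the container before its launch environment is written.
6.6. launchContainer
Launches the container on the node. This is a blocking call that returns only when the container exits.
6.7. relaunchContainer
Relaunches the container on the node. This is a blocking call that returns only when the container exits.
6.8. signalContainer
Sends a signal to the container.
6.9. symLink
Creates a symbolic link file that points to the target.
6.10. isContainerAlive
Checks whether the container is alive.
6.11. reacquireContainer
Recovers an already existing container.
This is a blocking call that returns only when the container exits.
Note that the container must have been activated before this call.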
/**
* Recover an already existing container. This is a blocking call and returns
* only when the container exits. Note that the container must have been
* activated prior to this call.
*
* @param ctx encapsulates information necessary to reacquire container
* @return The exit code of the pre-existing container
* @throws IOException if there is a failure while reacquiring the container
* @throws InterruptedException if interrupted while waiting to reacquire
* the container
*/
public int reacquireContainer(ContainerReacquisitionContext ctx)
throws IOException, InterruptedException {
// Get the container
Container container = ctx.getContainer();
// Get the user
String user = ctx.getUser();
// Get the ContainerId
ContainerId containerId = ctx.getContainerId();
// Look up the pid file path; null means the container is not active
Path pidPath = getPidFilePath(containerId);
if (pidPath == null) {
LOG.warn(containerId + " is not active, returning terminated error");
return ExitCode.TERMINATED.getExitCode();
}
// Read the pid from the pid file
String pid = ProcessIdFileReader.getProcessId(pidPath);
if (pid == null) {
throw new IOException("Unable to determine pid for " + containerId);
}
LOG.info("Reacquiring " + containerId + " with pid " + pid);
// Build the ContainerLivenessContext
ContainerLivenessContext livenessContext = new ContainerLivenessContext
.Builder()
.setContainer(container)
.setUser(user)
.setPid(pid)
.build();
while (isContainerAlive(livenessContext)) {
Thread.sleep(1000);
}
// wait for exit code file to appear
final int sleepMsec = 100;
int msecLeft = 2000;
// Derive the exit code file path from the pid file path
String exitCodeFile = ContainerLaunch.getExitCodeFile(pidPath.toString());
File file = new File(exitCodeFile);
while (!file.exists() && msecLeft >= 0) {
if (!isContainerActive(containerId)) {
LOG.info(containerId + " was deactivated");
return ExitCode.TERMINATED.getExitCode();
}
Thread.sleep(sleepMsec);
msecLeft -= sleepMsec;
}
if (msecLeft < 0) {
throw new IOException("Timeout while waiting for exit code from "
+ containerId);
}
try {
return Integer.parseInt(
FileUtils.readFileToString(file, Charset.defaultCharset()).trim());
} catch (NumberFormatException e) {
throw new IOException("Error parsing exit code from pid " + pid, e);
}
}
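The second half of reacquireContainer is a bounded polling loop: wait for the exit code file to appear, give up after a fixed budget, then parse the file as an integer. Below is a minimal sketch of just that loop, assuming a plain local file; `ExitCodePollSketch`, `waitForExitCode`, and `demo` are hypothetical names, and the container-liveness and deactivation checks are left out.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class ExitCodePollSketch {
  // Polls until the file exists or the time budget is used up, then parses it.
  static int waitForExitCode(File file, int timeoutMs, int sleepMs)
      throws IOException, InterruptedException {
    int msecLeft = timeoutMs;
    while (!file.exists() && msecLeft >= 0) {
      Thread.sleep(sleepMs);
      msecLeft -= sleepMs;
    }
    if (msecLeft < 0) {
      throw new IOException("Timeout while waiting for " + file);
    }
    String text =
        new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8).trim();
    try {
      return Integer.parseInt(text);
    } catch (NumberFormatException e) {
      throw new IOException("Error parsing exit code from " + file, e);
    }
  }

  // Demo helper: writes a code to a temp file and reads it back; -1 on failure.
  static int demo(int code) {
    try {
      File f = File.createTempFile("exitcode", ".txt");
      f.deleteOnExit();
      Files.write(f.toPath(), (code + "\n").getBytes(StandardCharsets.UTF_8));
      return waitForExitCode(f, 2000, 100);
    } catch (IOException e) {
      return -1;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return -1;
    }
  }

  public static void main(String[] args) {
    System.out.println("exit code: " + demo(143));
  }
}
```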
6.12. writeLaunchEnv
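Writes the container's launch environment to the specified path.
- Steps
1. Write the script header.
2. Enable fail-fast and exit-status checking (set -o pipefail -e).
3. Redirect stdout to the pre-launch output log.
4. Redirect stderr to the pre-launch error log.
5. Set the environment variables.
6. Set up the resource files (symlinks to the required JARs and configuration files, created with ln -sf).
7. Copy debugging information, if enabled.
8. Record the directory contents.
9. Write the launch command.
/**
* This method writes out the launch environment of a container to a specified path.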
*
* @param out the output stream to which the environment is written (usually
* a script file which will be executed by the Launcher)
* @param environment the environment variables and their values
* @param resources the resources which have been localized for this
* container. Symlinks will be created to these localized resources
* @param command the command that will be run
* @param logDir the log dir to which to copy debugging information
* @param user the username of the job owner
* @param outFilename the path to which to write the launch environment
* @param nmVars the set of environment vars that are explicitly set by NM
* @throws IOException if any errors happened writing to the OutputStream,
* while creating symlinks
*/
@VisibleForTesting
public void writeLaunchEnv(OutputStream out, Map<String, String> environment,
Map<Path, List<String>> resources, List<String> command, Path logDir,
String user, String outFilename, LinkedHashSet<String> nmVars)
throws IOException {
ContainerLaunch.ShellScriptBuilder sb = ContainerLaunch.ShellScriptBuilder.create();
// Generated output: script header
// #!/bin/bash
// Add "set -o pipefail -e" to validate launch_container script.
sb.setExitOnFailure();
// Generated output: fail fast and check exit codes
// set -o pipefail -e
//Redirect stdout and stderr for launch_container script
sb.stdout(logDir, CONTAINER_PRE_LAUNCH_STDOUT);
// Generated output: redirect stdout to the pre-launch log
// export PRELAUNCH_OUT="/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/prelaunch.out"
// exec >"${PRELAUNCH_OUT}"
sb.stderr(logDir, CONTAINER_PRE_LAUNCH_STDERR);
// Generated output: redirect stderr to the pre-launch error log
// export PRELAUNCH_ERR="/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/prelaunch.err"
// exec 2>"${PRELAUNCH_ERR}"
if (environment != null) {
sb.echo("Setting up env variables");
// Generated output: environment variable setup
// echo "Setting up env variables"
// Whitelist environment variables are treated specially.
// Only add them if they are not already defined in the environment.
// Add them using special syntax to prevent them from eclipsing variables that may be set explicitly in the container image (e.g, in a docker image).
// Put these before the others to ensure the correct expansion is used.
sb.echo("Setting up env variables#whitelistVars");
for(String var : whitelistVars) {
if (!environment.containsKey(var)) {
String val = getNMEnvVar(var);
if (val != null) {
sb.whitelistedEnv(var, val);
}
}
}
// Generated output: whitelisted environment variables
// echo "Setting up env variables#whitelistVars"
// export JAVA_HOME=${JAVA_HOME:-"/Library/java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home"}
// export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/opt/workspace/apache/hadoop-3.2.1-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/../../../../hadoop-common-project/hadoop-common/target"}
// export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/opt/tools/hadoop-3.2.1/etc/hadoop"}
// export HADOOP_HOME=${HADOOP_HOME:-"/opt/workspace/apache/hadoop-3.2.1-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/../../../../hadoop-common-project/hadoop-common/target"}
// export PATH=${PATH:-"/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/tools/apache-maven-3.6.3/bin:/opt/tools/scala-2.12.10/bin:/usr/local/mysql-5.7.28-macos10.14-x86_64/bin:/Library/java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home/bin:/opt/tools/hadoop-3.2.1/bin:/opt/tools/hadoop-3.2.1/etc/hadoop:henghe:/opt/tools/ozone-1.0.0/bin:/opt/tools/spark-2.4.5/bin:/opt/tools/spark-2.4.5/conf:/opt/tools/redis-5.0.7/src:/opt/tools/datax/bin:/opt/tools/apache-ant-1.9.6/bin:/opt/tools/hbase-2.0.2/bin"}
sb.echo("Setting up env variables#env");
// Now write vars that were set explicitly by nodemanager, preserving the order they were written in.
for (String nmEnvVar : nmVars) {
sb.env(nmEnvVar, environment.get(nmEnvVar));
}
// Generated output: variables set explicitly by the NodeManager
// echo "Setting up env variables#env"
// export HADOOP_TOKEN_FILE_LOCATION="/opt/tools/hadoop-3.2.1/local-dirs/usercache/henghe/appcache/application_1611681788558_0001/container_1611681788558_0001_01_000001/container_tokens"
// export CONTAINER_ID="container_1611681788558_0001_01_000001"
// export NM_PORT="62016"
// export NM_HOST="boyi-pro.lan"
// export NM_HTTP_PORT="8042"
// export LOCAL_DIRS="/opt/tools/hadoop-3.2.1/local-dirs/usercache/henghe/appcache/application_1611681788558_0001"
// export LOCAL_USER_DIRS="/opt/tools/hadoop-3.2.1/local-dirs/usercache/henghe/"
// export LOG_DIRS="/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001"
// export USER="henghe"
// export LOGNAME="henghe"
// export HOME="/home/"
// export PWD="/opt/tools/hadoop-3.2.1/local-dirs/usercache/henghe/appcache/application_1611681788558_0001/container_1611681788558_0001_01_000001"
// export JVM_PID="$$"
// export MALLOC_ARENA_MAX="4"
sb.echo("Setting up env variables#remaining");
// Now write the remaining environment variables.
for (Map.Entry<String, String> env : sb.orderEnvByDependencies(environment).entrySet()) {
if (!nmVars.contains(env.getKey())) {
sb.env(env.getKey(), env.getValue());
}
}
// Generated output: remaining environment variables
// echo "Setting up env variables#remaining"
// export SPARK_YARN_STAGING_DIR="hdfs://localhost:8020/user/henghe/.sparkStaging/application_1611681788558_0001"
// export APPLICATION_WEB_PROXY_BASE="/proxy/application_1611681788558_0001"
// export CLASSPATH="$PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*:$PWD/__spark_conf__/__hadoop_conf__"
// export APP_SUBMIT_TIME_ENV="1611681915166"
// export SPARK_USER="henghe"
// export PYTHONHASHSEED="0"
}
if (resources != null) {
sb.echo("Setting up job resources");
Map<Path, Path> symLinks = resolveSymLinks(resources, user);
for (Map.Entry<Path, Path> symLink : symLinks.entrySet()) {
// Emit a symlink command for each localized resource
sb.symlink(symLink.getKey(), symLink.getValue());
}
// Generated output: resource setup (ln -sf creates symlinks to the required JARs and configuration files)
// echo "Setting up job resources"
// mkdir -p __spark_libs__
// ln -sf -- "/opt/tools/hadoop-3.2.1/local-dirs/usercache/henghe/filecache/35/spark-examples_2.11-2.4.5.jar" "__app__.jar"
// mkdir -p __spark_libs__
// ln -sf -- "/opt/tools/hadoop-3.2.1/local-dirs/usercache/henghe/filecache/180/__spark_conf__.zip" "__spark_conf__"
// # (many similar entries omitted)
// # mkdir -p __spark_libs__
// # ln -sf -- "xxxxx" "__spark_libs__/xxxxx.jar"
}
// dump debugging information if configured
if (getConf() != null && getConf().getBoolean(YarnConfiguration.NM_LOG_CONTAINER_DEBUG_INFO, YarnConfiguration.DEFAULT_NM_LOG_CONTAINER_DEBUG_INFO)) {
// Copy debugging information into the log directory
sb.echo("Copying debugging information");
sb.copyDebugInformation(new Path(outFilename), new Path(logDir, outFilename));
sb.listDebugInformation(new Path(logDir, DIRECTORY_CONTENTS));
// Generated output: debugging information
// echo "Copying debugging information"
// # Creating copy of launch script
// cp "launch_container.sh" "/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/launch_container.sh"
// chmod 640 "/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/launch_container.sh"
// # Determining directory contents
// echo "ls -l:" 1>"/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/directory.info"
// ls -l 1>>"/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/directory.info"
// echo "find -L . -maxdepth 5 -ls:" 1>>"/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/directory.info"
// find -L . -maxdepth 5 -ls 1>>"/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/directory.info"
// echo "broken symlinks(find -L . -maxdepth 5 -type l -ls):" 1>>"/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/directory.info"
// find -L . -maxdepth 5 -type l -ls 1>>"/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/directory.info"
}
sb.echo("Launching container");
// echo "Launching container"
// Launch the container
sb.command(command);
// Generated output: launch command
// exec /bin/bash -c "
// $JAVA_HOME/bin/java
// -server
// -Xmx1024m
// -Djava.io.tmpdir=$PWD/tmp
// -Dspark.yarn.app.container.log.dir=/opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001
// org.apache.spark.deploy.yarn.ApplicationMaster
// --class 'org.apache.spark.examples.SparkPi'
// --jar file:/opt/tools/spark-2.4.5/examples/jars/spark-examples_2.11-2.4.5.jar
// --arg '10'
// --properties-file $PWD/__spark_conf__/__spark_conf__.properties
// 1> /opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/stdout
// 2> /opt/tools/hadoop-3.2.1/logs/userlogs/application_1611681788558_0001/container_1611681788558_0001_01_000001/stderr"
// Log the final script content
LOG.warn("ContainerExecutor#writeLaunchEnv : " + sb.toString());
PrintStream pout = null;
try {
pout = new PrintStream(out, false, "UTF-8");
sb.write(pout);
} finally {
if (out != null) {
out.close();
}
}
}
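The difference between whitelisted and explicitly set variables shows up in the generated shell lines: a whitelisted variable is written as `export VAR=${VAR:-"value"}`, so a value already present in the container's environment wins, while the others are written as an unconditional `export VAR="value"`. The sketch below reproduces only that string formatting, matching the sample output shown in the comments; `LaunchEnvSketch` and its methods are hypothetical names rather than the ShellScriptBuilder API, and no shell escaping is attempted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LaunchEnvSketch {
  // Whitelisted form: only takes effect when the variable is unset or empty.
  static String whitelistedEnv(String key, String value) {
    return "export " + key + "=${" + key + ":-\"" + value + "\"}";
  }

  // Regular form: always overrides whatever the environment had.
  static String env(String key, String value) {
    return "export " + key + "=\"" + value + "\"";
  }

  // Writes whitelisted defaults first (skipping keys the user set), then the rest.
  static String script(Map<String, String> nmDefaults, Map<String, String> environment) {
    StringBuilder sb = new StringBuilder("#!/bin/bash\nset -o pipefail -e\n");
    for (Map.Entry<String, String> e : nmDefaults.entrySet()) {
      if (!environment.containsKey(e.getKey())) {
        sb.append(whitelistedEnv(e.getKey(), e.getValue())).append('\n');
      }
    }
    for (Map.Entry<String, String> e : environment.entrySet()) {
      sb.append(env(e.getKey(), e.getValue())).append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, String> nmDefaults = new LinkedHashMap<>();
    nmDefaults.put("JAVA_HOME", "/opt/jdk");
    Map<String, String> environment = new LinkedHashMap<>();
    environment.put("CONTAINER_ID", "container_01");
    System.out.print(script(nmDefaults, environment));
  }
}
```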
6.13. getRunCommand
Returns the command line used to run the given command.
/**
* Return a command line to execute the given command in the OS shell.
* On Windows, the {code}groupId{code} parameter can be used to launch
* and associate the given GID with a process group. On
* non-Windows hosts, the {code}groupId{code} parameter is ignored.
*
* @param command the command to execute
* @param groupId the job owner's GID for Windows. On other operating systems
* it is ignored.
* @param userName the job owner's username for Windows. On other operating
* systems it is ignored.
* @param pidFile the path to the container's PID file on Windows. On other
* operating systems it is ignored.
* @param config the configuration
* @param resource on Windows this parameter controls memory and CPU limits.
* If null, no limits are set. On other operating systems it is ignored.
* @return the command line to execute
*/
protected String[] getRunCommand(String command, String groupId,
String userName, Path pidFile, Configuration config, Resource resource) {
if (Shell.WINDOWS) {
return getRunCommandForWindows(command, groupId, userName, pidFile,
config, resource);
} else {
return getRunCommandForOther(command, config);
}
}
6.14. getRunCommandForOther
/**
* Return a command line to execute the given command in the OS shell.
*
* @param command the command to execute
* @param config the configuration
* @return the command line to execute
*/
protected String[] getRunCommandForOther(String command,
Configuration config) {
List<String> retCommand = new ArrayList<>();
boolean containerSchedPriorityIsSet = false;
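// Default adjustment: 0
int containerSchedPriorityAdjustment = YarnConfiguration.DEFAULT_NM_CONTAINER_EXECUTOR_SCHED_PRIORITY;
// Adjusts the OS scheduling priority of containers.
// Valid values may vary by platform.
// On Linux, a higher value means the container runs at a lower priority than the NM.
// The value is an integer.
// yarn.nodemanager.container-executor.os.sched.priority.adjustment : 0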
if (config.get(YarnConfiguration.NM_CONTAINER_EXECUTOR_SCHED_PRIORITY) != null) {
containerSchedPriorityIsSet = true;
containerSchedPriorityAdjustment = config
.getInt(YarnConfiguration.NM_CONTAINER_EXECUTOR_SCHED_PRIORITY,
YarnConfiguration.DEFAULT_NM_CONTAINER_EXECUTOR_SCHED_PRIORITY);
}
if (containerSchedPriorityIsSet) {
// Prefix the command with nice to apply the priority adjustment
retCommand.addAll(Arrays.asList("nice", "-n", Integer.toString(containerSchedPriorityAdjustment)));
}
retCommand.addAll(Arrays.asList("bash", command));
// 0 = "nice"
// 1 = "-n"
// 2 = "6"
// 3 = "bash"
// 4 = "/opt/tools/hadoop-3.2.1/local-dirs/usercache/henghe/appcache/application_1611769550924_0001/container_1611769550924_0001_01_000001/default_container_executor.sh"
String[] commands = retCommand.toArray(new String[retCommand.size()]);
return commands ;
}
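The command assembly in getRunCommandForOther boils down to an optional `nice -n <adjustment>` prefix in front of `bash <script>`. A minimal standalone sketch, assuming a nullable `Integer` stands in for "the property is not set"; `RunCommandSketch` and `build` are hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RunCommandSketch {
  // priorityAdjustment == null models the case where the property is unset.
  static String[] build(String script, Integer priorityAdjustment) {
    List<String> cmd = new ArrayList<>();
    if (priorityAdjustment != null) {
      cmd.addAll(Arrays.asList("nice", "-n", Integer.toString(priorityAdjustment)));
    }
    cmd.addAll(Arrays.asList("bash", script));
    return cmd.toArray(new String[0]);
  }

  public static void main(String[] args) {
    System.out.println(String.join(" ", build("default_container_executor.sh", 6)));
    System.out.println(String.join(" ", build("default_container_executor.sh", null)));
  }
}
```

Because the prefix depends on whether the property is set rather than on its value, an explicit adjustment of 0 still produces `nice -n 0 bash ...`.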
6.15. isContainerActive
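Checks whether the container is currently marked as active.
6.16. getNMEnvVar
Reads an environment variable from the NodeManager's process environment.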
@VisibleForTesting
protected String getNMEnvVar(String varname) {
return System.getenv(varname);
}
6.17. activateContainer
Adds the container and its pid file path to the pidFiles cache, marking the container as active.
/**
* Mark the container as active.
*
* @param containerId the container ID
* @param pidFilePath the path where the executor should write the PID
* of the launched process
*/
public void activateContainer(ContainerId containerId, Path pidFilePath) {
try {
writeLock.lock();
this.pidFiles.put(containerId, pidFilePath);
} finally {
writeLock.unlock();
}
}
6.18. deactivateContainer
Marks the container as inactive. For containers that are already inactive, this method has no effect.
/**
* Mark the container as inactive. For inactive containers this method has no effect.
*
* @param containerId the container ID
*/
public void deactivateContainer(ContainerId containerId) {
try {
writeLock.lock();
this.pidFiles.remove(containerId);
} finally {
writeLock.unlock();
}
}
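activateContainer and deactivateContainer form a small registry guarded by a read/write lock. The sketch below reproduces that shape with plain strings, assuming hypothetical names (`PidFileRegistrySketch`, `activate`, `deactivate`, `isActive`). Unlike the excerpt, it takes the lock before entering `try`, the usual idiom, so that `unlock()` is never reached if `lock()` itself fails.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class PidFileRegistrySketch {
  private final Map<String, String> pidFiles = new HashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  // Marks the container as active by recording its pid file path.
  void activate(String containerId, String pidFilePath) {
    lock.writeLock().lock();
    try {
      pidFiles.put(containerId, pidFilePath);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Marks the container as inactive; a no-op if it was not active.
  void deactivate(String containerId) {
    lock.writeLock().lock();
    try {
      pidFiles.remove(containerId);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Readers share the read lock and do not block each other.
  boolean isActive(String containerId) {
    lock.readLock().lock();
    try {
      return pidFiles.containsKey(containerId);
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    PidFileRegistrySketch registry = new PidFileRegistrySketch();
    registry.activate("container_01", "/tmp/container_01.pid");
    System.out.println(registry.isActive("container_01"));
    registry.deactivate("container_01");
    System.out.println(registry.isActive("container_01"));
  }
}
```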
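6.19. pauseContainer [the Javadoc is misleading]
Pauses the container. The Javadoc says the default implementation raises a kill event, but the code shown here only logs a warning and throws UnsupportedOperationException, so an executor that supports pausing must override this method.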
/**
* Pause the container. The default implementation is to raise a kill event.
* Specific executor implementations can override this behavior.
* @param container
* the Container
*/
public void pauseContainer(Container container) {
LOG.warn(container.getContainerId() + " doesn't support pausing.");
throw new UnsupportedOperationException();
}
6.20. resumeContainer
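Resumes a paused container. The Javadoc says the default implementation ignores the event, but the code shown here logs a warning and throws UnsupportedOperationException; an executor that supports resuming must override this method.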
/**
* Resume the container from pause state. The default implementation ignores
* this event. Specific implementations can override this behavior.
* @param container
* the Container
*/
public void resumeContainer(Container container) {
LOG.warn(container.getContainerId() + " doesn't support resume.");
throw new UnsupportedOperationException();
}
6.21. cleanupBeforeRelaunch
Performs any cleanup needed before the container is launched again.
/**
* Perform any cleanup before the next launch of the container.
* @param container container
*/
public void cleanupBeforeRelaunch(Container container)
throws IOException, InterruptedException {
if (container.getLocalizedResources() != null) {
// Resolve the symlinks for the localized resources
Map<Path, Path> symLinks = resolveSymLinks( container.getLocalizedResources(), container.getUser());
for (Map.Entry<Path, Path> symLink : symLinks.entrySet()) {
LOG.debug("{} deleting {}", container.getContainerId(), symLink.getValue());
// Delete the symlink as the container's user
deleteAsUser(new DeletionAsUserContext.Builder()
.setUser(container.getUser())
.setSubDir(symLink.getValue())
.build());
}
}
}
6.22. getProcessId
Returns the container's process ID.
/**
* Get the process-identifier for the container.
*
* @param containerID the container ID
* @return the process ID of the container if it has already launched,
* or null otherwise
*/
public String getProcessId(ContainerId containerID) {
String pid = null;
// Look up the pid file path
Path pidFile = pidFiles.get(containerID);
// If PID is null, this container hasn't launched yet.
if (pidFile != null) {
try {
// Read the pid
pid = ProcessIdFileReader.getProcessId(pidFile);
} catch (IOException e) {
LOG.error("Got exception reading pid from pid-file " + pidFile, e);
}
}
return pid;
}
6.23. resolveSymLinks
Maps each localized resource to the symlink that should point to it, expanding wildcard link names.
private Map<Path, Path> resolveSymLinks(Map<Path, List<String>> resources, String user) {
Map<Path, Path> symLinks = new HashMap<>();
for (Map.Entry<Path, List<String>> resourceEntry : resources.entrySet()) {
for (String linkName : resourceEntry.getValue()) {
if (new Path(linkName).getName().equals(WILDCARD)) {
// If this is a wildcarded path, link to everything in the directory from the working directory
for (File wildLink : readDirAsUser(user, resourceEntry.getKey())) {
symLinks.put(new Path(wildLink.toString()), new Path(wildLink.getName()));
}
} else {
symLinks.put(resourceEntry.getKey(), new Path(linkName));
}
}
}
return symLinks;
}
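The wildcard handling is easiest to follow with concrete data. In the sketch below, a link name whose last path component is `*` expands to one link per file in the target directory, and each link is named after the file itself. It is an illustration under assumptions: paths are plain strings, `SymlinkResolveSketch` is a hypothetical name, and the `dirListing` map stands in for `readDirAsUser`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SymlinkResolveSketch {
  static final String WILDCARD = "*";

  // resources: target path -> link names; dirListing: directory -> file names in it.
  static Map<String, String> resolve(Map<String, List<String>> resources,
      Map<String, List<String>> dirListing) {
    Map<String, String> symLinks = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> e : resources.entrySet()) {
      String target = e.getKey();
      for (String linkName : e.getValue()) {
        String lastComponent = linkName.substring(linkName.lastIndexOf('/') + 1);
        if (lastComponent.equals(WILDCARD)) {
          // Wildcard: one link per file in the directory, named after the file.
          for (String name : dirListing.getOrDefault(target, Collections.emptyList())) {
            symLinks.put(target + "/" + name, name);
          }
        } else {
          symLinks.put(target, linkName);
        }
      }
    }
    return symLinks;
  }

  // Demo data: one ordinary resource and one wildcarded directory.
  static Map<String, String> demo() {
    Map<String, List<String>> resources = new LinkedHashMap<>();
    resources.put("/cache/35/app.jar", Arrays.asList("__app__.jar"));
    resources.put("/cache/36/libs", Arrays.asList("__libs__/*"));
    Map<String, List<String>> listing = new LinkedHashMap<>();
    listing.put("/cache/36/libs", Arrays.asList("a.jar", "b.jar"));
    return resolve(resources, listing);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```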