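Hadoop memory configuration involves two aspects: the HEAPSIZE environment variables of the namenode/datanode/resourcemanager/nodemanager daemons,
and the variables in the configuration files (Configuration) that affect how MR jobs run.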
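The HEAPSIZE environment variables
The hadoop-env.sh configuration file is loaded by both the hdfs and yarn scripts. hdfs uses HADOOP_HEAPSIZE, while yarn uses a newer environment variable, YARN_HEAPSIZE.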
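As a rough sketch of how these two variables might be set, the fragment below exports both in a hadoop-env.sh style. The values 2048 and 1024 are illustrative assumptions, interpreted as megabytes because the launcher scripts append an `m` suffix. The `heap_to_xmx` helper is hypothetical, not part of Hadoop. It only mimics the fallback to -Xmx1000m that the scripts quoted in this post apply when the variable is empty.

```shell
# Hypothetical hadoop-env.sh fragment; the numbers are illustrative, in MB.
export HADOOP_HEAPSIZE=2048   # read by the hadoop and hdfs scripts
export YARN_HEAPSIZE=1024     # read by the yarn script

# Hypothetical helper (not part of Hadoop) that mimics the scripts' logic:
# use -Xmx1000m when the value is empty, otherwise build -Xmx<value>m.
heap_to_xmx() {
  if [ "$1" != "" ]; then
    echo "-Xmx${1}m"
  else
    echo "-Xmx1000m"
  fi
}

heap_to_xmx "$HADOOP_HEAPSIZE"   # prints -Xmx2048m
heap_to_xmx "$YARN_HEAPSIZE"     # prints -Xmx1024m
heap_to_xmx ""                   # prints -Xmx1000m
```

Leaving either variable unset keeps the 1000 MB default for the corresponding commands.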
The hadoop/hdfs/yarn commands ultimately convert the HEAPSIZE value into JAVA_HEAP_MAX and pass it to Java as a startup argument.
hadoop
The hadoop command converts HADOOP_HEAPSIZE into JAVA_HEAP_MAX. Call path: hadoop -> hadoop-config.sh -> hadoop-env.sh
JAVA_HEAP_MAX=-Xmx1000m
# check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then
  #echo "run with heapsize $HADOOP_HEAPSIZE"
  JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"
  #echo $JAVA_HEAP_MAX
fi
hdfs
The hdfs script was simply split out of the hadoop script. Call path: hdfs -> hdfs-config.sh -> hadoop-config.sh -> hadoop-env.sh
yarn
The yarn script also loads hadoop-env.sh, but the variable that sets the heap size becomes YARN_HEAPSIZE. Call path: yarn -> yarn-config.sh -> hadoop-config.sh -> hadoop-env.sh
JAVA_HEAP_MAX=-Xmx1000m
# For setting YARN specific HEAP sizes please use this
# Parameter and set appropriately
# YARN_HEAPSIZE=1000
# check envvars which might override default args
if [ "$YARN_HEAPSIZE" != "" ]; then