1)NameNode内存计算
每个文件块大概占用150byte,一台服务器128G内存为例,能存储多少文件块呢?
128 * 1024 * 1024 * 1024 / 150Byte ≈ 9.1亿
G MB KB Byte
2)Hadoop2.x系列,配置NameNode内存
NameNode内存默认2000m,如果服务器内存4G,NameNode内存可以配置3g。在hadoop-env.sh文件中配置如下。
HADOOP_NAMENODE_OPTS=-Xmx3072m
3)Hadoop3.x系列,配置NameNode内存
(1)hadoop-env.sh中描述Hadoop的内存是动态分配的
# The maximum amount of heap to use (Java -Xmx). If no unit
# is provided, it will be converted to MB. Daemons will
# prefer any Xmx setting in their respective _OPT variable.
# There is no default; the JVM will autoscale based upon machine
# memory size.
# export HADOOP_HEAPSIZE_MAX=
# The minimum amount of heap to use (Java -Xms). If no unit
# is provided, it will be converted to MB. Daemons will
# prefer any Xms setting in their respective _OPT variable.
# There is no default; the JVM will autoscale based upon machine
# memory size.
# export HADOOP_HEAPSIZE_MIN=
(2)查看NameNode占用内存
(3)查看DataNode占用内存
查看发现在自己的虚拟机Hadoop102上的NameNode和DataNode占用内存都是自动分配的,且相等。不是很合理。
实际上生产的参数配置应该按如下所示:
具体修改:hadoop-env.sh
export HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx1024m"
export HDFS_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -Xmx4096m"
上面两个值应该是生产上面配置的最小值,所以要求在生产上面的机器性能要足够支撑namenode和datanode的内存空间。