1. Hadoop cannot load the native library
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Solution:
Enable debug logging:
export HADOOP_ROOT_LOGGER=DEBUG,console
Re-run the command; the debug log now shows:
DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: /usr/hadoop/lib/native/libhadoop.so.1.0.0: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/hadoop/lib/native/libhadoop.so.1.0.0)
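Before touching the Hadoop configuration it is worth confirming which glibc the system actually provides; the checks below are generic commands, not part of the original steps:
# print the installed glibc version
ldd --version | head -1
# check whether the GLIBC_2.14 symbol version is present in libc
strings /lib64/libc.so.6 | grep GLIBC_2.14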
Add the java.library.path setting in hadoop-env.sh:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.library.path=${HADOOP_HOME}/lib/native"
Running hadoop checknative now succeeds.
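For reference, a successful hadoop checknative run prints something along these lines; the exact paths and which codecs report true depend on how the native library was built:
Native library checking:
hadoop:  true /usr/hadoop/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  false
lz4:     true
bzip2:   false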
2. Spark cannot find the Hadoop native library
Failed to load native-hadoop with error: no hadoop in java.library.path
Solution:
The compiled native library is already present under Hadoop's lib/native directory; Spark, however, depends on the JDK rather than on Hadoop, which is likely why it cannot find Hadoop's native library files.
All we need to do is add the native library path to Spark's configuration.
Add the native path in spark-env.sh:
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${HADOOP_HOME}/lib/native
Re-running no longer reports the error.
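If you would rather keep this in spark-defaults.conf than spark-env.sh, the equivalent properties are the extraLibraryPath settings; the /usr/hadoop path below is the one from the log in section 1, substitute your own HADOOP_HOME:
spark.driver.extraLibraryPath    /usr/hadoop/lib/native
spark.executor.extraLibraryPath  /usr/hadoop/lib/native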
3. Hadoop doCheckpoint error
2018-12-11 11:10:45,497 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
Solution:
If hadoop.tmp.dir is not configured, add it.
If it is already configured, clear out the contents of the hadoop.tmp.dir directory and restart.
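hadoop.tmp.dir lives in core-site.xml; a minimal sketch (the /data/hadoop/tmp path is only an example, pick a directory appropriate for your nodes):
<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/hadoop/tmp</value>
</property>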
4. Initial heap size set to a larger value than the maximum heap size
The error message is as follows (from the container log):
[root@imonitor container_1544509748201_0001_01_000002]# more stdout
Error occurred during initialization of VM
Initial heap size set to a larger value than the maximum heap size
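If the stdout file is no longer on the node, the same container log can usually be pulled with the YARN CLI; the application id below is the one implied by the container name above:
yarn logs -applicationId application_1544509748201_0001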
Solution:
The error does not occur in Spark standalone mode but does appear in yarn-client mode: under YARN the executor's maximum heap (-Xmx) is taken from spark.executor.memory (default 1g), so an -Xms29G initial heap easily exceeds it.
Change -Xms29G to -Xms1G in the spark.executor.extraJavaOptions parameter in spark-defaults.conf, as follows:
spark.executor.extraJavaOptions -Xms1G -XX:InitialBootClassLoaderMetaspaceSize=128m -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=10 -XX:ParallelGCThreads=10 -XX:ConcGCThreads=10
Pay attention to these memory-related parameters (example values are sketched after the list):
In spark-defaults.conf:
spark.driver.extraJavaOptions
spark.executor.extraJavaOptions
In Hadoop's yarn-env.sh:
YARN_RESOURCEMANAGER_HEAPSIZE
YARN_NODEMANAGER_HEAPSIZE
In Hadoop's hadoop-env.sh:
HADOOP_HEAPSIZE
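As referenced above, a sketch of where those heap settings live; the 2048 (MB) values are placeholders, size them to your cluster:
# yarn-env.sh
export YARN_RESOURCEMANAGER_HEAPSIZE=2048
export YARN_NODEMANAGER_HEAPSIZE=2048
# hadoop-env.sh
export HADOOP_HEAPSIZE=2048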
5. Changing the log level
Three places need to be changed.
In log4j.properties:
# Define some default values that can be overridden by system properties
hadoop.root.logger=WARN,console
In hadoop-env.sh (${HADOOP_HOME}/etc/hadoop/hadoop-env.sh), change INFO to WARN:
# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Xmx30720m -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-WARN,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-WARN,NullAppender} $HADOOP_NAMENODE_OPTS"
In hadoop-daemon.sh (${HADOOP_HOME}/sbin/hadoop-daemon.sh), the same change is needed:
export HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER:-"WARN,RFA"}
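If you only want to change a level temporarily without restarting the daemons, the hadoop daemonlog command can adjust a running daemon's logger via its HTTP port; the host and the 50070 port below are examples (50070 is the default NameNode HTTP port in Hadoop 2.x):
hadoop daemonlog -setlevel namenode-host:50070 org.apache.hadoop.hdfs.server.namenode.NameNode WARN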