Kylin build cube fails with java.lang.OutOfMemoryError: Java heap space
Apache Kylin version: 2.6.1
1. Problem
While building a cube, the job failed with a Communication exception: java.lang.OutOfMemoryError: Java heap space.
Log:
Logs for container_1559093308035_0009_01_000042 (showing 4096 bytes):
... org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5200001
2019-05-29 11:26:25,550 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 127m
2019-05-29 11:26:26,448 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5300001
2019-05-29 11:26:26,448 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 127m
2019-05-29 11:26:27,495 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5400001
2019-05-29 11:26:27,496 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 116m
2019-05-29 11:26:29,383 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5500001
2019-05-29 11:26:29,383 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 87m
2019-05-29 11:26:30,489 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5600001
2019-05-29 11:26:30,489 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 59m
2019-05-29 11:26:32,278 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5700001
2019-05-29 11:26:32,278 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 63m
2019-05-29 11:26:35,014 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5800001
2019-05-29 11:26:35,014 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 68m
2019-05-29 11:26:38,503 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 5900001
2019-05-29 11:26:38,503 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 53m
2019-05-29 11:27:03,256 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Accepting Reducer Key with ordinal: 6000001
2019-05-29 11:27:03,257 INFO [main] org.apache.kylin.engine.mr.KylinReducer: Do reduce, available memory: 53m
2019-05-29 11:32:20,401 INFO [communication thread] org.apache.hadoop.mapred.Task: Communication exception: java.lang.OutOfMemoryError: Java heap space
    at java.io.BufferedReader.<init>(BufferedReader.java:105)
    at java.io.BufferedReader.<init>(BufferedReader.java:116)
    at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:541)
    at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.updateProcessTree(ProcfsBasedProcessTree.java:223)
    at org.apache.hadoop.mapred.Task.updateResourceCounters(Task.java:894)
    at org.apache.hadoop.mapred.Task.updateCounters(Task.java:1045)
    at org.apache.hadoop.mapred.Task.access$500(Task.java:82)
    at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:782)
    at java.lang.Thread.run(Thread.java:748)
2019-05-29 11:34:19,426 INFO [communication thread] org.apache.hadoop.mapred.Task: Communication exception: java.lang.OutOfMemoryError: Java heap space
    at java.util.regex.Matcher.<init>(Matcher.java:225)
    at java.util.regex.Pattern.matcher(Pattern.java:1093)
    at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessList(ProcfsBasedProcessTree.java:508)
    at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.updateProcessTree(ProcfsBasedProcessTree.java:210)
    at org.apache.hadoop.mapred.Task.updateResourceCounters(Task.java:894)
    at org.apache.hadoop.mapred.Task.updateCounters(Task.java:1045)
    at org.apache.hadoop.mapred.Task.access$500(Task.java:82)
    at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:782)
    at java.lang.Thread.run(Thread.java:748)
2019-05-29 11:35:37,161 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
2019-05-29 11:35:37,165 FATAL [communication thread] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[communication thread,5,main] threw an Error. Shutting down now...
java.lang.OutOfMemoryError: Java heap space
2019-05-29 11:35:37,169 INFO [communication thread] org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
2. Solution
Step 1
Approach: increase the memory available to Kylin.
Per the official configuration docs (http://kylin.apache.org/cn/docs/install/configuration.html), $KYLIN_HOME/conf/setenv.sh ships with two sample settings for KYLIN_JVM_SETTINGS. The default allocates relatively little memory; to give the Kylin instance more, comment out the default line and uncomment the alternative. The defaults are:
export KYLIN_JVM_SETTINGS="-Xms1024M -Xmx4096M -Xss1024K -XX:MaxPermSize=512M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
# export KYLIN_JVM_SETTINGS="-Xms16g -Xmx16g -XX:MaxPermSize=512m -XX:NewSize=3g -XX:MaxNewSize=3g -XX:SurvivorRatio=4 -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
Change it to the following:
#export KYLIN_JVM_SETTINGS="-Xms1024M -Xmx4096M -Xss1024K -XX:MaxPermSize=512M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
export KYLIN_JVM_SETTINGS="-Xms16g -Xmx16g -XX:MaxPermSize=512m -XX:NewSize=3g -XX:MaxNewSize=3g -XX:SurvivorRatio=4 -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
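After editing setenv.sh, the Kylin instance must be restarted for the new JVM settings to take effect; a minimal sketch using Kylin's standard control script:
$KYLIN_HOME/bin/kylin.sh stop
$KYLIN_HOME/bin/kylin.sh start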
Step 2
Approach: the keys are unevenly distributed, so a single reducer receives far more data than expected and its JVM ends up in near-constant GC. Since the OutOfMemoryError above is thrown inside the MapReduce task itself (YarnChild), increasing the MapReduce task memory also helps.
For open-source Hadoop, override the defaults in mapred-site.xml (the values are in MB):
mapreduce.reduce.memory.mb=4096
mapreduce.map.memory.mb=4096
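A sketch of the corresponding mapred-site.xml entries; note the mapreduce.*.java.opts heap sizes are my assumption (a common rule of thumb is roughly 80% of the container size) and were not part of the original fix:
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3276m</value> <!-- assumed: ~80% of the 4096 MB container -->
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx3276m</value> <!-- assumed: ~80% of the 4096 MB container -->
</property>
If changing cluster-wide configuration is not an option, Kylin also documents a kylin.engine.mr.config-override. prefix (e.g. kylin.engine.mr.config-override.mapreduce.reduce.memory.mb=4096 in kylin.properties) for passing such settings only to Kylin's own MR jobs.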
Since I ran into this problem on CDH, the change goes through Cloudera Manager instead: open the YARN service, click Configuration, search for "memory", and adjust the settings as sketched below.
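A rough sketch of the Cloudera Manager workflow (the exact labels are from my memory of CDH 5.x and may differ in other versions):
YARN (MR2 Included) -> Configuration -> search "memory"
  Map Task Memory (mapreduce.map.memory.mb): 4 GiB
  Reduce Task Memory (mapreduce.reduce.memory.mb): 4 GiB
Then save the changes, redeploy the client configuration, and restart the stale services so the new container sizes take effect.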