Error message
SQL executed:
create table xxx.gqj_test_1 as
select applseq,
max(case when rule_id='2005302' then result end) as 6_month,
max(case when rule_id='2005304' then result end) as 12_month,
max(case when rule_id='2005306' then result end) as 18_month,
max(case when rule_id='2005308' then result end) as 24_month,
max(case when rule_id='2005310' then result end) as 60_month
from xxx.xxx
group by applseq;
Execution log:
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 9
2020-07-28 10:58:14,413 Stage-1 map = 0%, reduce = 0%
2020-07-28 10:59:15,267 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 874.88 sec
2020-07-28 11:00:16,107 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 115.55 sec
2020-07-28 11:01:16,513 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1388.63 sec
2020-07-28 11:02:16,915 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1002.31 sec
2020-07-28 11:03:17,553 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1212.87 sec
2020-07-28 11:04:17,975 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 369.32 sec
2020-07-28 11:05:18,281 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 1677.58 sec
2020-07-28 11:05:23,402 Stage-1 map = 100%, reduce = 100%
MapReduce Total cumulative CPU time: 27 minutes 57 seconds 580 msec
Ended Job = job_1589431406497_162391 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1589431406497_162391_m_000001 (and more) from job job_1589431406497_162391
Task with the most failures(4):
-----
Task ID:
task_1589431406497_162391_m_000001
URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1589431406497_162391&tipid=task_1589431406497_162391_m_000001
-----
Diagnostic Messages for this Task:
Error: GC overhead limit exceeded
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3 Reduce: 9 Cumulative CPU: 1677.58 sec HDFS Read: 0 HDFS Write: 0 HDFS EC Read: 0 FAIL
Total MapReduce CPU Time Spent: 27 minutes 57 seconds 580 msec
Errors from the task log:
ve.ql.exec.MapOperator: MAP[0]: records read - 1000001
2020-07-28 10:07:12,441 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: MAP[0]: records read - 2000001
2020-07-28 10:07:13,910 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: MAP[0]: records read - 3000001
2020-07-28 10:07:21,638 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: MAP[0]: records read - 4000001
2020-07-28 10:07:24,024 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: MAP[0]: records read - 5000001
2020-07-28 10:07:28,496 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: MAP[0]: records read - 6000001
2020-07-28 10:07:39,237 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: MAP[0]: records read - 7000001
2020-07-28 10:08:45,822 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: MAP[0]: Total records read - 7247916. abort - true
2020-07-28 10:08:45,822 INFO [main] org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, RECORDS_IN:7247916,
2020-07-28 10:08:45,822 INFO [main] org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: RS[3]: Total records written - 0. abort - true
2020-07-28 10:08:45,822 INFO [main] org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: RECORDS_OUT_INTERMEDIATE:0,
2020-07-28 10:08:46,311 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
at org.apache.hadoop.io.Text.decode(Text.java:389)
at org.apache.hadoop.io.Text.toString(Text.java:280)
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:46)
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:26)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:412)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$GenericUDAFMaxEvaluator.merge(GenericUDAFMax.java:125)
The SQL execution plan (EXPLAIN output):
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: yk_zjbt_7
            Statistics: Num rows: 28163209 Data size: 529005690 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: applseq (type: string), rule_id (type: string), result (type: string)
              outputColumnNames: applseq, rule_id, result
              Statistics: Num rows: 28163209 Data size: 529005690 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                aggregations: max(CASE WHEN ((rule_id = '2005302')) THEN (result) END), max(CASE WHEN ((rule_id = '2005304')) THEN (result) END), max(CASE WHEN ((rule_id = '2005306')) THEN (result) END), max(CASE WHEN ((rule_id = '2005308')) THEN (result) END), max(CASE WHEN ((rule_id = '2005310')) THEN (result) END)
                keys: applseq (type: string)
                mode: hash
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                Statistics: Num rows: 28163209 Data size: 529005690 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: string)
                  Statistics: Num rows: 28163209 Data size: 529005690 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string)
Size of the underlying data file:
[dp@hdop-applb12 ~]$ hdfs dfs -ls /user/hive/warehouse/xxxx.db/yk_zjbt_7/
Found 1 items
-rwxrwxr-x 3 hive hive 557168899 2020-07-27 18:39 /user/hive/warehouse/xxxx.db/yk_zjbt_7/000000_0
[dp@hdop-applb12 ~]$ hdfs dfs -du -s -h /user/hive/warehouse/xxxx.db/yk_zjbt_7/
531.4 M 1.6 G /user/hive/warehouse/xxxx.db/yk_zjbt_7
What stands out is that aggregation is executed in the map phase: the plan shows a map-side Group By Operator with mode: hash. Map-side hash aggregation keeps one in-memory aggregation buffer per distinct group key, so when applseq has very high cardinality the hash table keeps growing; the stack trace above shows the OOM hits exactly while GenericUDAFMaxEvaluator is copying string values into those buffers.
As the log shows, the failure happens in the map phase, and the immediate cause is that the map tasks have too little heap for this workload.
The data file is about 531 MB and there are 3 map tasks, so each one reads a bit under 180 MB. A plain map task (read, transform, emit) should never run out of memory on so little input; it is the accumulated map-side aggregation state that exhausts the heap.
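A quick way to confirm this is to measure the key cardinality. A minimal check, using the same (anonymized) source table as the failing query:

-- count the distinct group keys the map-side hash table must hold
select count(distinct applseq) as distinct_keys,
       count(*) as total_rows
from xxx.xxx;

If distinct_keys is of the same order as total_rows (the plan estimates about 28 million rows), each mapper builds a hash table with millions of entries before it can flush, which matches the mapper aborting after reading about 7.2 million records.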
Solutions
There are three ways to fix this:
- Increase the memory available to each map task (see the sketch after this list for the full container + heap version):
set mapreduce.map.java.opts=-Xmx1024m;
Time taken: 293.026 seconds
- Increase the number of map tasks, so each one handles fewer rows and fewer distinct keys:
set mapred.max.split.size=5000000;
Time taken: 619.508 seconds
(Note: with set mapred.max.split.size=50000000; the job only grew to 9 maps and still failed.)
- Disable map-side aggregation:
set hive.map.aggr=false;
This stops the mappers from doing partial aggregation, which lowers map-phase memory use but also lowers efficiency, since every input row is then shuffled to the reducers. If either of the methods above solves the problem, this one is not recommended.
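Two additions to the fixes above. First, on YARN the heap given in mapreduce.map.java.opts has to fit inside the container size from mapreduce.map.memory.mb, or the task can be killed for exceeding its container limit; the sketch below raises both (the 2048 MB / -Xmx1638m values are illustrative, not from the original run). Second, as a middle ground between adding memory and hive.map.aggr=false, Hive has parameters that make the map-side hash table flush earlier or give up when grouping barely reduces the row count; the values shown are likewise illustrative, not tested settings.

-- Fix 1 in full: raise container size and JVM heap together
-- (heap is conventionally about 80% of the container size)
set mapreduce.map.memory.mb=2048;
set mapreduce.map.java.opts=-Xmx1638m;

-- Middle ground: keep map-side aggregation, but flush the hash table sooner
set hive.map.aggr=true;
set hive.map.aggr.hash.percentmemory=0.3;   -- flush when the hash table uses ~30% of heap (default 0.5)
set hive.map.aggr.hash.min.reduction=0.5;   -- turn hash aggregation off if it barely reduces rows (default 0.5)

With high-cardinality keys like applseq here, hive.map.aggr.hash.min.reduction can end up disabling hash aggregation automatically, which makes it a safer default than forcing hive.map.aggr=false for every query.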