Hive pitfalls, note 1
Backfilling data on the big data platform failed with the following error:
Error: GC overhead limit exceeded
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRed
during job, obtaining debugging information
Cause: the tasks ran out of memory. Adding the following parameters fixed it:
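-- container memory for reduce and map tasks, in MB
set mapreduce.reduce.memory.mb=5120;
set mapreduce.map.memory.mb=5120;
-- map task JVM heap, kept below the container size, with the CMS collector
set mapreduce.map.java.opts=-Xmx4096m -XX:+UseConcMarkSweepGC;
-- handle skewed keys in GROUP BY and JOIN
set hive.groupby.skewindata=true;
set hive.optimize.skewjoin=true;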
(The job also ran faster afterwards!)
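The parameters above raise the reduce container to 5120 MB but set a JVM heap only for map tasks, so the reduce heap is left to whatever the cluster defaults provide. If the same GC error shows up in the reduce phase, one possible follow-up is to give the reduce JVM a matching heap. This is a sketch, not part of the original fix: the 4096 MB value is an assumption that simply mirrors the map-side setting (about 80% of the container), and it has not been tested on this job.

```sql
-- Assumption: mirror the map-side heap for reduce tasks.
-- The heap (-Xmx) should stay below mapreduce.reduce.memory.mb (5120 here)
-- so the container has room for non-heap JVM memory.
set mapreduce.reduce.java.opts=-Xmx4096m -XX:+UseConcMarkSweepGC;
```

Running `set mapreduce.reduce.java.opts;` with no value in the same session prints the value currently in effect, which is a quick way to check whether the override was picked up.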