Background
A Hadoop cluster of three machines (hadoop102, hadoop103, hadoop104) running on CentOS 7.5
Problem description
During data warehouse construction, the script that loads data from the DWD layer into the DWS layer fails with an error.
Error message
FAILED: SemanticException Failed to get a spark session:
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client due to
invalid resource request: Required executor memory (2048), overhead (384 MB), and
PySpark memory (0 MB) is above the max threshold (2048 MB)
Cause
The memory limits configured in yarn-site.xml via yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb are too low: the requested executor memory (2048 MB) plus the overhead (384 MB) exceeds the 2048 MB maximum container allocation, so YARN rejects the Spark session's resource request.
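The rejection is simple arithmetic: Spark on YARN requests the executor memory plus an overhead of max(384 MB, 10% of executor memory), and YARN refuses any container larger than yarn.scheduler.maximum-allocation-mb. A minimal sketch of that check (the 384 MB / 10% rule is Spark's default for spark.executor.memoryOverhead):

```python
def container_request_mb(executor_memory_mb: int) -> int:
    # Spark's default memory overhead: max(384 MB, 10% of executor memory)
    overhead = max(384, int(executor_memory_mb * 0.10))
    return executor_memory_mb + overhead

# With the original 2048 MB limit from yarn-site.xml:
request = container_request_mb(2048)  # 2048 + 384 = 2432 MB
print(request > 2048)                 # True  -> YARN rejects the request
print(request > 4096)                 # False -> fits after raising the limit
```

This is why raising only the executor memory would not help: the total request must stay at or below the maximum-allocation threshold.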
Solution
Raise these memory parameters from 2048 to 4096 in yarn-site.xml (the change takes effect after restarting YARN):
<!-- Minimum and maximum memory YARN may allocate per container -->
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4096</value>
</property>
<!-- Total physical memory a NodeManager may manage -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>
I'm dyson, and not just the hair dryer. If you're interested in big data and data warehousing, feel free to add me so we can chat and improve together. WeChat: daijun1211
PS: If this article infringes any copyright or privacy, please contact the author for removal. Thanks~~