In a Hive on Spark environment, creating a table in Hive succeeds, but running an insert statement fails:
hive (default)> insert into table student values(1,'aaa');
Query ID = root_20230908072041_43a3c5c4-a88b-4ed2-9f5d-f1323d673325
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session c7d2e6bd-3e91-463f-b9e7-a52581a70989)'
FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session c7d2e6bd-3e91-463f-b9e7-a52581a70989
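Return code 30041 is a generic wrapper; the underlying cause is usually recorded in hive.log. A small sketch for pulling the most recent errors (the log path is an assumption based on this install; adjust as needed):

```shell
# Pull the last few error lines from hive.log; the path is assumed from this setup.
LOG=/opt/software/hive/hive-3.1.2/logs/hive.log
if [ -f "$LOG" ]; then
  msg=$(grep -iE 'error|exception' "$LOG" | tail -5)
else
  msg="hive.log not found at $LOG"
fi
echo "$msg"
```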
Several possible causes, and their fixes, are summarized below:
1. Edit the configuration under the Spark conf directory, /opt/software/spark/spark-3.0.0/conf:
mv spark-env.sh.template spark-env.sh
vi spark-env.sh
Then add the following line to the file:
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
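As a quick sanity check you can reproduce what spark-env.sh does and confirm the variable comes out non-empty (a sketch; the fallback path below is illustrative, so the check runs even where `hadoop` is not on the PATH):

```shell
# Mirror the spark-env.sh line; fall back to a placeholder so the check is
# runnable anywhere (the fallback path is illustrative, not required).
export SPARK_DIST_CLASSPATH=$(hadoop classpath 2>/dev/null || echo /opt/software/hadoop/hadoop-3.1.3/etc/hadoop)
[ -n "$SPARK_DIST_CLASSPATH" ] && echo "SPARK_DIST_CLASSPATH is set"
```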
2. Spark is not running. Start it from the /opt/software/spark/spark-3.0.0/sbin directory:
./start-all.sh
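To confirm the daemons actually came up (a sketch; in a standalone deployment, jps should list a Master and at least one Worker after start-all.sh):

```shell
# jps lists local JVM processes; after start-all.sh a standalone cluster
# should show a Master and at least one Worker.
if jps 2>/dev/null | grep -qE 'Master|Worker'; then
  msg="Spark standalone daemons are running"
else
  msg="no Master/Worker found; run ./start-all.sh from the sbin directory"
fi
echo "$msg"
```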
3. Add the following to hive/conf/hive-site.xml to raise the Hive-to-Spark connection timeouts:
<!-- Timeouts for the Hive/Spark connection -->
<property>
  <name>hive.spark.client.connect.timeout</name>
  <value>1000000ms</value>
</property>
<property>
  <name>hive.spark.client.server.connect.timeout</name>
  <value>1000000ms</value>
</property>
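After editing, it is worth confirming that both properties really made it into the file. A minimal sketch (the /tmp file below is a stand-in the script writes itself; point HIVE_SITE at your real hive-site.xml instead):

```shell
# Write a stand-in hive-site.xml so the check is runnable anywhere; in practice
# set HIVE_SITE to your real conf file, e.g. under hive-3.1.2/conf.
HIVE_SITE=/tmp/demo-hive-site.xml
cat > "$HIVE_SITE" <<'EOF'
<configuration>
  <property>
    <name>hive.spark.client.connect.timeout</name>
    <value>1000000ms</value>
  </property>
  <property>
    <name>hive.spark.client.server.connect.timeout</name>
    <value>1000000ms</value>
  </property>
</configuration>
EOF
for p in hive.spark.client.connect.timeout hive.spark.client.server.connect.timeout; do
  grep -q "<name>$p</name>" "$HIVE_SITE" && echo "$p: present"
done
```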
4. The Hive log shows that a resource-types.xml configuration file is missing:
more /opt/software/hive/hive-3.1.2/logs/hive.log
Fix: create a resource-types.xml file under /opt/software/hadoop/hadoop-3.1.3/etc/hadoop:
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resource-type</name>
    <value>resource1, resource2</value>
  </property>
  <property>
    <name>yarn.resource-type.resource1.units</name>
    <value>G</value>
  </property>
  <property>
    <name>yarn.resource-type.resource2.minimum</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.resource-type.resource2.maximum</name>
    <value>1024</value>
  </property>
</configuration>
Note that on the official site the property name has an 's' after 'type' (yarn.resource-types); deleting the 's', as shown above, is what worked here.
These are the causes collected so far; there may well be others, such as version compatibility issues between Hive and Spark.