pyspark fails to initialize on a CDH cluster with "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':"
Starting pyspark on a CDH cluster fails with "Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':". The full error output is as follows:
[root@szsjhl-cdh-test-10-9-251-30.belle.lan:/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/bin]
# pyspark
Python 2.7.15 |Anaconda, Inc.| (default, May 1 2018, 23:32:55)
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/pyspark/shell.py", line 45, in <module>
    spark = SparkSession.builder\
  File "/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/pyspark/sql/session.py", line 183, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/python/pyspark/sql/utils.py", line 79, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':"
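Cause: when pyspark initializes, it needs permission to access Hive data. The command was run as the root user, which does not have that permission, so creating the session failed.
Solution: run the command as the hive user instead.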
[root@szsjhl-cdh-test-10-9-251-30.belle.lan:/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.537114/lib/spark/bin]
# sudo -u hive pyspark
Python 2.7.15 |Anaconda, Inc.| (default, May 1 2018, 23:32:55)
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/05/23 20:26:01 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
19/05/23 20:26:03 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.2.0-cdh6.0.0
      /_/
Using Python version 2.7.15 (default, May 1 2018 23:32:55)
SparkSession available as 'spark'.
>>>
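The fix above comes down to choosing the right launch command for the current user. As a minimal sketch of that decision (the helper name pyspark_command and its run_as parameter are hypothetical and not part of Spark or CDH), a small wrapper could build the command like this:

```python
def pyspark_command(current_user, run_as="hive"):
    """Build the argv list for launching pyspark as `run_as`.

    Hypothetical helper for illustration only. It assumes that `sudo`
    is available and that `run_as` is an account with access to Hive
    data, as the hive user was in the session above.
    """
    if current_user == run_as:
        # Already the right user: start pyspark directly.
        return ["pyspark"]
    # Otherwise switch users first, as in `sudo -u hive pyspark`.
    return ["sudo", "-u", run_as, "pyspark"]


print(pyspark_command("root"))  # ['sudo', '-u', 'hive', 'pyspark']
```

In a real wrapper script the current user could come from getpass.getuser(), and the resulting list could be passed to subprocess.call to start the shell. Whether hive is the appropriate account depends on how permissions are set up on the cluster.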
Shylin