When downloading Spark from the official site, you can choose between builds bundled with Hadoop and builds without it. If you pick a build without Hadoop, you must export the SPARK_DIST_CLASSPATH environment variable in ${SPARK_HOME}/conf/spark-env.sh so that Spark can find Hadoop's jars; otherwise startup fails with "A JNI error has occurred, please check your installation and try again". Builds bundled with Hadoop do not need this step.
SPARK_DIST_CLASSPATH should be configured like this:
### in conf/spark-env.sh ###
# If 'hadoop' binary is on your PATH
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# With explicit path to 'hadoop' binary
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)
# Passing a Hadoop configuration directory
export SPARK_DIST_CLASSPATH=$(hadoop --config /path/to/configs classpath)
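Beyond the three forms above, the same variable can also carry extra jars that your jobs need on every executor. A minimal sketch, assuming the `hadoop` binary is on your PATH and using a hypothetical /opt/extra-jars directory:

```shell
### in conf/spark-env.sh ###
# Hadoop's classpath first, then any additional jars,
# joined with ':' like any Java classpath.
# /opt/extra-jars is a placeholder; substitute your own directory.
export SPARK_DIST_CLASSPATH="$(hadoop classpath):/opt/extra-jars/*"
```

After editing spark-env.sh, restart the Spark daemons (or re-run spark-submit) for the new classpath to take effect.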
See the official documentation on Spark's "Hadoop free" build for details.