Pick three nodes (three machines) from an existing Hadoop cluster.
From the official site, download the Spark build that matches your installed Hadoop version; a version mismatch will cause errors.
http://spark.apache.org/downloads.html
Download the package, upload it to one machine, extract it, edit the configuration files, then copy the result to the other machines.
The files to configure live under conf/: spark-env.sh.template and slaves.template. Make copies first, then edit spark-env.sh and slaves.
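A minimal sketch of the upload-and-extract steps, assuming the downloaded archive is named spark-2.x.x-bin-hadoop2.x.tgz (hypothetical; substitute your actual file name) and the install root is /soft, matching the path that appears in the error message later:

```shell
# Upload the downloaded package to the chosen master node (archive name is hypothetical)
scp spark-2.x.x-bin-hadoop2.x.tgz master:/soft/

# On the master node: extract the archive under /soft
tar -xzvf /soft/spark-2.x.x-bin-hadoop2.x.tgz -C /soft/

# Create a version-independent path /soft/spark pointing at the extracted directory
ln -s /soft/spark-2.x.x-bin-hadoop2.x /soft/spark
```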
In the conf/ directory:
cp slaves.template slaves
cp spark-env.sh.template spark-env.sh
Edit spark-env.sh and add:
export SPARK_MASTER_HOST=master
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=3g
export SPARK_MASTER_WEBUI_PORT=8888
# Options for the daemons used in the standalone deploy mode
# - SPARK_MASTER_HOST, to bind the master to a different IP address or hostname
# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
# - SPARK_MASTER_OPTS, to set config properties only for the master (e.g. "-Dx=y")
# - SPARK_WORKER_CORES, to set the number of cores to use on this machine
# - SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)
# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
# - SPARK_WORKER_DIR, to set the working directory of worker processes
# - SPARK_WORKER_OPTS, to set config properties only for the worker (e.g. "-Dx=y")
# - SPARK_DAEMON_MEMORY, to allocate to the master, worker and history server themselves (default: 1g).
# - SPARK_HISTORY_OPTS, to set config properties only for the history server (e.g. "-Dx=y")
# - SPARK_SHUFFLE_OPTS, to set config properties only for the external shuffle service (e.g. "-Dx=y")
# - SPARK_DAEMON_JAVA_OPTS, to set config properties for all daemons (e.g. "-Dx=y")
# - SPARK_DAEMON_CLASSPATH, to set the classpath for all daemons
# - SPARK_PUBLIC_DNS, to set the public dns name of the master or workers
Edit slaves:
# A Spark Worker will be started on each of the machines listed below.
slave1
slave2
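Once both files are edited, the configured Spark directory can be pushed to the worker nodes listed in slaves, e.g. (assuming the same /soft/spark layout on every machine):

```shell
# Copy the fully configured Spark directory to each worker node
scp -r /soft/spark slave1:/soft/
scp -r /soft/spark slave2:/soft/
```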
Run ./start-all.sh from the sbin directory. On my machines this failed:
at first it complained that JAVA_HOME was not set, and later:
failed to launch: nice -n 0 /soft/spark/bin/spark-class org.apache.spark.deploy.worker
Fix per https://blog.csdn.net/qq_40707033/article/details/93210838:
add the JAVA_HOME path to /spark/sbin/spark-config.sh.
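For reference, the line added to sbin/spark-config.sh looks like this (the JDK path is hypothetical; use the path where your JDK is actually installed):

```shell
# sbin/spark-config.sh -- make JAVA_HOME visible to the daemon launch scripts,
# since start-all.sh starts workers over ssh in a non-login shell that may not source your profile
export JAVA_HOME=/soft/jdk
```

This is needed because the ssh session that start-all.sh opens on each worker does not necessarily inherit the JAVA_HOME set in your interactive shell profile.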
Run ./start-all.sh from the sbin directory again.
jps on master now shows a Master process; jps on slave1 shows a Worker, and likewise on slave2. Success.
Visit http://master:8888/ — if the page below appears, the cluster is up.
(8888 is the SPARK_MASTER_WEBUI_PORT configured in spark-env.sh.)
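As a final smoke test, you can attach an interactive shell to the new master; the URL below combines the SPARK_MASTER_HOST and SPARK_MASTER_PORT values configured above:

```shell
# Start spark-shell against the standalone master; the web UI at http://master:8888/
# should then list a running application
/soft/spark/bin/spark-shell --master spark://master:7077
```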