Spark Cluster Setup
a) Copy the spark directory to the other hosts.
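   For example, assuming password-less SSH between the hosts and a user that can write to /soft (both assumptions; adjust to your layout):
       $>scp -r /soft/spark s202:/soft/
       $>scp -r /soft/spark s203:/soft/
       $>scp -r /soft/spark s204:/soft/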
b) Configure the environment variables on every other host.
[/etc/profile]
SPARK_HOME
PATH
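   The two entries above would look roughly like this, assuming Spark is installed under /soft/spark (the path used elsewhere in these notes):
       export SPARK_HOME=/soft/spark
       export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
   Then re-source the file (or log in again) on each host so the variables take effect:
       $>source /etc/profile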
c) Configure the slaves file on the master node, listing the worker hostnames one per line.
[/soft/spark/conf/slaves]
s202
s203
s204
d) Start the Spark cluster.
/soft/spark/sbin/start-all.sh
   Note: this step may fail with the error "JAVA_HOME is not set",
   even though JAVA_HOME is definitely set in the environment variables.
   The cause is that start-all.sh reaches the workers over non-interactive SSH sessions, which do not source /etc/profile.
   Fix: add JAVA_HOME to spark-config.sh under Spark's sbin directory:
[/soft/spark/sbin/spark-config.sh]
export JAVA_HOME=/soft/jdk
e) Check the processes.
   $>xcall.sh jps        //xcall.sh: custom helper that runs jps on every node
       Master        //s201
       Worker        //s202
       Worker        //s203
       Worker        //s204
f) Web UI
http://s201:8080/
Submitting a job jar to the fully distributed Spark cluster
1. Start the Hadoop cluster (only HDFS is needed, since the job reads its input from HDFS).
$>start-dfs.sh
2. Put the input file into HDFS, as shown below.
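   A minimal example, assuming the target path matches the HDFS URL passed to spark-submit in the next step:
       $>hdfs dfs -mkdir -p /user/centos
       $>hdfs dfs -put test.txt /user/centos/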
3. Run spark-submit.
   $>spark-submit \
       --master spark://s201:7077 \
       --name MyWordCount \
       --class com.mao.scala.scala.WordCountScala \
       SparkDemo1-1.0-SNAPSHOT.jar \
       hdfs://s201:8020/user/centos/test.txt
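   For reference, a minimal sketch of what the submitted class might look like. The package and class name come from the submit command above; everything else (reading the input path from args(0), splitting on spaces) is an assumption, not the actual demo code:

       package com.mao.scala.scala

       import org.apache.spark.{SparkConf, SparkContext}

       object WordCountScala {
         def main(args: Array[String]): Unit = {
           // The master URL comes from spark-submit (--master), so it is not hard-coded here.
           val conf = new SparkConf().setAppName("MyWordCount")
           val sc   = new SparkContext(conf)

           sc.textFile(args(0))        // e.g. hdfs://s201:8020/user/centos/test.txt
             .flatMap(_.split(" "))    // split each line into words (assumed delimiter)
             .map((_, 1))              // pair each word with a count of 1
             .reduceByKey(_ + _)       // sum the counts per word
             .collect()                // small result set; bring it back to the driver
             .foreach(println)

           sc.stop()
         }
       }

   Because the master URL is supplied on the command line rather than in the code, the same jar can also be tested locally with --master local[*] before submitting it to the cluster.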