Install Spark (cluster environment). Note: the Spark standalone master does not strictly have to run on the Hadoop master node, but in this setup both live on node1, which keeps the configuration simple.
cd /soft
tar -zxvf spark-2.2.1-bin-hadoop2.7.tgz -C /usr/local/
cd /usr/local/spark-2.2.1-bin-hadoop2.7
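A quick sanity check on the extracted distribution: spark-submit ships with Spark and can print the version banner (run from the install directory):
./bin/spark-submit --version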
Environment variables:
echo "export SPARK_HOME=/usr/local/spark-2.2.1-bin-hadoop2.7" >> /etc/profile
echo 'export PATH=$PATH:$SPARK_HOME/bin' >> /etc/profile
source /etc/profile
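To verify the variables took effect in the current shell:
echo $SPARK_HOME
which spark-submit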
Configure the files under /usr/local/spark-2.2.1-bin-hadoop2.7/conf:
cd /usr/local/spark-2.2.1-bin-hadoop2.7/conf
cp spark-env.sh.template spark-env.sh
cp slaves.template slaves
Edit spark-env.sh: vim /usr/local/spark-2.2.1-bin-hadoop2.7/conf/spark-env.sh
Append the following:
export SCALA_HOME=/usr/local/scala-2.12.4                  # Scala installation
export JAVA_HOME=/usr/java/jdk1.8.0                        # JDK 8
export SPARK_HOME=/usr/local/spark-2.2.1-bin-hadoop2.7
export SPARK_MASTER_IP=node1                               # master host; SPARK_MASTER_HOST is the preferred name in Spark 2.x
export SPARK_EXECUTOR_MEMORY=250m                          # small heap sizes for low-memory test VMs
export SPARK_WORKER_MEMORY=250m
export LD_LIBRARY_PATH=/usr/local/hadoop-2.7.4/lib/native  # Hadoop native libraries (see the note below)
export HADOOP_HOME=/usr/local/hadoop-2.7.4
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop             # lets Spark find the HDFS/YARN configuration
Note: without LD_LIBRARY_PATH set, spark-shell emits a warning at startup: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
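To see which native libraries Hadoop can actually load, Hadoop's own checknative diagnostic can be used (path assumes the HADOOP_HOME set above):
/usr/local/hadoop-2.7.4/bin/hadoop checknative -a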
Edit slaves: vim /usr/local/spark-2.2.1-bin-hadoop2.7/conf/slaves
Set its contents to the worker hostnames:
node2
node3
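start-all.sh starts the workers over SSH, so node1 must be able to reach node2 and node3 without a password. This is normally already set up for the Hadoop cluster; if not, a minimal sketch (run as the user that will start Spark; the hadoop user is an assumption here):
ssh-keygen -t rsa
ssh-copy-id hadoop@node2
ssh-copy-id hadoop@node3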
Copy the installation to the other nodes (node2 and node3 also need the /etc/profile entries from above; see the sketch after the scp commands):
sudo scp -r /usr/local/spark-2.2.1-bin-hadoop2.7 node2:/usr/local/
sudo scp -r /usr/local/spark-2.2.1-bin-hadoop2.7 node3:/usr/local/
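scp only copies the Spark directory, so the two /etc/profile entries from above still have to be added on each worker. For example, on node2 (and likewise on node3):
ssh node2
echo 'export SPARK_HOME=/usr/local/spark-2.2.1-bin-hadoop2.7' >> /etc/profile
echo 'export PATH=$PATH:$SPARK_HOME/bin' >> /etc/profile
source /etc/profile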
Grant ownership to the hadoop user:
chown -R hadoop:hadoop /usr/local/spark-2.2.1-bin-hadoop2.7
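Files copied by scp as root end up owned by root on the workers, so the same ownership change is needed on node2 and node3 (assuming root can ssh to them):
ssh node2 chown -R hadoop:hadoop /usr/local/spark-2.2.1-bin-hadoop2.7
ssh node3 chown -R hadoop:hadoop /usr/local/spark-2.2.1-bin-hadoop2.7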
Start the cluster:
cd /usr/local/spark-2.2.1-bin-hadoop2.7/sbin
Run the startup script: ./start-all.sh (invoke it from Spark's sbin directory so it does not collide with Hadoop's script of the same name)
Web UI of the Spark master (node1): http://192.168.209.129:8080/
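To confirm the daemons are up, jps (part of the JDK) should list a Master process on node1 and a Worker process on node2 and node3:
jps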
Step 2: run the SparkPi demo that ships with Spark:
Run the demo in local mode:
cd /usr/local/spark-2.2.1-bin-hadoop2.7
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local examples/jars/spark-examples_2.11-2.2.1.jar
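On success, the output contains a line of the form below; the digits vary between runs because SparkPi estimates π by random sampling:
Pi is roughly 3.14...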
Run the demo in standalone mode (on the cluster):
cd /usr/local/spark-2.2.1-bin-hadoop2.7
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://192.168.209.129:7077 examples/jars/spark-examples_2.11-2.2.1.jar
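The master URL must match the one shown at the top of the web UI (spark://host:7077 by default). SparkPi also takes an optional argument for the number of sampling partitions, e.g. 100:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://192.168.209.129:7077 examples/jars/spark-examples_2.11-2.2.1.jar 100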