Spark Setup
1. Download spark-2.1.0-bin-hadoop2.7.tgz
2. Upload it to the app directory
3. Extract spark-2.1.0-bin-hadoop2.7.tgz
[hadoop@cdh01 app]$ tar -zxvf spark-2.1.0-bin-hadoop2.7.tgz
4. Remove the archive
[hadoop@cdh01 app]$ rm -f spark-2.1.0-bin-hadoop2.7.tgz
5. Create a symlink
[hadoop@cdh01 app]$ ln -s spark-2.1.0-bin-hadoop2.7 spark
6. Upload scala-2.11.8.tgz to the app directory
7. Extract it
[hadoop@cdh01 app]$ tar -zxvf scala-2.11.8.tgz
8. Remove the archive
[hadoop@cdh01 app]$ rm -f scala-2.11.8.tgz
9. Create a symlink
[hadoop@cdh01 app]$ ln -s scala-2.11.8 scala
10. Configure environment variables
[hadoop@cdh01 app]$ vi ~/.bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
# User specific aliases and functions
JAVA_HOME=/home/hadoop/app/jdk
SCALA_HOME=/home/hadoop/app/scala
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:/home/hadoop/tools:/home/hadoop/app/hive/lib:$PATH
export JAVA_HOME CLASSPATH PATH SCALA_HOME
11. Apply the changes
[hadoop@cdh01 app]$ source ~/.bashrc
12. Check the version
[hadoop@cdh01 app]$ scala -version
13. Distribute Scala to the slave nodes with the deploy script
[hadoop@cdh01 app]$ deploy.sh scala-2.11.8 /home/hadoop/app/ slave
14. Repeat steps 10–13 on the other two nodes
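deploy.sh above is a custom helper script, not part of Scala or Spark. A minimal sketch of what it presumably does is shown below; the host names are assumptions, and the real script likely maps the group name `slave` to its node list via a config file.

```shell
#!/bin/bash
# Hypothetical sketch of deploy.sh: copy a file or directory to the same
# destination path on every node in a group. Requires passwordless scp.
deploy() {
  local src=$1 dest=$2
  shift 2
  for host in "$@"; do
    scp -r "$src" "$host:$dest"
  done
}

# Usage (roughly equivalent to: deploy.sh scala-2.11.8 /home/hadoop/app/ slave):
#   deploy scala-2.11.8 /home/hadoop/app/ cdh02 cdh03
```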
Spark Standalone Setup
1. Remove the existing spark symlink
[hadoop@cdh01 app]$ rm -rf spark
2. Make a backup copy
[hadoop@cdh01 app]$ cp -r spark-2.1.0-bin-hadoop2.7 spark-alone
3. Recreate the symlink
[hadoop@cdh01 app]$ ln -s spark-2.1.0-bin-hadoop2.7 spark
4. Enter the conf directory
[hadoop@cdh01 spark]$ cd conf
5. Rename the template
[hadoop@cdh01 conf]$ mv spark-env.sh.template spark-env.sh
6. Edit the file
[hadoop@cdh01 conf]$ vi spark-env.sh
# Options read by executors and drivers running inside the cluster
export JAVA_HOME=/home/hadoop/app/jdk
export HADOOP_CONF_DIR=/home/hadoop/app/hadoop/etc/hadoop
export HADOOP_HOME=/home/hadoop/app/hadoop
SPARK_MASTER_WEBUI_PORT=8888
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=cdh01:2181,cdh02:2181,cdh03:2181 -Dspark.deploy.zookeeper.dir=/myspark"
SPARK_CONF_DIR=/home/hadoop/app/spark/conf
SPARK_LOG_DIR=/home/hadoop/data/spark/logs
SPARK_PID_DIR=/home/hadoop/data/spark/logs
7. Rename and edit the slaves file
[hadoop@cdh01 conf]$ mv slaves.template slaves
[hadoop@cdh01 conf]$ vi slaves
cdh01
cdh02
cdh03
8. Sync the Spark directory to the other nodes
[hadoop@cdh01 app]$ deploy.sh spark-2.1.0-bin-hadoop2.7 /home/hadoop/app/ slave
9. Create the symlink on the other nodes as well
[hadoop@cdh01 app]$ ln -s spark-2.1.0-bin-hadoop2.7 spark
10. Create the log directory (matching SPARK_LOG_DIR) on all nodes via the remote script
[hadoop@cdh01 app]$ runRemoteCmd.sh "mkdir -p /home/hadoop/data/spark/logs" all
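Like deploy.sh, runRemoteCmd.sh is a custom helper. A sketch of its presumed behavior, with the node list hard-coded as an assumption (the real script presumably resolves the group name `all` from a config file):

```shell
#!/bin/bash
# Hypothetical sketch of runRemoteCmd.sh: run one command on every node in a
# group over ssh. Requires passwordless ssh between the nodes.
run_remote() {
  local cmd=$1
  shift
  for host in "$@"; do
    echo "*** $host ***"
    ssh "$host" "$cmd"
  done
}

# Usage (roughly equivalent to: runRemoteCmd.sh "mkdir -p /home/hadoop/data/spark/logs" all):
#   run_remote "mkdir -p /home/hadoop/data/spark/logs" cdh01 cdh02 cdh03
```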
11. Copy core-site.xml from hadoop/etc/hadoop into spark/conf
[hadoop@cdh01 hadoop]$ cp core-site.xml /home/hadoop/app/spark/conf/
12. Sync it to the other two nodes
[hadoop@cdh01 hadoop]$ deploy.sh core-site.xml /home/hadoop/app/spark/conf/ slave
13. Start the Spark shell
[hadoop@cdh01 spark]$ ./bin/spark-shell
Spark Test
1. Start ZooKeeper
[hadoop@cdh01 hadoop]$ runRemoteCmd.sh "/home/hadoop/app/zookeeper/bin/zkServer.sh start" all
2. Start HDFS
[hadoop@cdh01 hadoop]$ sbin/start-dfs.sh
3. Start Spark on the master node
[hadoop@cdh01 spark]$ sbin/start-all.sh
4. Start a standby Master on the second node
[hadoop@cdh02 spark]$ sbin/start-master.sh
5. Web UI: open the following address in a browser
cdh01:8888
6. Start the Spark shell against the standalone master
[hadoop@cdh01 ~]$ /home/hadoop/app/spark/bin/spark-shell \
--master spark://cdh01:7077 \
--executor-memory 500m \
--total-executor-cores 1
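For a non-interactive smoke test of the cluster, the bundled SparkPi example can be submitted instead of opening a shell. The command is assembled into a variable first so it can be inspected before running; the examples-jar path below is the standard location inside a Spark 2.1.0 distribution (adjust if your layout differs).

```shell
# Build the spark-submit command for the bundled SparkPi example.
SPARK_HOME=/home/hadoop/app/spark
CMD="$SPARK_HOME/bin/spark-submit --master spark://cdh01:7077 \
--class org.apache.spark.examples.SparkPi \
$SPARK_HOME/examples/jars/spark-examples_2.11-2.1.0.jar 10"
echo "$CMD"
# Run it with: eval "$CMD"
```

A successful run prints a line like "Pi is roughly 3.14..." among the driver logs.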