Spark 3.1.2 High-Availability Deployment Guide
Extract and rename
tar -zxvf spark-3.1.2-bin-hadoop2.7.tgz -C /opt/
cd /opt/
mv spark-3.1.2-bin-hadoop2.7/ spark
cd spark/conf
Create symlinks to the Hadoop configuration files
ln -s /opt/hadoop/etc/hadoop/core-site.xml
ln -s /opt/hadoop/etc/hadoop/hdfs-site.xml
Add the hive-site.xml configuration file
touch hive-site.xml
vim hive-site.xml
hive-site.xml contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://server3:3306/hive_db?createDatabaseIfNotExist=true&amp;useSSL=false</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>datanucleus.schema.autoCreateTables</name>
<value>true</value>
</property>
</configuration>
Start Hive and initialize the metastore (Hive 2.3.x must be installed)
Note: Spark is not good at initializing the metastore schema on its own; run the initialization manually from the Hive installation directory instead.
Initialization command:
schematool -dbType mysql -initSchema
Edit spark-env.sh
Add the following configuration:
export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
export JAVA_HOME=/opt/jdk
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=server1:2181,server2:2181,server3:2181
-Dspark.deploy.zookeeper.dir=/spark"
Edit workers
The hostname-to-IP mappings are already configured here (e.g. in /etc/hosts):
Node | Hostname |
---|---|
Node 1 | server1 |
Node 2 | server2 |
Node 3 | server3 |
workers contents:
server1
server2
Start Spark
On node 3, run:
sbin/start-all.sh
On node 2, run:
sbin/start-master.sh
Verification
Open the Master node's address on port 8080 in a browser to view the Spark web UI.
The hostname mappings are also configured on the browser host, so you can use:
server2:8080
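With two masters running, exactly one should report ALIVE and the other STANDBY. You can check this without a browser by scraping the status line from each web UI; a small sketch (the `parse_status` helper is ours, not part of Spark, and it assumes the UIs are reachable from the shell):

```shell
# parse_status pulls the master state (ALIVE or STANDBY) out of the
# master web UI's HTML read from stdin.
parse_status() {
  grep -oE 'ALIVE|STANDBY' | head -n 1
}

# Against the live cluster:
# for m in server2 server3; do
#   printf '%s: %s\n' "$m" "$(curl -s "http://$m:8080" | parse_status)"
# done
```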
Test the Hive integration
Before starting spark-sql, remember to start Hive's metastore service (the command is given below); if you are not sure how, starting the Hive CLI also works.
bin/spark-sql \
--master spark://server3:7077 \
--driver-class-path /opt/mysql-connector-java-5.1.49/mysql-connector-java-5.1.49-bin.jar
show databases;
show tables;
The command to start the metastore service:
hive --service metastore
Note: do not exit after starting it; open another session window to continue.
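If you script the startup, you can have the spark-sql step wait until the metastore is actually accepting connections instead of racing it. A bash sketch; `wait_for_port` is our own helper (not a Hive or Spark command), and 9083 is assumed to be the metastore's default listen port — check hive.metastore.uris in your hive-site.xml if you have overridden it:

```shell
# wait_for_port: poll host:port until a TCP connection succeeds or the
# tries run out. Uses bash's /dev/tcp redirection, so run with bash, not sh.
wait_for_port() {
  host=$1; port=$2; tries=${3:-10}; i=0
  while [ "$i" -lt "$tries" ]; do
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0   # port is accepting connections
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1       # gave up
}

# Usage: block until the metastore answers, then start spark-sql:
# wait_for_port server3 9083 && bin/spark-sql --master spark://server3:7077 ...
```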