Zeppelin Installation and Deployment
Cluster plan
No. | IP | Host alias | Roles | Clusters |
---|---|---|---|---|
1 | 192.168.137.110 | node1 | NameNode (Active), DFSZKFailoverController (ZKFC), ResourceManager, mysql, RunJar (Hive server: metastore), RunJar (Hive server: hiveserver2), ZeppelinServer | Hadoop |
2 | 192.168.137.111 | node2 | DataNode, JournalNode, QuorumPeerMain, NodeManager, RunJar (Hive client, present when started) | Zookeeper, Hadoop |
3 | 192.168.137.112 | node3 | DataNode, JournalNode, QuorumPeerMain, NodeManager, RunJar (Hive client, present when started) | Zookeeper, Hadoop |
4 | 192.168.137.113 | node4 | DataNode, JournalNode, QuorumPeerMain, NodeManager, RunJar (Hive client, present when started) | Zookeeper, Hadoop |
5 | 192.168.137.114 | node5 | NameNode (Standby), DFSZKFailoverController (ZKFC), ResourceManager, JobHistoryServer, RunJar (Hive client, present when started) | Hadoop |
Software versions
Tool | Notes |
---|---|
VMware-workstation-full-15.5.1-15018445.exe | VMware installer |
MobaXterm_Portable_v20.3.zip | Unzip to use; remote terminal for the CentOS systems, supports SSH login and file upload |
CentOS-7-x86_64-DVD-1511.iso | CentOS 7 ISO image; used directly by VMware during install, no unpacking needed |
jdk-8u171-linux-x64.tar.gz | JDK archive; upload to the CentOS systems |
hadoop-2.7.3.tar.gz | Hadoop archive; upload to the virtual machines |
apache-flume-1.7.0-bin.tar.gz | Flume archive; upload to the CentOS systems |
zeppelin-0.10.1-bin-all.tgz | Zeppelin archive; upload to the CentOS systems |
zookeeper-3.4.5.tar.gz | ZooKeeper archive; upload to the CentOS systems |
apache-hive-2.3.9-bin.tar.gz | Hive archive; upload to the CentOS systems |
For the Hadoop HA deployment steps, see:
https://blog.csdn.net/pblh123/article/details/126715861
For the Hive installation steps, see:
https://blog.csdn.net/pblh123/article/details/126990820
Zeppelin installation and deployment
Configure Hadoop's hdfs-site.xml
# hdfs-site.xml
vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Number of block replicas; defaults to 3 and should not exceed the number of DataNodes</description>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
    <description>If true, enable permission checking; if false, do not check (anyone can access any file)</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/soft_installed/hadoop-2.7.3/hadoopdatas/dfs/name</value>
    <description>Where the NameNode stores the HDFS namespace metadata</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/soft_installed/hadoop-2.7.3/hadoopdatas/dfs/data</value>
    <description>Physical location of data blocks on each DataNode</description>
  </property>
  <!-- HDFS HA settings below -->
  <!-- The HDFS nameservice ID is lh1; it must match core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>lh1</value>
    <description>HDFS nameservice ID lh1; must be consistent with core-site.xml</description>
  </property>
  <property>
    <name>dfs.ha.namenodes.lh1</name>
    <value>nn1,nn2</value>
    <description>IDs of the two NameNodes in the lh1 nameservice</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.lh1.nn1</name>
    <value>node1:9000</value>
    <description>RPC address of nn1</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.lh1.nn1</name>
    <value>node1:50070</value>
    <description>HTTP address of nn1</description>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.lh1.nn2</name>
    <value>node5:9000</value>
    <description>RPC address of nn2</description>
  </property>
  <property>
    <name>dfs.namenode.http-address.lh1.nn2</name>
    <value>node5:50070</value>
    <description>HTTP address of nn2</description>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://node2:8485;node3:8485;node4:8485/lh1</value>
    <description>Where the NameNode's edit log is stored on the JournalNodes</description>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/soft_installed/hadoop-2.7.3/hadoopdatas/journal</value>
    <description>Local directory for JournalNode metadata and log files</description>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled.lh1</name>
    <value>true</value>
    <description>Enable automatic NameNode failover</description>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.lh1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    <description>Class that implements client-side failover</description>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
    <description>Fencing methods; when several are configured, list one per line</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
    <description>The sshfence method requires passwordless SSH login</description>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
    <description>Timeout for the sshfence method, in milliseconds</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
    <description>Enable the WebHDFS REST interface</description>
  </property>
</configuration>
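Once the cluster is (re)started, `hdfs getconf` can confirm the values Hadoop actually resolves from this file; a small sketch (assumes the Hadoop `bin` directory is on `PATH`):

```shell
# Sketch: print the effective HA settings resolved from hdfs-site.xml.
NS=lh1
hdfs getconf -confKey dfs.nameservices       || echo "hdfs not on PATH?"  # expect: lh1
hdfs getconf -confKey "dfs.ha.namenodes.$NS" || echo "hdfs not on PATH?"  # expect: nn1,nn2
```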
Configure core-site.xml
[root@master hadoop]# vim core-site.xml
<configuration>
  <!-- choose one of fs.default.name / fs.defaultFS -->
  <!--
  <property>
    <name>fs.default.name</name>
    <value>hdfs://node1:9000</value>
    <description>HDFS URL for a non-HA deployment</description>
  </property>
  -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://lh1</value>
    <description>HDFS URL for the HA deployment</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/soft_installed/hadoop-2.7.3/tmp</value>
    <description>Local Hadoop temp directory on each node</description>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node2:2181,node3:2181,node4:2181</value>
    <description>ZooKeeper quorum for HDFS HA</description>
  </property>
  <!-- allow the root account to access HDFS from any node -->
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
Distribute the updated configuration and start the cluster
# distribute to each of node2 through node5
# to node2
scp core-site.xml node2:/opt/soft_installed/hadoop-2.7.3/etc/hadoop
scp hdfs-site.xml node2:/opt/soft_installed/hadoop-2.7.3/etc/hadoop
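The same two files must also reach node3, node4, and node5; a loop sketch covering all four targets (assumes passwordless SSH from node1, which the HA setup already requires):

```shell
# Distribute the updated config files to the remaining nodes in one pass.
CONF=/opt/soft_installed/hadoop-2.7.3/etc/hadoop
for host in node2 node3 node4 node5; do
  scp "$CONF/core-site.xml" "$CONF/hdfs-site.xml" "$host:$CONF/" \
    || echo "copy to $host failed"
done
```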
On node1 (the Hive server machine), add the following to conf/hive-site.xml under the Hive installation directory:

```xml
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
```

Start the cluster:

```shell
/home/lh/scripts/HA_hadoop.sh start
```

Start hiveserver2:

```shell
/home/lh/scripts/onekeyhive.sh start
```
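With doAs disabled and hiveserver2 up, a quick connectivity check can be done from the shell. This is a sketch: beeline ships with Hive, the host is node1 as in this guide, and the default hiveserver2 port 10000 and the root user are assumptions to adjust as needed:

```shell
# Hypothetical smoke test: connect to hiveserver2 and list databases.
HS2_URL="jdbc:hive2://node1:10000"
beeline -u "$HS2_URL" -n root -e 'show databases;' \
  || echo "hiveserver2 not reachable yet"
```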
Zeppelin configuration
```shell
tar -zxvf zeppelin-0.10.1-bin-all.tgz -C /opt/soft_installed/
# edit zeppelin-site.xml (the template lives in the conf directory)
cd /opt/soft_installed/zeppelin-0.10.1-bin-all/conf/
cp zeppelin-site.xml.template zeppelin-site.xml
vim zeppelin-site.xml
<property>
<name>zeppelin.server.addr</name>
<value>node1</value>
<description>Server binding address</description>
</property>
<property>
<name>zeppelin.server.port</name>
<value>9980</value>
<description>Server port.</description>
</property>
# edit zeppelin-env.sh
vim zeppelin-env.sh
export JAVA_HOME=/opt/soft_installed/jdk1.8.0_171
export HADOOP_CONF_DIR=/opt/soft_installed/hadoop-2.7.3/etc/hadoop
# copy Hive's hive-site.xml into Zeppelin's conf directory
cp /opt/soft_installed/apache-hive-2.3.9-bin/conf/hive-site.xml /opt/soft_installed/zeppelin-0.10.1-bin-all/conf/
# copy the required Hive and Hadoop jars into Zeppelin's jdbc interpreter directory
cd /opt/soft_installed/zeppelin-0.10.1-bin-all/interpreter/jdbc/
cp /opt/soft_installed/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/curator-client-2.7.1.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/guava-14.0.1.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/hive-jdbc-2.3.9.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/hive-common-2.3.9.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/hive-serde-2.3.9.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/hive-service-2.3.9.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/hive-service-rpc-2.3.9.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/libthrift-0.9.3.jar .
cp /opt/soft_installed/apache-hive-2.3.9-bin/lib/protobuf-java-2.5.0.jar .
cp /opt/soft_installed/hadoop-2.7.3/share/hadoop/common/lib/commons-lang-2.6.jar .
# also upload httpclient-4.5.13.jar and httpcore-4.4.15.jar into /opt/soft_installed/zeppelin-0.10.1-bin-all/interpreter/jdbc/
# start zeppelin
[root@master soft_installed]# /opt/soft_installed/zeppelin-0.10.1-bin-all/bin/zeppelin-daemon.sh start
Log dir doesn't exist, create /opt/soft_installed/zeppelin-0.10.1-bin-all/logs
Pid dir doesn't exist, create /opt/soft_installed/zeppelin-0.10.1-bin-all/run
Zeppelin start [ OK ]
[root@master soft_installed]# jps
87953 Jps
4977 RunJar
3238 ResourceManager
4808 RunJar
87864 ZeppelinServer
2828 NameNode
2590 DFSZKFailoverController
# stop zeppelin
/opt/soft_installed/zeppelin-0.10.1-bin-all/bin/zeppelin-daemon.sh stop
```
Verify in a browser that Zeppelin started correctly (http://node1:9980, as configured above)
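The browser check can also be scripted: Zeppelin exposes a REST API, and `/api/version` makes a convenient liveness probe (host and port are the values set in zeppelin-site.xml above):

```shell
# Probe the Zeppelin server; prints version info as JSON when it is up.
ZEPPELIN_URL="http://node1:9980"
curl -s "$ZEPPELIN_URL/api/version" || echo "Zeppelin not reachable at $ZEPPELIN_URL"
```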
Configure Zeppelin to connect to Hive
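A common way to wire this up is through Zeppelin's stock jdbc interpreter: in the Zeppelin web UI open Interpreter, edit the jdbc interpreter, and point it at hiveserver2. This is a sketch consistent with the hosts used in this guide; node1:10000 assumes the default hiveserver2 port, and the user matches the proxyuser settings in core-site.xml:

```properties
default.driver=org.apache.hive.jdbc.HiveDriver
default.url=jdbc:hive2://node1:10000
default.user=root
```

After saving and restarting the interpreter, paragraphs prefixed with %jdbc should run Hive queries.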
Zeppelin one-key start/stop script
[root@master scripts]# cat onekeyzeppelin.sh
#! /bin/bash
# one-key start/stop for the Zeppelin service
case $1 in
"start")
echo "========== now start zeppelin =========="
/opt/soft_installed/zeppelin-0.10.1-bin-all/bin/zeppelin-daemon.sh start;;
"stop")
echo "========== now stop zeppelin =========="
/opt/soft_installed/zeppelin-0.10.1-bin-all/bin/zeppelin-daemon.sh stop;;
*)
echo "Invalid args!"
echo "Usage: $(basename $0) start|stop";;
esac