Table of Contents
4.1 Hadoop Distributed File System: pseudo-distributed / fully distributed cluster setup, hot-adding nodes
5. Distributed computing
How it works: YARN is only a simple scheduler.
The resource manager simplifies job scheduling and resource scheduling:
RM (ResourceManager): does pure resource scheduling and management; it carries no workload of its own.
NM (NodeManager): the worker side of YARN. Each NM manages the resources of its own machine and keeps sending its state to the RM, so the RM always knows what resources every NM has.
AM (ApplicationMaster): tracks the progress of the whole application, so users do not need to query the RM themselves. While the job runs, tasks send heartbeat messages to the AM to report progress.
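The heartbeat-driven resource view described above can be sketched as a toy model (illustrative Python, not Hadoop code; the class and the numbers are invented for the example): each NM heartbeat simply overwrites the RM's record for that node.

```python
# Toy sketch (not Hadoop source): the RM tracks cluster resources
# purely from periodic NodeManager heartbeats.
class ResourceManager:
    def __init__(self):
        self.nodes = {}  # hostname -> latest reported free resources

    def heartbeat(self, hostname, free_mem_mb, free_vcores):
        # Each NM periodically reports its own state; the RM just records it.
        self.nodes[hostname] = {"mem_mb": free_mem_mb, "vcores": free_vcores}

    def total_free_memory(self):
        return sum(n["mem_mb"] for n in self.nodes.values())

rm = ResourceManager()
rm.heartbeat("server22", 1024, 2)
rm.heartbeat("server23", 2048, 4)
print(rm.total_free_memory())  # → 3072
```

Because a newer heartbeat replaces the old record, the RM's view converges to the current state of each node without the RM ever polling the NMs.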
- Edit the MapReduce configuration file
[hadoop@server21 ~]$ cd hadoop/etc/hadoop/
[hadoop@server21 hadoop]$ vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
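Before restarting the daemons, it can be worth sanity-checking that a `*-site.xml` file is well-formed and contains the property you expect. A minimal sketch using only Python's standard library (the helper name `get_property` is mine, not part of Hadoop):

```python
# Sanity-check a Hadoop *-site.xml: parse it and look up a property by name.
import tempfile
import xml.etree.ElementTree as ET

def get_property(path, name):
    root = ET.parse(path).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# Demo against the mapred-site.xml fragment shown above:
xml = """<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>"""
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
    f.write(xml)
print(get_property(f.name, "mapreduce.framework.name"))  # → yarn
```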
- Edit the Hadoop environment script
[hadoop@server21 hadoop]$ vim hadoop-env.sh
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop
- Edit the YARN (resource manager) configuration file
[hadoop@server21 hadoop]$ vim yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
- Start YARN; the NameNode host starts a ResourceManager process (the resource manager)
[hadoop@server21 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server21 hadoop]$ sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers
server24: Warning: Permanently added 'server24,172.25.21.24' (ECDSA) to the list of known hosts.
[hadoop@server21 hadoop]$ jps
17752 SecondaryNameNode
17562 NameNode
19101 Jps
18974 ResourceManager
- Each DataNode host starts a NodeManager process
[hadoop@server22 ~]$ jps
15536 Jps
15442 NodeManager
4698 DataNode
6. High availability of the HDFS NameNode (NN)
6.1 Preparation: add a new node (server25)
JN: JournalNode, which stores the shared edit log of the NameNodes.
In production it is recommended to separate the NN and the RM: the NN is itself a heavy resource consumer, and the RM's resource scheduling also uses a lot of resources; running the two together risks resource contention.
In this lab the NN and RM are combined:
server2~4: a minimal cluster, with the JournalNode, ZK node, DataNode and NodeManager combined on each host.
server1 and server5: HA master and backup, running the failover controller, RM and NN.
(Note: the host machine has only 8G of memory, so allocate memory sensibly when creating the new virtual machines.)
NameNode ——> 2G
DataNode ——> 1G
- Add the hosts to the local resolution file
[root@server25 ~]# vim /etc/hosts
- Create the regular user
[root@server25 ~]# useradd hadoop
- Install the NFS utilities
[root@server25 ~]# yum install -y nfs-utils
- Mount the NameNode's shared directory /home/hadoop
[root@server25 ~]# mount 172.25.21.21:/home/hadoop /home/hadoop
[root@server25 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/rhel-root 17811456 1164528 16646928 7% /
devtmpfs 1011448 0 1011448 0% /dev
tmpfs 1023468 0 1023468 0% /dev/shm
tmpfs 1023468 16996 1006472 2% /run
tmpfs 1023468 0 1023468 0% /sys/fs/cgroup
/dev/vda1 1038336 135076 903260 14% /boot
tmpfs 204696 0 204696 0% /run/user/0
172.25.21.21:/home/hadoop 17811456 3206912 14604544 19% /home/hadoop
[root@server25 ~]# su - hadoop
6.2 Preparation: cleanup
- Because the distributed-computing experiment was just run, the data must be cleaned up before the HA experiment
[hadoop@server21 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server21 hadoop]$ sbin/stop-yarn.sh
Stopping nodemanagers
server22: WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9
server24: WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9
server23: WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9
Stopping resourcemanager
[hadoop@server21 hadoop]$ jps
19525 Jps
17752 SecondaryNameNode
17562 NameNode
Processes on a DataNode:
[hadoop@server22 ~]$ jps
15536 Jps
15442 NodeManager
4698 DataNode
[hadoop@server22 ~]$ jps
15695 Jps
4698 DataNode
- Stop Hadoop
[hadoop@server21 hadoop]$ sbin/stop-dfs.sh
Stopping namenodes on [server21]
Stopping datanodes
Stopping secondary namenodes [server21]
[hadoop@server21 hadoop]$ jps
19949 Jps
Processes on the DataNodes:
[hadoop@server22 ~]$ jps
15695 Jps
[hadoop@server23 ~]$ jps
15414 Jps
[hadoop@server24 hadoop]$ jps
15656 Jps
- Because of the earlier experiments (fully distributed plus distributed computing), Hadoop has generated a lot of temporary data under /tmp, so before the HA experiment we need to clear out the old data.
(Both the NameNode and the DataNodes need this cleanup.)
[hadoop@server21 hadoop]$ rm -fr /tmp/*
[hadoop@server22 ~]$ rm -fr /tmp/*
[hadoop@server23 ~]$ rm -fr /tmp/*
[hadoop@server24 hadoop]$ rm -fr /tmp/*
6.3 Preparation: install zookeeper and build the ZK cluster
The ZK cluster needs at least 3 nodes.
Note: the ensemble should have an odd number of nodes; do not use an even count.
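The odd-number rule follows from quorum arithmetic. ZooKeeper keeps serving only while a strict majority of the ensemble is alive, so the number of tolerable failures is n minus the smallest majority. A quick check (plain Python, just the arithmetic):

```python
# Why odd ensemble sizes: ZooKeeper needs a strict majority (quorum) to serve,
# so the number of node failures tolerated is n - (n//2 + 1).
def failures_tolerated(n):
    quorum = n // 2 + 1          # smallest strict majority
    return n - quorum

for n in (3, 4, 5, 6):
    print(n, failures_tolerated(n))
# 3 and 4 nodes both tolerate only 1 failure; 5 and 6 both tolerate 2,
# so the extra even node adds cost without adding fault tolerance.
```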
- Install zookeeper
[hadoop@server21 ~]$ tar zxf zookeeper-3.4.9.tar.gz
- Copy the sample file to create zookeeper's main configuration file
[hadoop@server21 ~]$ cd zookeeper-3.4.9/conf/
[hadoop@server21 conf]$ ls
configuration.xsl log4j.properties zoo_sample.cfg
[hadoop@server21 conf]$ cp zoo_sample.cfg zoo.cfg
- Edit the configuration file: define the 3 ensemble nodes using the DataNodes' IPs.
Note that in `server.1` the "1" is not part of a hostname; it is the server id within the ensemble, and server id 1 maps to 172.25.21.22.
- 2888: used for synchronization and other communication between ensemble nodes (peer port)
- 3888: used for leader election between ensemble nodes (election port)
These two ports are what allow one node to be elected leader, with the remaining 2 nodes as followers.
[hadoop@server21 conf]$ vim zoo.cfg
server.1=172.25.21.22:2888:3888
server.2=172.25.21.23:2888:3888
server.3=172.25.21.24:2888:3888
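Each of the lines above packs three things into one entry. A small parsing sketch to make the format explicit (the function name is mine; this is not ZooKeeper code):

```python
# Sketch: what each zoo.cfg "server.N=host:peerPort:electionPort" line encodes.
def parse_server_line(line):
    key, value = line.split("=")
    sid = int(key.split(".")[1])          # server id (must match dataDir/myid)
    host, peer_port, election_port = value.split(":")
    return sid, host, int(peer_port), int(election_port)

sid, host, peer, election = parse_server_line("server.1=172.25.21.22:2888:3888")
print(sid, host, peer, election)  # → 1 172.25.21.22 2888 3888
```

The server id extracted here is exactly the number that must be written into each node's myid file in the next step.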
- On the DataNode (server22), create a data directory, and inside it create a file myid containing that node's id, i.e. write the id to /tmp/zookeeper/myid.
(Do the same on server23 and server24.)
[hadoop@server22 ~]$ mkdir /tmp/zookeeper
[hadoop@server22 ~]$ echo 1 > /tmp/zookeeper/myid
[hadoop@server23 ~]$ mkdir /tmp/zookeeper
[hadoop@server23 ~]$ echo 2 > /tmp/zookeeper/myid
[hadoop@server24 hadoop]$ mkdir /tmp/zookeeper
[hadoop@server24 hadoop]$ echo 3 > /tmp/zookeeper/myid
- Start zookeeper on server22 and check the node's role (all 3 nodes must be started).
The status output shows server22 was elected as a follower.
[hadoop@server22 ~]$ cd zookeeper-3.4.9/
[hadoop@server22 zookeeper-3.4.9]$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server22 ~]$ cd zookeeper-3.4.9/
[hadoop@server22 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
- Start zookeeper on server23 and check the node's role.
The status output shows server23 was elected as the leader.
[hadoop@server23 ~]$ cd zookeeper-3.4.9/
[hadoop@server23 zookeeper-3.4.9]$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server23 ~]$ cd zookeeper-3.4.9/
[hadoop@server23 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader
- Start zookeeper on server24 (the third of the 3 nodes)
[hadoop@server24 ~]$ cd zookeeper-3.4.9/
[hadoop@server24 zookeeper-3.4.9]$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
- Check the current processes with jps
[hadoop@server22 zookeeper-3.4.9]$ jps
15768 Jps
15741 QuorumPeerMain
[hadoop@server23 zookeeper-3.4.9]$ jps
15452 QuorumPeerMain
15485 Jps
6.4 The distributed HA cluster
- Edit the core configuration file (2 changes)
(1) For high availability (server5 takes over when server1 fails), the NN address in the core configuration must not be a fixed host; write the nameservice `masters` instead.
(Note that no port can be given here, because the ports are specified separately in the hdfs configuration file.)
(Note that whether server1 or server5 is the active master depends on which of the two currently holds the active role.)
(2) Also tell the cluster where the ZK ensemble lives.
[hadoop@server21 ~]$ cd hadoop/etc/hadoop/
[hadoop@server21 hadoop]$ vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://masters</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>172.25.21.22:2181,172.25.21.23:2181,172.25.21.24:2181</value>
</property>
</configuration>
- Edit the hdfs distributed file system configuration
(1) Set the replica count to 3
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
(2) Set the HDFS nameservice to `masters`; it must match the value in the core configuration file
<property>
<name>dfs.nameservices</name>
<value>masters</value>
</property>
(3) Define the names of the two master nodes; here I name them h1 and h2
(the `masters` nameservice contains these 2 nodes)
<property>
<name>dfs.ha.namenodes.masters</name>
<value>h1,h2</value>
</property>
(4) Tell HDFS which host h1 is and its RPC communication port
<property>
<name>dfs.namenode.rpc-address.masters.h1</name>
<value>172.25.21.21:9000</value>
</property>
(5) Tell HDFS h1's HTTP address (on Hadoop 3.x change the port to 9870)
<property>
<name>dfs.namenode.http-address.masters.h1</name>
<value>172.25.21.21:50070</value>
</property>
(6) Tell HDFS which host h2 is and its RPC communication port
<property>
<name>dfs.namenode.rpc-address.masters.h2</name>
<value>172.25.21.25:9000</value>
</property>
(7) Tell HDFS h2's HTTP address (on Hadoop 3.x change the port to 9870)
<property>
<name>dfs.namenode.http-address.masters.h2</name>
<value>172.25.21.25:50070</value>
</property>
(8) Define the JournalNodes (in this lab they share hosts with the ZK nodes)
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://172.25.21.22:8485;172.25.21.23:8485;172.25.21.24:8485/masters</value>
</property>
(9) The path where the JournalNodes store their data on local disk, also under /tmp
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/tmp/journaldata</value>
</property>
(10) Enable automatic failover of the NameNode
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
(11) The proxy provider clients use to locate the active NameNode during failover
<property>
<name>dfs.client.failover.proxy.provider.masters</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
(12) Fencing methods (one method per line; they are tried in order)
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
(13) The ssh private key used for passwordless fencing
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
(14) Connect timeout of the ssh fencing method
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
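The fencing list in item (12) is tried top to bottom, and fencing succeeds as soon as one method reports success; `shell(/bin/true)` always succeeds, which is why it is commonly placed last as a fallback for when sshfence cannot reach an already-dead host. A toy model of that semantics (illustrative Python, not Hadoop code):

```python
# Toy model of dfs.ha.fencing.methods: methods are tried in order, and
# fencing succeeds as soon as one method returns True.
def fence(methods):
    for name, method in methods:
        if method():
            return name          # this method fenced the old active NN
    return None                  # all methods failed -> no failover

sshfence = lambda: False         # e.g. the host is down, so ssh fails
always_true = lambda: True       # shell(/bin/true)
print(fence([("sshfence", sshfence), ("shell(/bin/true)", always_true)]))
# → shell(/bin/true)
```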
The complete file:
[hadoop@server21 hadoop]$ vim hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>masters</value>
</property>
<property>
<name>dfs.ha.namenodes.masters</name>
<value>h1,h2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.masters.h1</name>
<value>172.25.21.21:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.masters.h1</name>
<value>172.25.21.21:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.masters.h2</name>
<value>172.25.21.25:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.masters.h2</name>
<value>172.25.21.25:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://172.25.21.22:8485;172.25.21.23:8485;172.25.21.24:8485/masters</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/tmp/journaldata</value></property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.masters</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
- Start the JournalNodes
Note the startup order:
first the ZK ensemble (already started above),
then the JournalNodes (when starting hdfs for the first time, the journalnode daemons must be running before the NameNode is formatted).
[hadoop@server22 ~]$ cd hadoop/
[hadoop@server22 hadoop]$ bin/hdfs --daemon start journalnode
[hadoop@server23 ~]$ cd hadoop
[hadoop@server23 hadoop]$ bin/hdfs --daemon start journalnode
[hadoop@server24 ~]$ cd hadoop
[hadoop@server24 hadoop]$ bin/hdfs --daemon start journalnode
[hadoop@server24 hadoop]$ jps
15781 JournalNode
15799 Jps
15704 QuorumPeerMain
- Then bring up the hdfs cluster
(1) Format the NameNode first. (Note: right after the JournalNodes start, their connections can be slow, which may make the format fail; running the format a few more times usually succeeds. But be careful: if the formats get out of step, the NameNode's cluster ID can end up inconsistent with the ID recorded by the DataNodes.)
[hadoop@server21 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server21 hadoop]$ bin/hdfs namenode -format
(2) Copy server1's data to server5
[hadoop@server21 hadoop]$ scp -r /tmp/hadoop-hadoop 172.25.21.25:/tmp
The authenticity of host '172.25.21.25 (172.25.21.25)' can't be established.
ECDSA key fingerprint is SHA256:LVWn1150KPf3FC+madPS/iM7eKo3b88fRxvOHhGk7fM.
ECDSA key fingerprint is MD5:dd:77:bf:39:3f:b3:db:db:e4:9c:6b:27:a0:4e:97:63.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.25.21.25' (ECDSA) to the list of known hosts.
VERSION 100% 216 128.9KB/s 00:00
seen_txid 100% 2 3.2KB/s 00:00
fsimage_0000000000000000000.md5 100% 62 98.1KB/s 00:00
fsimage_0000000000000000000 100% 401 567.8KB/s 00:00
[hadoop@server21 hadoop]$ ssh server25
(3) Format ZK
(zkfc is the failover controller)
[hadoop@server21 hadoop]$ bin/hdfs zkfc -formatZK
(4) server23 is the ZK leader.
Through this script you can see which node (server1 or server5) currently holds the lock and is therefore the current master.
[hadoop@server23 zookeeper-3.4.9]$ bin/zkCli.sh
(5) Start hdfs
Note: once HA is enabled, the former SecondaryNameNode role is retired (the standby NameNode takes over its checkpointing duty).
[hadoop@server21 hadoop]$ sbin/start-dfs.sh
Starting namenodes on [server21 server25]
Starting datanodes
Starting journal nodes [server22 server24 server23]
server22: journalnode is running as process 15833. Stop it first.
server24: journalnode is running as process 15781. Stop it first.
server23: journalnode is running as process 15546. Stop it first.
Starting ZK Failover Controllers on NN hosts [server21 server25]
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
20743 Jps
20334 NameNode
As noted above, the SecondaryNameNode no longer runs; on server25:
[hadoop@server25 ~]$ jps
4742 NameNode
4809 DFSZKFailoverController
4894 Jps
Right now server21 is active and server25 is standby.
Simulating a failure
- First upload some files so the failover can be verified
[hadoop@server21 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server21 hadoop]$ bin/hdfs dfs -mkdir /user
[hadoop@server21 hadoop]$ bin/hdfs dfs -mkdir /user/hadoop
[hadoop@server21 hadoop]$ bin/hdfs dfs -put demo
2021-04-24 17:14:08,903 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2021-04-24 17:14:13,543 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
- After killing the NameNode, the previously uploaded data is still reachable, because server5 has taken over
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
21237 Jps
20334 NameNode
[hadoop@server21 hadoop]$ kill 20334
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
21256 Jps
[hadoop@server21 hadoop]$ bin/hdfs dfs -ls
Found 1 items
-rw-r--r-- 3 hadoop supergroup 209715200 2021-04-24 17:14 demo
[hadoop@server25 ~]$ jps
15684 Jps
4742 NameNode
4809 DFSZKFailoverController
- On the ZK ensemble's leader you can see that the current master is server5
- Recover server21
[hadoop@server21 hadoop]$ bin/hdfs --daemon start namenode
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
21429 Jps
21383 NameNode
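The failover just demonstrated can be summarized in a toy model (illustrative Python; the class and election rule are simplifications, not Hadoop's actual ZKFC): the ZKFCs watch an "active" lock in ZK, and when the active NameNode dies, a standby grabs the lock and is promoted.

```python
# Toy sketch of automatic NN failover via a ZK-style lock.
class Ensemble:
    def __init__(self, nodes):
        self.alive = set(nodes)
        self.active = nodes[0]           # first to grab the lock wins

    def kill(self, node):
        self.alive.discard(node)
        if node == self.active:          # lock released -> a standby takes over
            self.active = sorted(self.alive)[0] if self.alive else None

cluster = Ensemble(["server21", "server25"])
cluster.kill("server21")
print(cluster.active)  # → server25
```

This mirrors the transcript: killing the NameNode on server21 left DFSZKFailoverController running, and server25 became active; restarting server21's NameNode rejoins it as a standby.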
7. High availability of the ResourceManager (YARN RM)
- Edit yarn-site.xml
(1) Enable RM high availability
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
(2) Define the cluster id
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>RM_CLUSTER</value>
</property>
(3) Define the RM node ids
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
(4) Specify RM1's address
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>172.25.21.21</value>
</property>
(5) Specify RM2's address
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>172.25.21.25</value>
</property>
(6) Enable automatic recovery of RM state
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
(7) Configure how the RM state is stored (in memory or in ZK; here the ZK state store is used)
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
(8) Point the RM at the ZK ensemble
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>172.25.21.22:2181,172.25.21.23:2181,172.25.21.24:2181</value>
</property>
The complete file:
[hadoop@server21 hadoop]$ pwd
/home/hadoop/hadoop/etc/hadoop
[hadoop@server21 hadoop]$ vim yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>RM_CLUSTER</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>172.25.21.21</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>172.25.21.25</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>172.25.21.22:2181,172.25.21.23:2181,172.25.21.24:2181</value>
</property>
</configuration>
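What `yarn.resourcemanager.recovery.enabled` plus the ZK state store buys you can be shown with a toy model (illustrative Python; the dict standing in for ZKRMStateStore and all names here are invented): running-application state is persisted externally, so a restarted or failed-over RM can rebuild its view instead of losing every job.

```python
# Toy model of RM recovery: persist application state to an external store
# (a dict stands in for the ZK state store), then rebuild it on restart.
state_store = {}                        # stands in for ZKRMStateStore

class RM:
    def __init__(self):
        # on startup, recover whatever the previous RM persisted
        self.apps = dict(state_store)

    def submit(self, app_id, info):
        self.apps[app_id] = info
        state_store[app_id] = info      # persist before acknowledging

rm1 = RM()
rm1.submit("app_001", "wordcount")
rm2 = RM()                              # simulated RM restart / failover
print(rm2.apps)  # → {'app_001': 'wordcount'}
```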
- Start the YARN service;
server21 and server25 each start a ResourceManager;
server22~24 each start a NodeManager
[hadoop@server21 hadoop]$ sbin/start-yarn.sh
Starting resourcemanagers on [ 172.25.21.21 172.25.21.25]
Starting nodemanagers
[hadoop@server21 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
21859 ResourceManager
22180 Jps
21383 NameNode
[hadoop@server24 hadoop]$ jps
16225 NodeManager
16321 Jps
15781 JournalNode
15893 DataNode
15704 QuorumPeerMain
- Through the script on server23 you can see that rm1 (server21) is currently the master (the state can also be checked with `bin/yarn rmadmin -getServiceState rm1`)
Simulating a failure
- Kill the RM; server25 takes over
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
21859 ResourceManager
22213 Jps
21383 NameNode
[hadoop@server21 hadoop]$ kill 21859
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
21383 NameNode
22231 Jps
- Restart the RM; server21 comes back in the standby state
[hadoop@server21 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server21 hadoop]$ bin/yarn --daemon start resourcemanager
[hadoop@server21 hadoop]$ jps
20690 DFSZKFailoverController
22356 Jps
22310 ResourceManager
21383 NameNode
8. HBase distributed deployment
- Install
[hadoop@server21 ~]$ tar zxf hbase-1.2.4-bin.tar.gz
- Edit the main environment file (HBASE_MANAGES_ZK=false makes HBase use the external ZooKeeper ensemble instead of starting its own)
[hadoop@server21 ~]$ cd hbase-1.2.4/
[hadoop@server21 hbase-1.2.4]$ cd conf/
[hadoop@server21 conf]$ vim hbase-env.sh
export JAVA_HOME=/home/hadoop/java
export HADOOP_HOME=/home/hadoop/hadoop
export HBASE_MANAGES_ZK=false
- Edit hbase-site.xml
[hadoop@server21 conf]$ vim hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://masters/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>172.25.21.22,172.25.21.23,172.25.21.24</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>hbase.master</name>
<value>h1</value>
</property>
</configuration>
- List the region server nodes
[hadoop@server21 conf]$ vim regionservers
server23
server24
server22
- Start HBase on server25, which becomes the primary master
[hadoop@server25 hbase-1.2.4]$ pwd
/home/hadoop/hbase-1.2.4
[hadoop@server25 hbase-1.2.4]$ bin/start-hbase.sh
starting master, logging to /home/hadoop/hbase-1.2.4/bin/../logs/hbase-hadoop-master-server25.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
server24: starting regionserver, logging to /home/hadoop/hbase-1.2.4/bin/../logs/hbase-hadoop-regionserver-server24.out
server23: starting regionserver, logging to /home/hadoop/hbase-1.2.4/bin/../logs/hbase-hadoop-regionserver-server23.out
server22: starting regionserver, logging to /home/hadoop/hbase-1.2.4/bin/../logs/hbase-hadoop-regionserver-server22.out
[hadoop@server25 hbase-1.2.4]$ jps
10135 DFSZKFailoverController
12013 HMaster
12302 Jps
10527 NameNode
11519 ResourceManager
- server21 acts as the backup master
[hadoop@server25 hbase-1.2.4]$ bin/hbase-daemon.sh start master
starting master, logging to /home/hadoop/hbase-1.2.4/bin/../logs/hbase-hadoop-master-server25.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
[hadoop@server25 hbase-1.2.4]$ jps
16256 Jps
14663 NameNode
14779 DFSZKFailoverController
16141 HMaster
15023 ResourceManager
Simulating a failure
- Kill the HMaster: server25 goes down and server21 takes over
[hadoop@server25 hbase-1.2.4]$ jps
14663 NameNode
14779 DFSZKFailoverController
16427 Jps
16141 HMaster
15023 ResourceManager
[hadoop@server25 hbase-1.2.4]$ kill 16141
- Recovery: server25 rejoins as the backup master
[hadoop@server25 hbase-1.2.4]$ bin/start-hbase.sh
starting master, logging to /home/hadoop/hbase-1.2.4/bin/../logs/hbase-hadoop-master-server25.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
server23: regionserver running as process 16297. Stop it first.
server24: regionserver running as process 16611. Stop it first.
server22: regionserver running as process 5558. Stop it first.
[hadoop@server25 hbase-1.2.4]$ jps
14663 NameNode
16904 Jps
14779 DFSZKFailoverController
16750 HMaster
15023 ResourceManager