1. First, pull the prebuilt custom image:
docker pull linfer/hbase:2.5.4
The Dockerfile for this image is shown below [if you want to skip the hassle, just pull it directly as in step 1].
Note, however, that my environment is a Mac with an M1 chip, so the Dockerfile uses the arm64 build of the JDK; if you are on an x86_64 machine, change java-8-openjdk-arm64 in the Dockerfile to java-8-openjdk-amd64 (which requires building the image yourself).
If you want to build and deploy a custom image, use the docker command to build it.
First cd into the directory containing the Dockerfile and run the following (linfer is a user name; you can use anything you like, it does not have to match this post):
Note the trailing `.` at the end of the command, which means "the current directory".
docker build -t linfer/hbase:2.5.4 .
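If the build succeeds, the image should be visible locally; a quick check (assuming you kept the linfer/hbase name):
docker images linfer/hbase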
FROM ubuntu:22.04
ENV HADOOP_HOME /opt/hadoop
ENV HBASE_HOME /opt/hbase
ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-arm64
USER root
# install JDK 8 and OpenSSH, set up passwordless SSH for root, and create the HDFS data dirs
RUN apt-get update \
&& apt-get install -y openjdk-8-jdk \
&& apt-get install -y openssh-server openssh-client \
&& ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa \
&& cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys \
&& chmod 0600 ~/.ssh/authorized_keys \
&& mkdir -p /data/hdfs \
&& mkdir -p /data/hdfs/journal/node/local/data
# sshd needs /run/sshd to exist at runtime; create it here (a `RUN service ssh start` would only start sshd during the build, not in the final image)
RUN mkdir -p /run/sshd
EXPOSE 9870
EXPOSE 9868
EXPOSE 9864
EXPOSE 9866
EXPOSE 8088
EXPOSE 8020
EXPOSE 16000
EXPOSE 16010
EXPOSE 16020
EXPOSE 22
CMD /usr/sbin/sshd -D
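If you built the image yourself, a quick sanity check that the arm64/amd64 JDK actually landed where JAVA_HOME points (a sketch; substitute your own image tag if it differs):
docker run --rm linfer/hbase:2.5.4 java -version
docker run --rm linfer/hbase:2.5.4 ls /usr/lib/jvm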
Before doing any of the above, complete these preparations:
1. Have the Hadoop distribution ready locally [versions should ideally match]
2. Have the HBase distribution ready locally [versions should ideally match]
3. On macOS, configure Docker's shared directories [Docker can only bind-mount files into containers from shared directories]
With that done, deploy the ZooKeeper environment first. The official Docker image for ZooKeeper is already very good, so there is no need for a custom image; just pull it:
docker pull zookeeper:3.7.1-temurin
Create the ZooKeeper cluster with the docker-compose.yml file below, brought up via the docker-compose up -d command.
Besides starting the ZooKeeper cluster, this file also creates a Docker network named zookeeper-cluster; the HBase and Hadoop deployments later must join this same network so they can reach ZooKeeper, which stores the cluster metadata.
version: '3.1'
services:
zoo1:
image: zookeeper:3.7.1-temurin
container_name: zoo1
restart: always
hostname: zoo1
ports:
- 2181:2181
environment:
ZOO_MY_ID: 1
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.10
zoo2:
image: zookeeper:3.7.1-temurin
container_name: zoo2
restart: always
hostname: zoo2
ports:
- 2182:2181
environment:
ZOO_MY_ID: 2
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.11
zoo3:
image: zookeeper:3.7.1-temurin
container_name: zoo3
restart: always
hostname: zoo3
ports:
- 2183:2181
environment:
ZOO_MY_ID: 3
ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.12
networks:
zookeeper-cluster:
name: zookeeper-cluster
ipam:
config:
- subnet: "10.10.1.0/24"
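After docker-compose up -d, you can also verify each node from the command line (zkServer.sh is on the PATH in the official image); one node should report Mode: leader and the other two Mode: follower:
docker exec zoo1 zkServer.sh status
docker exec zoo2 zkServer.sh status
docker exec zoo3 zkServer.sh status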
Once they start successfully, you can see the state of the ZooKeeper cluster in the Docker GUI.
Next we can build up the HBase + Hadoop cluster itself.
Important: before bringing it up, make sure the extracted Hadoop and HBase directories mentioned earlier already exist under /opt/ on your local machine, because the cluster definition bind-mounts these local paths into the containers; as a result, edits made locally are synchronized into the containers automatically.
Next, bring up the cluster from its docker-compose.yml:
docker-compose up -d
The docker-compose.yml contents are as follows:
version: '3'
services:
hadoop-master1:
image: linfer/hbase:2.5.4
container_name: hadoop-master1
hostname: hadoop-master1
stdin_open: true
tty: true
command:
- sh
- -c
- |
/usr/sbin/sshd -D
volumes:
- type: bind
source: /opt/hadoop-3.2.4
target: /opt/hadoop
- type: bind
source: /opt/hbase-2.5.4
target: /opt/hbase
ports:
- "8020:8020"
- "8042:8042"
- "9870:9870"
- "8088:8088"
- "8032:8032"
- "10020:10020"
- "16000:16000"
- "16010:16010"
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.20
hadoop-master2:
image: linfer/hbase:2.5.4
container_name: hadoop-master2
hostname: hadoop-master2
stdin_open: true
tty: true
command:
- sh
- -c
- |
/usr/sbin/sshd -D
volumes:
- type: bind
source: /opt/hadoop-3.2.4
target: /opt/hadoop
- type: bind
source: /opt/hbase-2.5.4
target: /opt/hbase
ports:
- "28020:8020"
- "18042:8042"
- "29870:9870"
- "28088:8088"
- "28032:8032"
- "20020:10020"
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.21
hadoop-master3:
image: linfer/hbase:2.5.4
container_name: hadoop-master3
hostname: hadoop-master3
stdin_open: true
tty: true
command:
- sh
- -c
- |
/usr/sbin/sshd -D
volumes:
- type: bind
source: /opt/hadoop-3.2.4
target: /opt/hadoop
- type: bind
source: /opt/hbase-2.5.4
target: /opt/hbase
ports:
- "38020:8020"
- "28042:8042"
- "39870:9870"
- "38088:8088"
- "38032:8032"
- "30020:10020"
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.22
hadoop-worker1:
image: linfer/hbase:2.5.4
container_name: hadoop-worker1
hostname: hadoop-worker1
stdin_open: true
tty: true
command:
- sh
- -c
- |
/usr/sbin/sshd -D
volumes:
- type: bind
source: /opt/hadoop-3.2.4
target: /opt/hadoop
- type: bind
source: /opt/hbase-2.5.4
target: /opt/hbase
ports:
- "9867:9867"
- "38042:8042"
- "9866:9866"
- "9865:9865"
- "9864:9864"
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.23
hadoop-worker2:
image: linfer/hbase:2.5.4
container_name: hadoop-worker2
hostname: hadoop-worker2
stdin_open: true
tty: true
command:
- sh
- -c
- |
/usr/sbin/sshd -D
volumes:
- type: bind
source: /opt/hadoop-3.2.4
target: /opt/hadoop
- type: bind
source: /opt/hbase-2.5.4
target: /opt/hbase
ports:
- "29867:9867"
- "48042:8042"
- "29866:9866"
- "29865:9865"
- "29864:9864"
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.24
hadoop-worker3:
image: linfer/hbase:2.5.4
container_name: hadoop-worker3
hostname: hadoop-worker3
stdin_open: true
tty: true
command:
- sh
- -c
- |
/usr/sbin/sshd -D
volumes:
- type: bind
source: /opt/hadoop-3.2.4
target: /opt/hadoop
- type: bind
source: /opt/hbase-2.5.4
target: /opt/hbase
ports:
- "39867:9867"
- "58042:8042"
- "39866:9866"
- "39865:9865"
- "39864:9864"
networks:
zookeeper-cluster:
ipv4_address: 10.10.1.25
networks:
zookeeper-cluster:
external: true
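After docker-compose up -d, it is worth confirming that all six containers are running before touching any configuration; for example:
docker ps --filter name=hadoop --format '{{.Names}}: {{.Status}}'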
Once the containers are up, edit Hadoop's configuration files. As mentioned above, the directories are bind-mounted, so you only need to edit the local copies and the changes take effect inside the containers.
a. Edit Hadoop's configuration files [5 files in total]
The configuration contents are fairly long, so I have uploaded them to a cloud drive; download them and replace the corresponding files directly.
链接: https://pan.baidu.com/s/1w0_vFeO8PDsmNla1OzcVRQ?pwd=linf 提取码: linf
After updating the configuration files, initialize Hadoop HDFS with the script below:
The scripts are also on the drive if you want them [including setup.sh for first-time initialization and restart.sh for re-initialization after a restart]:
链接: https://pan.baidu.com/s/1dJWZt8hebtDIEimfdui-Fg?pwd=linf 提取码: linf
Don't forget to add execute permission before use, then run it with ./setup.sh
chmod a+x setup.sh
setup.sh:
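# record each master's SSH host key in known_hosts so the Hadoop start scripts below can ssh between masters without interactive prompts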
docker exec hadoop-master1 ssh -o StrictHostKeyChecking=no hadoop-master2
docker exec hadoop-master1 ssh -o StrictHostKeyChecking=no hadoop-master3
docker exec hadoop-master2 ssh -o StrictHostKeyChecking=no hadoop-master1
docker exec hadoop-master2 ssh -o StrictHostKeyChecking=no hadoop-master3
docker exec hadoop-master3 ssh -o StrictHostKeyChecking=no hadoop-master1
docker exec hadoop-master3 ssh -o StrictHostKeyChecking=no hadoop-master2
echo "starting all journalnode..."
docker exec hadoop-master1 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-master2 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-master3 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-worker1 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-worker2 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-worker3 /opt/hadoop/bin/hdfs --daemon start journalnode
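# format the primary NameNode, start it, then bootstrap the standby NameNodes from it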
docker exec hadoop-master1 /opt/hadoop/bin/hdfs namenode -format
docker exec hadoop-master1 /opt/hadoop/bin/hdfs --daemon start namenode
docker exec hadoop-master2 /opt/hadoop/bin/hdfs namenode -bootstrapStandby
docker exec hadoop-master2 /opt/hadoop/bin/hdfs --daemon start namenode
docker exec hadoop-master3 /opt/hadoop/bin/hdfs namenode -bootstrapStandby
docker exec hadoop-master3 /opt/hadoop/bin/hdfs --daemon start namenode
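# stop the manually started daemons; start-dfs.sh below brings all of HDFS back up in one pass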
docker exec hadoop-master1 /opt/hadoop/sbin/stop-dfs.sh
# On the very first run, or if the ZooKeeper data has been wiped, uncomment the next line to initialize the HA state in ZooKeeper; leave it commented otherwise
#docker exec hadoop-master1 /opt/hadoop/bin/hdfs zkfc -formatZK
echo "starting zkfc..."
docker exec hadoop-master1 /opt/hadoop/bin/hdfs --daemon start zkfc
echo "starting dfs..."
docker exec hadoop-master1 /opt/hadoop/sbin/start-dfs.sh
echo "starting yarn..."
docker exec hadoop-master1 /opt/hadoop/sbin/start-yarn.sh
echo "Done!"
If the script runs without reporting errors, open localhost:9870, localhost:29870, or localhost:39870 in a browser to see the HDFS web UI.
Then open any of localhost:8088, localhost:28088, or localhost:38088 to see the YARN monitoring page.
If both pages render and show a healthy state, the deployment is fine and you can move on to the next steps.
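The NameNode HA state can also be checked from the command line (the nameservice itself is defined in the hdfs-site.xml from the shared configs); one NameNode should report active and the others standby:
docker exec hadoop-master1 /opt/hadoop/bin/hdfs haadmin -getAllServiceState
docker exec hadoop-master1 /opt/hadoop/bin/hdfs dfsadmin -report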
b. Edit HBase's configuration files:
1. Edit hbase-env.sh under /opt/hbase/conf
Uncomment the export JAVA_HOME line and point it at the correct JDK; the result should look like this:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64 # on non-Apple-silicon machines change arm64 to amd64
export HBASE_MANAGES_ZK=false # make sure the leading # is removed and the value is false
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true" # make sure the leading # is removed and the value is true
2. Edit hbase-site.xml, placing the following properties inside its <configuration> element in place of the existing ones:
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-master1:8020/hbase</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>./tmp</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zoo1,zoo2,zoo3</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/zookeeper</value>
</property>
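A side note: hbase.rootdir above points at hadoop-master1 directly, so HBase would lose its root directory if a NameNode failover happened; with HDFS HA it is generally more robust to use the nameservice URI from your core-site.xml instead (for example hdfs://<your-nameservice>/hbase, with the nameservice name taken from the shared config files).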
3. Edit conf/regionservers to contain:
hadoop-master2
hadoop-master3
4. Create a file named backup-masters containing the line below #this file does not exist by default, you need to create it yourself
hadoop-master2
5. Enter the hadoop-master1 container:
docker exec -it hadoop-master1 bash
6. Start HBase:
/opt/hbase/bin/start-hbase.sh
When it finishes, the jps command should show an HMaster process.
7. Log in to the HBase shell:
/opt/hbase/bin/hbase shell
After logging in, run the status command to check the running state of the HBase cluster; you should get output like this:
1 active master, 1 backup masters, 2 servers, 0 dead, 0.0000 average load
If your output matches the line above, the HBase environment has been set up successfully.
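As one more smoke test, you can create, write to, and drop a throwaway table from inside hbase shell (the table name 'test' here is arbitrary):
create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
scan 'test'
disable 'test'
drop 'test'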
To wrap up, here is an HBase study document you can use for practice and testing:
链接: https://pan.baidu.com/s/1uUO94ii769h3nnVI6JuQzg?pwd=linf 提取码: linf
Good luck with your deployment and studies; if you run into problems, leave a comment and we can work through them together~