[Docker+HBase+Hadoop+ZooKeeper] Fully Distributed Environment Setup

1. First, pull the ready-made custom image

docker pull linfer/hbase:2.5.4

The image's Dockerfile is shown below (if you'd rather skip the build, just pull it as in step 1):

Note that my environment is a Mac with an M1 chip, so the Dockerfile uses the arm64 JDK. If you are on an x86-64 machine, change java-8-openjdk-arm64 in the Dockerfile to java-8-openjdk-amd64 (which means building the image yourself).

If you want to build the image yourself, run the following from the directory containing the Dockerfile (linfer is just an image namespace; any name works, it doesn't need to match this post):

Note the trailing dot at the end of the command: it tells docker build to use the current directory as the build context.

docker build -t linfer/hbase:2.5.4 .

FROM ubuntu:22.04

ENV HADOOP_HOME /opt/hadoop
ENV HBASE_HOME /opt/hbase
ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-arm64

USER root

RUN apt-get update \
&& apt install -y openjdk-8-jdk \
&& apt install -y openssh-server openssh-client \
&& ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa \
&& cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys \
&& chmod 0600 ~/.ssh/authorized_keys \
&& mkdir -p /data/hdfs \
&& mkdir -p /data/hdfs/journal/node/local/data  

# Starting ssh in a RUN step does not persist into the running container,
# but it does create /run/sshd, which sshd needs at runtime
RUN service ssh start

# Hadoop: NameNode web UI, secondary NameNode web UI, DataNode web UI, DataNode transfer
EXPOSE 9870
EXPOSE 9868
EXPOSE 9864
EXPOSE 9866
# YARN ResourceManager web UI and NameNode RPC
EXPOSE 8088
EXPOSE 8020
# HBase: HMaster RPC, HMaster web UI, RegionServer RPC
EXPOSE 16000
EXPOSE 16010
EXPOSE 16020
# sshd, used to drive the cluster from the host
EXPOSE 22


CMD /usr/sbin/sshd -D
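If you are on an x86-64 host, the arch swap described earlier can be scripted. A small sketch of my own (the `to_amd64` helper name is made up, not part of the original setup):

```shell
to_amd64() {
  # Rewrite the arm64 JDK path to its amd64 equivalent (stdin -> stdout)
  sed 's/java-8-openjdk-arm64/java-8-openjdk-amd64/'
}
# Usage, from the directory containing the Dockerfile:
#   to_amd64 < Dockerfile > Dockerfile.amd64
#   docker build -t linfer/hbase:2.5.4 -f Dockerfile.amd64 .
```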

Before doing any of the above, complete these preparations:

1. Have the Hadoop distribution unpacked locally (ideally the same version as the image expects)

2. Have the HBase distribution unpacked locally (again, matching versions are best)

3. On macOS, add the directory to Docker's file-sharing list, otherwise Docker cannot mount those files into the containers
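It's worth sanity-checking that the two directories the compose file will later bind-mount actually exist under /opt. A quick sketch (the `check_dir` helper is my own; the paths assume you unpack Hadoop 3.2.4 and HBase 2.5.4 as this post does):

```shell
check_dir() {
  # Report whether a directory needed by the bind mounts exists on the host
  if [ -d "$1" ]; then echo "ok: $1"; else echo "missing: $1"; fi
}
check_dir /opt/hadoop-3.2.4
check_dir /opt/hbase-2.5.4
```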

With those preparations done, deploy the ZooKeeper environment first. The official ZooKeeper image is already very good, so there is no need for a custom one; just pull it:

docker pull zookeeper:3.7.1-temurin

Create the ZooKeeper cluster with the docker-compose.yml file below, started via docker-compose up -d.

Besides starting the three zk nodes, this file also creates a Docker network named zookeeper-cluster. The HBase and Hadoop containers deployed later must join this same network so they can reach ZooKeeper, which stores the cluster metadata.

version: '3.1'

services:
  zoo1:
    image: zookeeper:3.7.1-temurin
    container_name: zoo1
    restart: always
    hostname: zoo1
    ports:
      - 2181:2181
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
    networks:
      zookeeper-cluster:
        ipv4_address: 10.10.1.10

  zoo2:
    image: zookeeper:3.7.1-temurin
    container_name: zoo2
    restart: always
    hostname: zoo2
    ports:
      - 2182:2181
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
    networks:
      zookeeper-cluster:
        ipv4_address: 10.10.1.11

  zoo3:
    image: zookeeper:3.7.1-temurin
    container_name: zoo3
    restart: always
    hostname: zoo3
    ports:
      - 2183:2181
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
    networks:
      zookeeper-cluster:
        ipv4_address: 10.10.1.12

networks:
  zookeeper-cluster:
    name: zookeeper-cluster
    ipam:
      config:
        - subnet: "10.10.1.0/24"

Once it is up, you can check the ensemble's state in the Docker dashboard (or with docker ps).
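You can also ask each node for its role directly. A sketch, assuming the official image keeps `zkServer.sh` on the PATH (it does in recent tags); a healthy three-node ensemble reports one leader and two followers:

```shell
ZK_NODES="zoo1 zoo2 zoo3"
for c in $ZK_NODES; do
  echo "--- $c ---"
  # "Mode: leader" or "Mode: follower" in the output is what you want to see
  docker exec "$c" zkServer.sh status || true
done
```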

Next, bring up the HBase + Hadoop containers from the image prepared earlier.

Important: before starting, make sure the unpacked hadoop and hbase trees already exist under /opt/ on your local machine. The compose file bind-mounts those host paths into every container, so a config file edited locally is instantly reflected inside all of them.

Next, start the cluster using its docker-compose.yml:

docker-compose up -d

The docker-compose.yml contents:

version: '3'
 
services: 
  hadoop-master1: 
    image: linfer/hbase:2.5.4
    container_name: hadoop-master1
    hostname: hadoop-master1
    stdin_open: true
    tty: true
    command: 
      - sh 
      - -c 
      - | 
        /usr/sbin/sshd -D 
    volumes:
      - type: bind
        source: /opt/hadoop-3.2.4
        target: /opt/hadoop
      - type: bind
        source: /opt/hbase-2.5.4
        target: /opt/hbase
    ports: 
      - "8020:8020"
      - "8042:8042"
      - "9870:9870"
      - "8088:8088"
      - "8032:8032"
      - "10020:10020"
      - "16000:16000"
      - "16010:16010"
    networks: 
      zookeeper-cluster:
        ipv4_address: 10.10.1.20
  hadoop-master2: 
    image: linfer/hbase:2.5.4
    container_name: hadoop-master2
    hostname: hadoop-master2
    stdin_open: true
    tty: true
    command: 
      - sh 
      - -c 
      - | 
        /usr/sbin/sshd -D 
    volumes:
      - type: bind
        source: /opt/hadoop-3.2.4
        target: /opt/hadoop
      - type: bind
        source: /opt/hbase-2.5.4
        target: /opt/hbase
    ports: 
      - "28020:8020"
      - "18042:8042"
      - "29870:9870"
      - "28088:8088"
      - "28032:8032"
      - "20020:10020"
    networks:
      zookeeper-cluster:
        ipv4_address: 10.10.1.21
  hadoop-master3: 
    image: linfer/hbase:2.5.4
    container_name: hadoop-master3
    hostname: hadoop-master3
    stdin_open: true
    tty: true
    command: 
      - sh 
      - -c 
      - | 
        /usr/sbin/sshd -D 
    volumes:
      - type: bind
        source: /opt/hadoop-3.2.4
        target: /opt/hadoop
      - type: bind
        source: /opt/hbase-2.5.4
        target: /opt/hbase
    ports: 
      - "38020:8020"
      - "28042:8042"
      - "39870:9870"
      - "38088:8088"
      - "38032:8032"
      - "30020:10020"
    networks: 
      zookeeper-cluster:
        ipv4_address: 10.10.1.22
  hadoop-worker1: 
    image: linfer/hbase:2.5.4
    container_name: hadoop-worker1
    hostname: hadoop-worker1
    stdin_open: true
    tty: true
    command: 
      - sh 
      - -c 
      - | 
        /usr/sbin/sshd -D 
    volumes:
      - type: bind
        source: /opt/hadoop-3.2.4
        target: /opt/hadoop
      - type: bind
        source: /opt/hbase-2.5.4
        target: /opt/hbase
    ports: 
      - "9867:9867"
      - "38042:8042"
      - "9866:9866"
      - "9865:9865"
      - "9864:9864"
    networks: 
      zookeeper-cluster:
        ipv4_address: 10.10.1.23
  hadoop-worker2: 
    image: linfer/hbase:2.5.4
    container_name: hadoop-worker2
    hostname: hadoop-worker2
    stdin_open: true
    tty: true
    command: 
      - sh 
      - -c 
      - |
        /usr/sbin/sshd -D 
    volumes:
      - type: bind
        source: /opt/hadoop-3.2.4
        target: /opt/hadoop
      - type: bind
        source: /opt/hbase-2.5.4
        target: /opt/hbase
    ports: 
      - "29867:9867"
      - "48042:8042"
      - "29866:9866"
      - "29865:9865"
      - "29864:9864"
    networks: 
      zookeeper-cluster:
        ipv4_address: 10.10.1.24
  hadoop-worker3: 
    image: linfer/hbase:2.5.4
    container_name: hadoop-worker3
    hostname: hadoop-worker3
    stdin_open: true
    tty: true
    command: 
      - sh 
      - -c 
      - | 
        /usr/sbin/sshd -D 
    volumes:
      - type: bind
        source: /opt/hadoop-3.2.4
        target: /opt/hadoop
      - type: bind
        source: /opt/hbase-2.5.4
        target: /opt/hbase
    ports: 
      - "39867:9867"
      - "58042:8042"
      - "39866:9866"
      - "39865:9865"
      - "39864:9864"
    networks: 
      zookeeper-cluster:
        ipv4_address: 10.10.1.25
 
networks:
  zookeeper-cluster:
    external: true

Once the containers are up, edit Hadoop's configuration files. As mentioned above, the directories are bind-mounted, so editing them locally on the host is all that's needed.

a. Edit Hadoop's configuration files (5 files in total)

The configs are fairly long, so I have uploaded them to a cloud drive; download them and replace the corresponding files' contents directly.

Link: https://pan.baidu.com/s/1w0_vFeO8PDsmNla1OzcVRQ?pwd=linf  Extraction code: linf
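For orientation, those five files set up a standard ZooKeeper-based HDFS HA configuration. A sketch of the key properties follows; the property names are standard Hadoop, but the nameservice name `mycluster`, the NameNode IDs, and the exact values are my assumptions, so treat the downloaded files as the source of truth:

```xml
<!-- core-site.xml (sketch): clients address the HA nameservice, not one NameNode -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zoo1:2181,zoo2:2181,zoo3:2181</value>
</property>

<!-- hdfs-site.xml (sketch): three NameNodes sharing edits through JournalNodes -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2,nn3</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop-master1:8485;hadoop-master2:8485;hadoop-master3:8485/mycluster</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```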

After updating the configs, initialize HDFS with the script below.

If you want, the scripts are also on the drive (setup.sh for first-time initialization, restart.sh for re-initializing after a restart):

Link: https://pan.baidu.com/s/1dJWZt8hebtDIEimfdui-Fg?pwd=linf  Extraction code: linf

Don't forget to make the script executable first, then run ./setup.sh:

chmod a+x setup.sh

# Pre-accept each master's SSH host key so the start/stop scripts can run non-interactively
docker exec hadoop-master1 ssh -o StrictHostKeyChecking=no hadoop-master2
docker exec hadoop-master1 ssh -o StrictHostKeyChecking=no hadoop-master3
docker exec hadoop-master2 ssh -o StrictHostKeyChecking=no hadoop-master1
docker exec hadoop-master2 ssh -o StrictHostKeyChecking=no hadoop-master3
docker exec hadoop-master3 ssh -o StrictHostKeyChecking=no hadoop-master1
docker exec hadoop-master3 ssh -o StrictHostKeyChecking=no hadoop-master2
 

echo "starting all journalnode..."
docker exec hadoop-master1 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-master2 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-master3 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-worker1 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-worker2 /opt/hadoop/bin/hdfs --daemon start journalnode
docker exec hadoop-worker3 /opt/hadoop/bin/hdfs --daemon start journalnode 
 
# Format the first NameNode and bring it up
docker exec hadoop-master1 /opt/hadoop/bin/hdfs namenode -format
docker exec hadoop-master1 /opt/hadoop/bin/hdfs --daemon start namenode

# Sync the standby NameNodes from the formatted one, then start them
docker exec hadoop-master2 /opt/hadoop/bin/hdfs namenode -bootstrapStandby
docker exec hadoop-master2 /opt/hadoop/bin/hdfs --daemon start namenode
docker exec hadoop-master3 /opt/hadoop/bin/hdfs namenode -bootstrapStandby
docker exec hadoop-master3 /opt/hadoop/bin/hdfs --daemon start namenode

# Stop HDFS so the whole stack can be brought up cleanly below
docker exec hadoop-master1 /opt/hadoop/sbin/stop-dfs.sh

# If the ZooKeeper data has been wiped, uncomment and run the next line; otherwise leave it commented
#docker exec hadoop-master1 /opt/hadoop/bin/hdfs zkfc -formatZK
 
echo "starting zkfc..."
docker exec hadoop-master1 /opt/hadoop/bin/hdfs --daemon start zkfc
echo "starting dfs..."
docker exec hadoop-master1 /opt/hadoop/sbin/start-dfs.sh
echo "starting yarn..."
docker exec hadoop-master1 /opt/hadoop/sbin/start-yarn.sh
echo "Done!"

If the script finishes without errors, open localhost:9870, localhost:29870, or localhost:39870 in a browser to see the HDFS web UI (one per NameNode, matching the port mappings in the compose file).


Then open any of localhost:8088, localhost:28088, or localhost:38088 to see the YARN monitoring page.

If both pages load and show a healthy state, the Hadoop deployment is fine and you can move on to the next steps.
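The same health information is available from the command line. A sketch using standard Hadoop 3 admin commands (`dfsadmin -report`, `haadmin -getAllServiceState`, `yarn node -list` all exist in 3.2.x; run them in whichever master you like):

```shell
NN=hadoop-master1
# Overall HDFS health: live DataNodes, capacity, missing blocks
docker exec "$NN" /opt/hadoop/bin/hdfs dfsadmin -report || true
# HA state of every NameNode; expect one active and two standby
docker exec "$NN" /opt/hadoop/bin/hdfs haadmin -getAllServiceState || true
# YARN side: the three workers should be listed as RUNNING NodeManagers
docker exec "$NN" /opt/hadoop/bin/yarn node -list || true
```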

b. Edit HBase's configuration files:

1. In /opt/hbase/conf, edit hbase-env.sh:
uncomment the export JAVA_HOME line and point it at the correct JDK path. The result should look like:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64  #on x86-64 machines, use java-8-openjdk-amd64
export HBASE_MANAGES_ZK=false  #make sure the leading # is removed and the value is false (we run our own zk)
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"  #make sure the leading # is removed and the value is true
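Because the conf directory is bind-mounted, these three settings can be applied from the host in one step. A sketch (the `append_hbase_env` helper is my own invention; appending works because in a sourced shell file later definitions override the commented-out defaults above them):

```shell
append_hbase_env() {
  # Append the three overrides from step b.1 to the given hbase-env.sh
  cat >> "$1" <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64
export HBASE_MANAGES_ZK=false
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
EOF
}
# Usage, on the host (visible in every container via the bind mount):
#   append_hbase_env /opt/hbase-2.5.4/conf/hbase-env.sh
```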

2. Edit hbase-site.xml, replacing everything inside its <configuration> element with the following:
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://hadoop-master1:8020/hbase</value>
</property>
<property>
  <name>hbase.tmp.dir</name>
  <value>./tmp</value>
</property>
<property>
  <name>hbase.unsafe.stream.capability.enforce</name>
  <value>false</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zoo1,zoo2,zoo3</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/usr/local/zookeeper</value>
</property>

3. Change the regionservers file to:
hadoop-master2
hadoop-master3

4. Create a file named backup-masters (it does not exist by default) containing:
hadoop-master2
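Both files can be written from the host in one go; a small sketch (`write_hbase_topology` is a made-up helper name, and the path assumes the bind-mounted /opt/hbase-2.5.4 layout used above):

```shell
write_hbase_topology() {
  # $1 = HBase conf dir; writes the regionservers list (step 3)
  # and the backup-masters file (step 4)
  printf 'hadoop-master2\nhadoop-master3\n' > "$1/regionservers"
  printf 'hadoop-master2\n' > "$1/backup-masters"
}
# Usage: write_hbase_topology /opt/hbase-2.5.4/conf
```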


5. Enter the hadoop-master1 container:
docker exec -it hadoop-master1 bash

6. Start HBase:
/opt/hbase/bin/start-hbase.sh
When it finishes, the jps command should show an HMaster process.

7. Open the HBase shell:
/opt/hbase/bin/hbase shell
Once in, run the status command to check the cluster's state; you should see something like:
1 active master, 1 backup masters, 2 servers, 0 dead, 0.0000 average load

If your output matches the line above, the HBase environment is fully set up.
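As a final smoke test, you can round-trip one cell through the cluster from the host. A sketch, assuming HBase 2's non-interactive shell flag `-n` (the table name is arbitrary):

```shell
TABLE=smoke_test
# Create a table, write one cell, read it back, then clean up
docker exec -i hadoop-master1 /opt/hbase/bin/hbase shell -n <<EOF || true
create '$TABLE', 'cf'
put '$TABLE', 'r1', 'cf:c1', 'v1'
scan '$TABLE'
disable '$TABLE'
drop '$TABLE'
EOF
```

The scan should print row r1 with value v1 before the table is dropped.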

To wrap up, here is an HBase study guide you can use for practice and testing:

Link: https://pan.baidu.com/s/1uUO94ii769h3nnVI6JuQzg?pwd=linf  Extraction code: linf

Good luck with the deployment. If you run into problems, leave a comment and we can work through them together~
