Hadoop + ZooKeeper + HBase Cluster Setup

I. Environment Preparation

1. Version Selection
ZooKeeper 3.4.12

Download: http://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.12/zookeeper-3.4.12.tar.gz

Hadoop 2.8.3

Download: http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.8.3/hadoop-2.8.3.tar.gz

HBase 2.0.0

Download: http://mirrors.hust.edu.cn/apache/hbase/2.0.0/hbase-2.0.0-bin.tar.gz

2. Machine Configuration: three CentOS 7 virtual machines
2.1 Configure /etc/hosts (on every host)
10.128.1.92 master.cnmy
10.128.1.93 slave1.cnmy
10.128.1.95 slave2.cnmy
2.2 Configure the JDK and NTP (required on every host)
[root@master bin]# java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
Configure the JDK environment variables: vim /etc/profile
export JAVA_HOME=/home/jdk1.8.0_161
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME CLASSPATH PATH
Apply the changes: source /etc/profile
Install NTP:
yum install ntp
After installation, start the service:
systemctl start ntpd.service
Enable it at boot:
systemctl enable ntpd.service
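HBase is sensitive to clock skew, so it is worth confirming that time is actually synchronizing on each host; a quick check, assuming the default CentOS 7 NTP pool servers:

# List the peers ntpd is polling; a leading "*" marks the currently selected time source
ntpq -p

# systemd view of whether the clock is synchronized
timedatectl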

II. Configure Passwordless SSH Between Hosts

1. Generate keys (run on every host)
This is standard ssh-keygen / ssh-copy-id usage; a minimal sketch is shown below.
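A minimal sketch, assuming everything runs as root (as in the rest of this guide) and an RSA key without a passphrase:

# Generate a key pair (run on every host)
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

# Push the public key to every host, including the local one
ssh-copy-id root@master.cnmy
ssh-copy-id root@slave1.cnmy
ssh-copy-id root@slave2.cnmy

# Verify: this should not prompt for a password
ssh root@slave1.cnmy hostname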

III. Setting Up ZooKeeper

1. In the /home directory on the master.cnmy host:
wget http://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.12/zookeeper-3.4.12.tar.gz

tar zxvf zookeeper-3.4.12.tar.gz

2. Deployment
mkdir /home/zookeeper-3.4.12/data
mkdir -p  /home/zookeeper-3.4.12/datalog
cd /home/zookeeper-3.4.12/conf
cp zoo_sample.cfg zoo.cfg
zoo.cfg contents:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/home/zookeeper-3.4.12/data
dataLogDir=/home/zookeeper-3.4.12/datalog
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=master.cnmy:2888:3888
server.2=slave1.cnmy:2888:3888
server.3=slave2.cnmy:2888:3888
Create a myid file in ZooKeeper's data directory: 1 on the master, 2 and 3 on the other hosts (remember to change it after copying; see the sketch below). Then copy the installation to the slave hosts:
scp -r  zookeeper-3.4.12  slave1.cnmy:/home/
scp -r  zookeeper-3.4.12  slave2.cnmy:/home/
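A minimal sketch of creating the myid files once the copy has finished, assuming passwordless root SSH; the numbers must match the server.1/2/3 entries in zoo.cfg:

echo 1 > /home/zookeeper-3.4.12/data/myid                       # on master.cnmy
ssh slave1.cnmy 'echo 2 > /home/zookeeper-3.4.12/data/myid'     # on slave1.cnmy
ssh slave2.cnmy 'echo 3 > /home/zookeeper-3.4.12/data/myid'     # on slave2.cnmy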
On each host, add to /etc/profile:
export ZOOKEEPER_HOME=/home/zookeeper-3.4.12

export PATH=$PATH:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf

Apply the changes: source /etc/profile
3. Start ZooKeeper on every host
zkServer.sh start

root@master:/home# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

4. Common commands
# start
zkServer.sh start

# stop
zkServer.sh stop

# status
zkServer.sh status
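After all three nodes are up, it is worth confirming that a quorum actually formed; a minimal check (which node ends up leader will vary):

# Run on each host: exactly one node should report "Mode: leader", the other two "Mode: follower"
zkServer.sh status

# Or connect with the CLI and list the root znode
zkCli.sh -server master.cnmy:2181,slave1.cnmy:2181,slave2.cnmy:2181 ls /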

IV. Setting Up Hadoop

1. Download
wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.8.3/hadoop-2.8.3.tar.gz
tar zxvf hadoop-2.8.3.tar.gz
2. Configuration
2.1 Create the required directories on every host
mkdir  /home/data
mkdir  /home/data/journal
mkdir  /home/data/tmp
mkdir  /home/data/hdfs
mkdir  /home/data/hdfs/data
mkdir  /home/data/hdfs/name
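The same layout has to exist on all three hosts; a minimal sketch that creates it in one pass over SSH (mkdir -p creates the parent directories as needed):

for h in master.cnmy slave1.cnmy slave2.cnmy; do
    ssh "$h" 'mkdir -p /home/data/journal /home/data/tmp /home/data/hdfs/data /home/data/hdfs/name'
done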

2.2 Configure core-site.xml in /home/hadoop-2.8.3/etc/hadoop
<configuration>
	<!-- Use "ns" as the HDFS nameservice -->
     <property>
          <name>fs.defaultFS</name>
          <value>hdfs://ns</value>
     </property>
     <!-- Directory for Hadoop's temporary data -->
     <property>
          <name>hadoop.tmp.dir</name>
          <value>/home/data/tmp</value>
     </property>
     <property>
          <name>io.file.buffer.size</name>
          <value>4096</value>
     </property>
     <!-- ZooKeeper quorum addresses -->
     <property>
          <name>ha.zookeeper.quorum</name>
          <value>master.cnmy:2181,slave1.cnmy:2181,slave2.cnmy:2181</value>
     </property>
</configuration>
2.3 Configure hdfs-site.xml in /home/hadoop-2.8.3/etc/hadoop
<configuration>
	<!-- The HDFS nameservice "ns"; must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns</value>
    </property>
    <!-- The nameservice ns has two NameNodes: nn1 and nn2 -->
    <property>
       <name>dfs.ha.namenodes.ns</name>
       <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
       <name>dfs.namenode.rpc-address.ns.nn1</name>
       <value>master.cnmy:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn1</name>
        <value>master.cnmy:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn2</name>
        <value>slave1.cnmy:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn2</name>
        <value>slave1.cnmy:50070</value>
    </property>
    <!-- Where the NameNode's shared edit log is stored on the JournalNodes -->
    <property>
         <name>dfs.namenode.shared.edits.dir</name>
         <value>qjournal://master.cnmy;slave1.cnmy;slave2.cnmy/ns</value>
    </property>
    <!-- Where each JournalNode stores data on its local disk -->
    <property>
          <name>dfs.journalnode.edits.dir</name>
          <value>/home/data/journal</value>
    </property>
    <!-- Enable automatic failover when the active NameNode fails -->
    <property>
          <name>dfs.ha.automatic-failover.enabled</name>
          <value>true</value>
    </property>
    <!-- Failover proxy provider used by clients -->
    <property>
            <name>dfs.client.failover.proxy.provider.ns</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing method; with SSH on the default port 22, plain "sshfence" is enough (use sshfence(hadoop:22022) for a non-default port) -->
    <property>
             <name>dfs.ha.fencing.methods</name>
             <!-- <value>sshfence</value> -->
                 <value>
                    sshfence
                    shell(/bin/true)
                </value>
    </property>
    <!-- sshfence requires passwordless SSH -->
    <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/root/.ssh/id_rsa</value>
    </property>
 
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/data/hdfs/name</value>
    </property>
 
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/data/hdfs/data</value>
    </property>
 
    <property>
       <name>dfs.replication</name>
       <value>1</value>
    </property>
    <!-- Enable WebHDFS (REST API) on NameNodes and DataNodes; optional -->
    <property>
       <name>dfs.webhdfs.enabled</name>
       <value>true</value>
    </property>

</configuration>

2.4 Configure mapred-site.xml in /home/hadoop-2.8.3/etc/hadoop (the memory settings below are only needed because these hosts have just 1 GB of RAM; with more memory they can be omitted)

<configuration>
	<property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>
            /home/hadoop-2.8.3/etc/hadoop,
            /home/hadoop-2.8.3/share/hadoop/common/*,
           /home/hadoop-2.8.3/share/hadoop/common/lib/*,
            /home/hadoop-2.8.3/share/hadoop/hdfs/*,
           /home/hadoop-2.8.3/share/hadoop/hdfs/lib/*,
            /home/hadoop-2.8.3/share/hadoop/mapreduce/*,
           /home/hadoop-2.8.3/share/hadoop/mapreduce/lib/*,
           /home/hadoop-2.8.3/share/hadoop/yarn/*,
           /home/hadoop-2.8.3/share/hadoop/yarn/lib/*
        </value>
    </property>
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>512</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx512M</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>512</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx256M</value>
    </property>

</configuration>

2.5 Configure yarn-site.xml in /home/hadoop-2.8.3/etc/hadoop
<configuration>

<!-- Site specific YARN configuration properties -->
	<property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master.cnmy</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property> 
        <description>The address of the RM web application.</description> 
        <name>yarn.resourcemanager.webapp.address</name> 
        <value>master.cnmy:18008</value> 
    </property>
</configuration>
2.6 Create the slaves file in /home/hadoop-2.8.3/etc/hadoop
master.cnmy
slave1.cnmy
slave2.cnmy
2.7 Configure hadoop-env.sh in /home/hadoop-2.8.3/etc/hadoop
export JAVA_HOME=/home/jdk1.8.0_161
export HADOOP_OPTS="$HADOOP_OPTS -Duser.timezone=GMT+08"
2.8 Configure yarn-env.sh in /home/hadoop-2.8.3/etc/hadoop
YARN_OPTS="$YARN_OPTS -Duser.timezone=GMT+08"
Add to /etc/profile:
export HADOOP_HOME=/home/hadoop-2.8.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Apply the changes: source /etc/profile
3. Distribute to the slave hosts

Copy the Hadoop directory to slave1.cnmy and slave2.cnmy:

cd /home
scp -r  hadoop-2.8.3  slave1.cnmy:/home/
scp -r  hadoop-2.8.3  slave2.cnmy:/home/

4. First Startup
# 1. Start ZooKeeper on every node (every host):
zkServer.sh start

# 2. On one NameNode host, initialize the HA state in ZooKeeper (master.cnmy):
hdfs zkfc -formatZK

# 3. Start the JournalNode on every JournalNode host (every host):
hadoop-daemon.sh start journalnode

# 4. On the primary NameNode, format the NameNode and JournalNode directories (master.cnmy):
hdfs namenode -format

# 5. On the primary NameNode, start the NameNode process (master.cnmy):
hadoop-daemon.sh start namenode

# 6. Start the DataNode on every DataNode host (every host, per the slaves file):
hadoop-daemon.sh start datanode

# 7. Start YARN on the ResourceManager host (master.cnmy):
start-yarn.sh

# 8. Start zkfc on both NameNode hosts (master.cnmy and slave1.cnmy); the standby
#    NameNode should already be running, see the sketch after this list:
hadoop-daemon.sh start zkfc

# 9. Start the job history server (master.cnmy):
mr-jobhistory-daemon.sh start historyserver
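The original list goes straight from starting the first NameNode to starting zkfc on both NameNode hosts, but the standby NameNode on slave1.cnmy still has to be synced from the freshly formatted one and started; a minimal sketch of that step (run between steps 5 and 8 above):

# On slave1.cnmy: copy the formatted metadata from the active NameNode, then start the standby
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode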

 
5. Verification

YARN web UI (port configured in yarn-site.xml): http://10.128.1.92:18008/cluster/nodes
(Screenshot: hadoop_start.png)
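Command-line checks, using the nameservice and NameNode IDs configured above (ns, nn1, nn2):

# Each host should show the expected daemons (NameNode/DataNode/JournalNode/DFSZKFailoverController,
# ResourceManager/NodeManager, QuorumPeerMain)
jps

# One NameNode should report "active", the other "standby"
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Live DataNodes and capacity
hdfs dfsadmin -report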

V. Setting Up HBase

1. Download
wget http://mirrors.hust.edu.cn/apache/hbase/2.0.0/hbase-2.0.0-bin.tar.gz
tar -zvxf hbase-2.0.0-bin.tar.gz
2. Configuration
2.1 Configure hbase-env.sh in /home/hbase-2.0.0/conf
export JAVA_HOME=/home/jdk1.8.0_161

export HBASE_CLASSPATH=/home/hadoop-2.8.3/etc/hadoop

export HBASE_MANAGES_ZK=false

export TZ="Asia/Shanghai"

HBASE_MANAGES_ZK=false disables HBase's bundled ZooKeeper, which is only suitable for testing, not production.
HBASE_CLASSPATH must point at Hadoop's configuration directory, otherwise HBase cannot resolve the HDFS nameservice ("ns").
Many online tutorials skip this step and are therefore not truly distributed.
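An equivalent way to make the nameservice visible to HBase, assuming the paths used throughout this guide, is to link Hadoop's client configuration into HBase's conf directory instead of (or in addition to) setting HBASE_CLASSPATH:

# Let HBase read the HDFS client configuration directly
ln -s /home/hadoop-2.8.3/etc/hadoop/core-site.xml /home/hbase-2.0.0/conf/core-site.xml
ln -s /home/hadoop-2.8.3/etc/hadoop/hdfs-site.xml /home/hbase-2.0.0/conf/hdfs-site.xml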

2.2 Configure hbase-site.xml in /home/hbase-2.0.0/conf
<configuration>
	<property>  
       <name>hbase.rootdir</name>  
       <value>hdfs://ns/hbase</value>  
   </property>  
       <!-- Run in distributed mode -->
   <property>  
       <name>hbase.cluster.distributed</name>  
       <value>true</value>  
   </property>  
       <!-- Default HMaster HTTP UI port -->
   <property>  
       <name>hbase.master.info.port</name>  
       <value>16010</value>  
    </property>  
       <!-- Default HRegionServer HTTP UI port -->
    <property>  
       <name>hbase.regionserver.info.port</name>  
       <value>16030</value>  
    </property>  
   <property>  
       <name>hbase.zookeeper.quorum</name>  
       <value>master.cnmy:2181,slave1.cnmy:2181,slave2.cnmy:2181</value> 
   </property> 
 <property>
    <name>hbase.coprocessor.abortonerror</name>
    <value>false</value>
    </property>

</configuration>

Here "ns" is the HDFS nameservice configured earlier.

2.3 Create the regionservers file in /home/hbase-2.0.0/conf
slave1.cnmy
slave2.cnmy
2.4 Add to /etc/profile
export HBASE_HOME=/home/hbase-2.0.0
 
export PATH=$HBASE_HOME/bin:$PATH
Apply the changes: source /etc/profile
3. Start
Copy the HBase directory to slave1.cnmy and slave2.cnmy:
cd /home/
scp -r  /home/hbase-2.0.0  slave1.cnmy:/home/
scp -r  /home/hbase-2.0.0  slave2.cnmy:/home/
Note: on the master.cnmy host, remove slf4j-api-1.7.25.jar and slf4j-log4j12-1.7.25.jar from /home/hbase-2.0.0/lib, otherwise the jar conflict makes HBase exit immediately on startup. A sketch of the commands follows.
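A minimal sketch of moving the conflicting jars aside (rather than deleting them, so they can be restored if needed):

mkdir -p /home/hbase-2.0.0/lib/removed
mv /home/hbase-2.0.0/lib/slf4j-api-1.7.25.jar \
   /home/hbase-2.0.0/lib/slf4j-log4j12-1.7.25.jar \
   /home/hbase-2.0.0/lib/removed/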
# Start the HMaster (on master.cnmy)
start-hbase.sh
4. Common commands
# start (on the master)
start-hbase.sh
# stop
stop-hbase.sh
# start an individual RegionServer
hbase-daemon.sh start regionserver
5. Verification

HMaster UI: http://10.128.1.92:16010/master-status
(Screenshot: hbase_start01.png)

RegionServer UI: http://10.128.1.93:16030/rs-status
(Screenshot: hbase_slave1_start.png)

RegionServer UI: http://10.128.1.95:16030/rs-status
(Screenshot: hbase_slave2_start.png)
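Beyond the web UIs, a quick smoke test from the HBase shell confirms that tables can be created and written to (the table name 'smoke' is just an example):

hbase shell
# inside the shell:
status
create 'smoke', 'cf'
put 'smoke', 'row1', 'cf:msg', 'hello'
scan 'smoke'
disable 'smoke'
drop 'smoke'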
