ZooKeeper 3.6.3 + Hadoop 3.1.4 HA: a complete pitfall-avoidance tutorial!!

I. Environment Preparation

1. Prepare the installation packages

hadoop-3.1.4.tar.gz  

apache-zookeeper-3.6.3-bin.tar.gz 

jdk-8u141-linux-x64.tar.gz

2. Three virtual machine servers

2.1 Set the hostname on all three machines

hostnamectl set-hostname node01   # run on the first machine
hostnamectl set-hostname node02   # run on the second machine
hostnamectl set-hostname node03   # run on the third machine

2.2 Add the IP mappings to /etc/hosts on all three machines

192.168.1.204 node01
192.168.1.205 node02
192.168.1.206 node03 
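
A quick sanity check before moving on: from each machine, every hostname should resolve and respond (a minimal check using the mappings above).

ping -c 1 node01
ping -c 1 node02
ping -c 1 node03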

2.3 Set up the JDK on all three machines

For JDK configuration, refer to a separate post.

2.4 Synchronize the clocks on all three servers

# If ntpdate is not installed yet, install it first
yum -y install ntpdate

# Make ntpdate start at boot
systemctl enable ntpdate

# Run a one-off sync against a public NTP server
ntpdate ntp4.aliyun.com

# Edit the crontab
crontab -e

# Sync the clock once every minute
*/1 * * * * /usr/sbin/ntpdate ntp4.aliyun.com
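
A rough way to confirm the sync worked is to run date on all three machines and compare; the times should agree to within about a second.

date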

2.5 Turn off the firewall and SELinux

# Stop the firewall
systemctl stop firewalld.service

# Keep the firewall from starting at boot
systemctl disable firewalld.service

# Disable SELinux
vim /etc/sysconfig/selinux

# Change the value from "enforcing" to "disabled"
SELINUX=disabled
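
SELINUX=disabled only takes effect after a reboot; to drop SELinux to permissive mode immediately for the current session, you can additionally run:

setenforce 0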

2.6 Set up passwordless SSH between the three machines

ssh-keygen -t rsa   # run on all three machines
ssh-copy-id node01  # run on all three machines; gathers every public key on node01

# push the merged key file from node01 out to the other two nodes
scp /root/.ssh/authorized_keys node02:/root/.ssh/
scp /root/.ssh/authorized_keys node03:/root/.ssh/
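
A quick check from node01: each of these should print the remote hostname without asking for a password.

ssh node02 hostname
ssh node03 hostname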

II. Install ZooKeeper

1. Configuration files

mkdir -p /opt/soft/zookeeper-3.6.3/zkdatas

# from the ZooKeeper conf directory
cp zoo_sample.cfg zoo.cfg

2. Edit zoo.cfg

dataDir=/opt/soft/zookeeper-3.6.3/zkdatas

# Addresses of the servers in the ensemble
server.1=node01:2888:3888
server.2=node02:2888:3888
server.3=node03:2888:3888
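
With zoo.cfg finished on the first machine, a convenient way to set up the other two (a sketch; it assumes the same /opt/soft layout everywhere) is to copy the whole installation over before writing each node's myid in the next step:

scp -r /opt/soft/zookeeper-3.6.3 node02:/opt/soft/
scp -r /opt/soft/zookeeper-3.6.3 node03:/opt/soft/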

3. Add the myid file

echo 1 > /opt/soft/zookeeper-3.6.3/zkdatas/myid # on the first machine
echo 2 > /opt/soft/zookeeper-3.6.3/zkdatas/myid # on the second machine
echo 3 > /opt/soft/zookeeper-3.6.3/zkdatas/myid # on the third machine

4. Start ZooKeeper and check its status

# run on all three machines, from the ZooKeeper bin directory
./zkServer.sh start
./zkServer.sh status
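
If the ensemble formed correctly, one node should report itself as leader and the other two as followers; the key line of the status output looks like this (the exact surrounding text varies by version):

Mode: follower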

III. Install Hadoop HA

1. Hadoop HA architecture overview

  1. NameNode: serves client requests and manages metadata (reads and updates); runs in one of two states, active or standby.
  2. JournalNode: shares the edits log between the two NameNodes, and is itself a small cluster.
  3. ZooKeeper: hosts the hadoop-ha znode that drives active/standby failover.
  4. FailoverController (ZKFC): once ZooKeeper detects that the active NameNode has died, the ZKFC promotes the standby NameNode to active.
  5. NodeManager: reports its node's resource usage to the ResourceManager via heartbeats.
  6. ResourceManager: handles cluster-wide resource management and scheduling.
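
Once everything in section 3 below is up, jps gives a quick way to confirm this role placement. A rough sketch of the expected daemons per node under this tutorial's layout (process names are the standard Hadoop and ZooKeeper ones):

jps

# node01: NameNode, DataNode, JournalNode, DFSZKFailoverController,
#         ResourceManager, NodeManager, QuorumPeerMain
# node02: NameNode, DataNode, JournalNode, DFSZKFailoverController,
#         ResourceManager, NodeManager, QuorumPeerMain
# node03: DataNode, JournalNode, NodeManager, QuorumPeerMain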

2. Edit the configuration files

2.1 hadoop-env.sh

# point JAVA_HOME at the JDK installed earlier
export JAVA_HOME=/opt/soft/jdk1.8.0_141

# append the following at the very end of the file
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HDFS_ZKFC_USER=root
export HDFS_JOURNALNODE_USER=root

2.2 core-site.xml

<configuration>
   <!-- The logical name given to the cluster: mycluter -->
   <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluter</value>
   </property>

   <!-- This path must be created beforehand -->
   <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/soft/hadoop-3.1.4/data</value>
   </property>

   <!-- ZooKeeper hosts and ports used for HA -->
   <property>
        <name>ha.zookeeper.quorum</name>
        <value>node01:2181,node02:2181,node03:2181</value>
   </property>

   <!-- Static user for the web UIs: root -->
   <property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
   </property>
</configuration>

2.3 hdfs-site.xml

<configuration>

   <!-- Must match the nameservice used in core-site.xml -->
   <property>
       <name>dfs.nameservices</name>
       <value>mycluter</value>
   </property>

   <!-- The IDs of the two NameNodes: nn1 and nn2 -->
   <property>
       <name>dfs.ha.namenodes.mycluter</name>
       <value>nn1,nn2</value>
   </property>

   <!-- RPC address of nn1 -->
   <property>
      <name>dfs.namenode.rpc-address.mycluter.nn1</name>
      <value>node01:8020</value>
   </property>

   <!-- HTTP address of nn1 -->
   <property>
      <name>dfs.namenode.http-address.mycluter.nn1</name>
      <value>node01:9870</value>
   </property>

   <!-- RPC address of nn2 -->
   <property>
      <name>dfs.namenode.rpc-address.mycluter.nn2</name>
      <value>node02:8020</value>
   </property>

   <!-- HTTP address of nn2 -->
   <property>
      <name>dfs.namenode.http-address.mycluter.nn2</name>
      <value>node02:9870</value>
   </property>

   <!-- Shared edits directory on the JournalNode quorum -->
   <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://node01:8485;node02:8485;node03:8485/wen</value>
   </property>

   <!-- Local storage directory for JournalNode edits -->
   <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/opt/soft/hadoop-3.1.4/data/journal</value>
   </property>

   <!-- Enable automatic failover -->
   <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
   </property>

   <!-- Proxy provider clients use to find the active NameNode -->
   <property>
      <name>dfs.client.failover.proxy.provider.mycluter</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
   </property>

   <!-- Fencing method to avoid split-brain; see the note after this file -->
   <property>
      <name>dfs.ha.fencing.methods</name>
      <value>sshfence</value>
   </property>

   <!-- Private key used for passwordless SSH fencing -->
   <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/root/.ssh/id_rsa</value>
   </property>

   <!-- NameNode metadata directory -->
   <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:///opt/soft/hadoop-3.1.4/data/namenode</value>
   </property>

   <!-- DataNode block storage directory -->
   <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///opt/soft/hadoop-3.1.4/data/datanode</value>
   </property>

   <!-- Three replicas -->
   <property>
      <name>dfs.replication</name>
      <value>3</value>
   </property>

   <!-- Disable HDFS permission checking -->
   <property>
      <name>dfs.permissions.enabled</name>
      <value>false</value>
   </property>

</configuration>
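
One classic pitfall with the sshfence method configured above: it shells into the failed NameNode host and uses the fuser command to kill the process, so fuser must exist on both NameNode machines. On a minimal CentOS install that usually means installing psmisc first:

yum -y install psmisc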

2.4 mapred-site.xml

<configuration>
   <!-- Run MapReduce jobs on YARN -->
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
</configuration>

2.5 yarn-site.xml

<configuration>
   <!-- Enable ResourceManager HA -->
   <property>
      <name>yarn.resourcemanager.ha.enabled</name>
      <value>true</value>
   </property>

   <!-- Logical IDs of the two ResourceManagers -->
   <property>
      <name>yarn.resourcemanager.ha.rm-ids</name>
      <value>rm1,rm2</value>
   </property>

   <!-- Hosts for rm1 and rm2 -->
   <property>
      <name>yarn.resourcemanager.hostname.rm1</name>
      <value>node01</value>
   </property>

   <property>
      <name>yarn.resourcemanager.hostname.rm2</name>
      <value>node02</value>
   </property>

   <!-- Recover running applications after an RM restart -->
   <property>
      <name>yarn.resourcemanager.recovery.enabled</name>
      <value>true</value>
   </property>

   <!-- Persist RM state in ZooKeeper -->
   <property>
      <name>yarn.resourcemanager.store.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
   </property>

   <property>
      <name>yarn.resourcemanager.zk-address</name>
      <value>node01:2181,node02:2181,node03:2181</value>
   </property>

   <!-- Cluster ID for the YARN HA pair -->
   <property>
      <name>yarn.resourcemanager.cluster-id</name>
      <value>yarn-ha</value>
   </property>

   <!-- Default RM hostname -->
   <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>node01</value>
   </property>

   <!-- Shuffle service for MapReduce -->
   <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
   </property>

   <!-- Web UI addresses of the two ResourceManagers -->
   <property>
      <name>yarn.resourcemanager.webapp.address.rm1</name>
      <value>node01:8088</value>
   </property>

   <property>
      <name>yarn.resourcemanager.webapp.address.rm2</name>
      <value>node02:8088</value>
   </property>

   <!-- Classpath for YARN applications -->
   <property>
      <name>yarn.application.classpath</name>
      <value>/opt/soft/hadoop-3.1.4/etc/hadoop:/opt/soft/hadoop-3.1.4/share/hadoop/common/lib/*:/opt/soft/hadoop-3.1.4/share/hadoop/common/*:/opt/soft/hadoop-3.1.4/share/hadoop/hdfs:/opt/soft/hadoop-3.1.4/share/hadoop/hdfs/lib/*:/opt/soft/hadoop-3.1.4/share/hadoop/hdfs/*:/opt/soft/hadoop-3.1.4/share/hadoop/mapreduce/lib/*:/opt/soft/hadoop-3.1.4/share/hadoop/mapreduce/*:/opt/soft/hadoop-3.1.4/share/hadoop/yarn:/opt/soft/hadoop-3.1.4/share/hadoop/yarn/lib/*:/opt/soft/hadoop-3.1.4/share/hadoop/yarn/*</value>
   </property>
</configuration>
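
Rather than typing that long yarn.application.classpath value by hand, you can generate the correct value for your own install by running the following on the Hadoop machine and pasting its output into the property:

hadoop classpath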

2.6 workers

node01
node02
node03
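
With all the configuration files edited on node01, push them out so the three machines stay identical (a sketch; it assumes Hadoop is unpacked at the same /opt/soft/hadoop-3.1.4 path on every node):

scp -r /opt/soft/hadoop-3.1.4/etc/hadoop node02:/opt/soft/hadoop-3.1.4/etc/
scp -r /opt/soft/hadoop-3.1.4/etc/hadoop node03:/opt/soft/hadoop-3.1.4/etc/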

3. Startup commands

3.1 Start the JournalNodes on all three machines

hdfs --daemon start journalnode

3.2 Format the NameNode (on node01 only)

hdfs namenode -format
# then start this NameNode so the standby can copy its metadata in the next step
hdfs --daemon start namenode

3.3 On the second machine, bootstrap and start the standby NameNode

hdfs namenode -bootstrapStandby   # pulls the formatted metadata from node01
hdfs --daemon start namenode

3.4 Start the DataNodes on all three machines

hdfs --daemon start datanode

3.5 Format the ZooKeeper state and start the ZKFCs

hdfs zkfc -formatZK            # run once, on node01
hdfs --daemon start zkfc       # run on both NameNode hosts (node01 and node02)
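
At this point HDFS HA is running, but YARN still needs to be started, and it is worth confirming that one NameNode actually went active. A short sketch using standard Hadoop CLI commands (it assumes the sbin scripts are on the PATH or run from the Hadoop home directory):

# start the ResourceManagers and NodeManagers (run on node01; recent Hadoop 3
# releases start both HA ResourceManagers from here, otherwise also run
# "yarn --daemon start resourcemanager" on node02)
start-yarn.sh

# one NameNode should report "active" and the other "standby"
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# same check for the two ResourceManagers
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2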

4. A quick starter example

   vim word.txt

hello hadoop
hello world
hello flink
hello flink
hive kylin
hadoop spark
hadoop spark

# upload the input file to HDFS first
hdfs dfs -mkdir -p /input
hdfs dfs -put word.txt /input/

# run the wordcount example from the Hadoop home directory
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.4.jar wordcount /input/ /output/wc
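
If the job succeeds, the counts land in /output/wc; for the word.txt above, reading the result back should produce the following (part-r-00000 is the standard name of a single reducer's output file):

hdfs dfs -cat /output/wc/part-r-00000

flink	2
hadoop	3
hello	4
hive	1
kylin	1
spark	2
world	1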

 
