CentOSA | CentOSB | CentOSC |
192.168.199.131 | 192.168.199.132 | 192.168.199.133 |
zookeeper | zookeeper | zookeeper |
journalnode | journalnode | journalnode |
nn1 | nn2 | |
zkfc | zkfc | |
datanode | datanode | datanode |
nodemanager | nodemanager | nodemanager |
| | resourceManager |
1) Stop the firewall on all nodes
[root@CentOSX ~]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
Keep the firewall from starting at boot
[root@CentOSX ~]# chkconfig iptables off
2) Configure the hostname-to-IP mappings
[root@CentOSX ~]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.199.131 CentOSA
192.168.199.132 CentOSB
192.168.199.133 CentOSC
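A quick way to confirm that a node carries all three mappings is a small check like the following. This is only a sketch; the function name missing_hosts is made up for illustration, and it assumes the file uses the address-then-names layout of /etc/hosts.

```shell
# missing_hosts FILE NAME...: print every NAME that has no entry in FILE.
# FILE is laid out like /etc/hosts: an address followed by one or more names.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column after the address.
        if ! awk -v n="$name" '
                /^[ \t]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Running missing_hosts /etc/hosts CentOSA CentOSB CentOSC on each node should print nothing once the file is complete.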
3) Make sure the system time is the same on all nodes
[root@CentOSX ~]# date
Fri Jul 6 15:17:16 CST 2018
If the times differ, set them to the same value
[root@CentOSX ~]# date -s '2018-07-06 15:17:16'
Fri Jul 6 15:17:16 CST 2018
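Comparing the output of date by eye is error-prone, so the spread between nodes can be computed instead. The helper below is a sketch; max_skew is a hypothetical name, and it expects one Unix timestamp (date +%s) per node as its arguments.

```shell
# max_skew EPOCH...: print the difference in seconds between the largest and
# smallest of the given Unix timestamps (one per node).
max_skew() {
    printf '%s\n' "$@" | awk '
        NR == 1 { min = $1 + 0; max = $1 + 0 }
        $1 + 0 < min { min = $1 + 0 }
        $1 + 0 > max { max = $1 + 0 }
        END { print max - min }'
}
```

For example, max_skew 1530861436 1530861439 1530861437 prints 3. If the timestamps are collected over SSH one node at a time, the connection latency is included in the result, so a difference of a second or two does not necessarily mean the clocks disagree.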
4) Install the JDK and configure JAVA_HOME
[root@CentOSX ~]# rpm -ivh jdk-7u79-linux-x64.rpm
[root@CentOSX ~]# vi .bashrc
JAVA_HOME=/usr/java/latest
PATH=$PATH:$JAVA_HOME/bin
CLASSPATH=.
export JAVA_HOME
export PATH
export CLASSPATH
[root@CentOSX ~]# source .bashrc
5) Passwordless SSH authentication
[root@CentOSX ~]# ssh-keygen -t rsa
[root@CentOSX ~]# ssh-copy-id CentOSA
[root@CentOSX ~]# ssh-copy-id CentOSB
[root@CentOSX ~]# ssh-copy-id CentOSC
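To verify that the keys were copied, each host can be probed with ssh in batch mode, which fails instead of prompting for a password. The following is a sketch; unreachable_hosts and the SSH_BIN override are hypothetical names introduced here so the check can be tried with a stub instead of a real connection.

```shell
# unreachable_hosts HOST...: print every HOST that cannot be reached over SSH
# without a password prompt. BatchMode=yes makes ssh fail rather than ask.
# SSH_BIN (hypothetical) may name a replacement command for dry runs.
unreachable_hosts() {
    for host in "$@"; do
        if ! "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 \
                "$host" true </dev/null >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}
```

Running unreachable_hosts CentOSA CentOSB CentOSC on every node should print nothing; this matters later because the sshfence fencing method relies on the same key.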
6) Install and configure ZooKeeper
[root@CentOSX ~]# tar -zxf zookeeper-3.4.6.tar.gz -C /usr/
[root@CentOSX ~]# mkdir /root/zkdata
[root@CentOSX ~]# vim /usr/zookeeper-3.4.6/conf/zoo.cfg
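# Tick length in milliseconds, used for the heartbeats that keep connections alive; the minimum session timeout is two ticks
tickTime=2000
# Directory for the in-memory database snapshots; the myid file used by the cluster is also stored here (note: a configuration file may contain the string dataDir only once, even in a commented-out setting)
dataDir=/root/zkdata
# Port on which the server listens for client connections
clientPort=2181
# Number of ticks within which other servers may connect and initialize their data; increase this value if ZooKeeper manages a large amount of data
initLimit=5
# Number of ticks a follower is allowed for syncing; a follower that falls too far behind is dropped
syncLimit=2
# Server IDs and addresses: cluster membership (server number, server address, leader-follower communication port, election port)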
server.1=CentOSA:2887:3887
server.2=CentOSB:2887:3887
server.3=CentOSC:2887:3887
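Format: server.A=B:C:D
A is a number that identifies the server; B is the server's IP address or hostname.
C is the port this server uses to exchange information with the cluster's Leader.
D is the port used for leader election, for example after the current leader fails.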
[root@CentOSA ~]# echo 1 > /root/zkdata/myid
[root@CentOSB ~]# echo 2 > /root/zkdata/myid
[root@CentOSC ~]# echo 3 > /root/zkdata/myid
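Because the myid value has to match the server.N line for the same host, it can be derived from zoo.cfg rather than typed by hand. The function below is a sketch under two assumptions: myid_for is a hypothetical name, and hosts are written as short names without dots, as in this setup.

```shell
# myid_for CFG HOST: print the N of the "server.N=HOST:..." line in CFG,
# or return non-zero if HOST is not listed.
# Splitting on ".", "=" and ":" gives: $1=server, $2=N, $3=HOST, $4/$5=ports.
myid_for() {
    awk -F'[.=:]' -v h="$2" '
        /^server\.[0-9]+=/ && $3 == h { print $2; found = 1 }
        END { exit !found }' "$1"
}
```

A possible use on each node, assuming hostname returns the same name that appears in zoo.cfg: id=$(myid_for /usr/zookeeper-3.4.6/conf/zoo.cfg "$(hostname)") && echo "$id" > /root/zkdata/myid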
7) Start the ZK cluster
[root@CentOSX ~]# cd /usr/zookeeper-3.4.6/
[root@CentOSX zookeeper-3.4.6]# ./bin/zkServer.sh start zoo.cfg
[root@CentOSX zookeeper-3.4.6]# ./bin/zkServer.sh status zoo.cfg
The status of the cluster can now be checked
CentOSA | CentOSB | CentOSC |
follower | follower | leader |
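The role of each node comes from the Mode: line that zkServer.sh status prints. The sketch below extracts that line and checks that exactly one node reports leader; zk_mode and one_leader are hypothetical helper names.

```shell
# zk_mode: read `zkServer.sh status` output on stdin and print the value of
# its "Mode:" line (for example leader or follower); print nothing if absent.
zk_mode() {
    awk -F': *' '$1 == "Mode" { print $2 }'
}

# one_leader MODE...: succeed only if exactly one of the given modes is "leader".
one_leader() {
    leaders=0
    for m in "$@"; do
        if [ "$m" = "leader" ]; then
            leaders=$((leaders + 1))
        fi
    done
    [ "$leaders" -eq 1 ]
}
```

On a node, ./bin/zkServer.sh status zoo.cfg 2>&1 | zk_mode prints that node's role; collecting the three results and passing them to one_leader confirms the ensemble elected a single leader.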
8) Extract the Hadoop package to /usr and configure the HADOOP_HOME environment variable
[root@CentOSX ~]# tar -zxvf hadoop-2.6.0_x64.tar.gz -C /usr/
[root@CentOSX ~]# vim .bashrc
HADOOP_HOME=/usr/hadoop-2.6.0
JAVA_HOME=/usr/java/latest
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
CLASSPATH=.
export JAVA_HOME
export PATH
export CLASSPATH
export HADOOP_HOME
[root@CentOSX ~]# source .bashrc
9) Edit the Hadoop configuration files (refer mainly to the HDFS HA QJM documentation)
①core-site.xml
[root@CentOSX zookeeper-3.4.6]# vim /usr/hadoop-2.6.0/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<!-- Prefix used by Hadoop FS clients -->
<value>hdfs://mycluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop-2.6.0/tmp-${user.name}</value>
</property>
<!-- Configure the Hadoop rack script -->
<property>
<name>net.topology.script.file.name</name>
<!-- Rack script -->
<value>/usr/hadoop-2.6.0/etc/hadoop/rack.sh</value>
</property>
</configuration>
Rack script rack.sh
[root@CentOSX ~]# vim /usr/hadoop-2.6.0/etc/hadoop/rack.sh
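#!/bin/bash
# Hadoop passes node addresses as arguments; print the rack for each one.
while [ $# -gt 0 ] ; do
  nodeArg=$1
  # Read the mapping table on stdin
  exec</usr/hadoop-2.6.0/etc/hadoop/topology.data
  result=""
  while read line ; do
    # Split the line into fields: ar[0] is the address, ar[1] is the rack
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  # Fall back to /default-rack for nodes that are not listed
  if [ -z "$result" ] ; then
    echo -n "/default-rack"
  else
    echo -n "$result"
  fi
done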
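Add execute permission (remember that topology.data must end with an empty line; without a final newline the read loop in rack.sh skips the last entry)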
[root@CentOSX ~]# chmod u+x /usr/hadoop-2.6.0/etc/hadoop/rack.sh
[root@CentOSX ~]# vim /usr/hadoop-2.6.0/etc/hadoop/topology.data
192.168.199.131 /rack1
192.168.199.132 /rack1
192.168.199.133 /rack2
Test the rack script:
[root@CentOSX ~]# /usr/hadoop-2.6.0/etc/hadoop/rack.sh 192.168.199.131
/rack1[root@CentOSX ~]# /usr/hadoop-2.6.0/etc/hadoop/rack.sh 192.168.199.132
/rack1[root@CentOSX ~]# /usr/hadoop-2.6.0/etc/hadoop/rack.sh 192.168.199.133
/rack2[root@CentOSX ~]# /usr/hadoop-2.6.0/etc/hadoop/rack.sh 192.168.199.134
/default-rack[root@CentOSA ~]#
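The trailing-newline requirement comes from how read treats a last line without a newline: it fills the variables but returns non-zero, so a plain while read loop stops before processing that line. The sketch below shows one way to make the lookup tolerant of this. It is not the rack.sh used above; lookup_rack is a hypothetical function that takes the mapping file as a parameter and, unlike rack.sh, ends its output with a newline.

```shell
# lookup_rack FILE NODE: print the rack mapped to NODE in FILE, or
# /default-rack when NODE is not listed. The `|| [ -n "$ip" ]` part keeps
# the last line even when FILE does not end with a newline.
lookup_rack() {
    result=""
    while read -r ip rack || [ -n "$ip" ]; do
        if [ "$ip" = "$2" ]; then
            result=$rack
        fi
    done < "$1"
    if [ -z "$result" ]; then
        echo "/default-rack"
    else
        echo "$result"
    fi
}
```

For example, lookup_rack /usr/hadoop-2.6.0/etc/hadoop/topology.data 192.168.199.133 returns /rack2 whether or not the file ends with a newline.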
②hdfs-site.xml
[root@CentOSX ~]# vim /usr/hadoop-2.6.0/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<!-- Block replication factor -->
<value>3</value>
</property>
<!-- Enable automatic failover through ZooKeeper -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>CentOSA:2181,CentOSB:2181,CentOSC:2181</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>CentOSA:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>CentOSB:9000</value>
</property>
<!-- JournalNodes, used to synchronize data between the NameNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://CentOSA:8485;CentOSB:8485;CentOSC:8485/mycluster</value>
</property>
<!-- Failover proxy implementation -->
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
</configuration>
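Several of these properties must agree with each other: the nameservice in dfs.nameservices appears again in fs.defaultFS and in the names of the nn1/nn2 properties. A small reader such as the one below can help spot a mismatch before the first start. It is a naive sketch, not an XML parser; get_prop is a hypothetical name, and the matching assumes one tag per line as in the listings above.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name> in FILE.
# Line-based matching only: assumes each <name> and <value> sits on its own line.
get_prop() {
    awk -v want="<name>$2</name>" '
        index($0, want) { hit = 1; next }
        hit && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }' "$1"
}
```

For instance, get_prop /usr/hadoop-2.6.0/etc/hadoop/hdfs-site.xml dfs.nameservices should print mycluster, which is the name expected after hdfs:// in fs.defaultFS. Once Hadoop is on the PATH, hdfs getconf -confKey dfs.nameservices reports the value Hadoop itself resolves.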
③slaves
[root@CentOSX hadoop-2.6.0]# vim /usr/hadoop-2.6.0/etc/hadoop/slaves
CentOSA
CentOSB
CentOSC
④mapred-site.xml
[root@CentOSX ~]# cp /usr/hadoop-2.6.0/etc/hadoop/mapred-site.xml.template /usr/hadoop-2.6.0/etc/hadoop/mapred-site.xml
[root@CentOSX ~]# vim /usr/hadoop-2.6.0/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
⑤yarn-site.xml
[root@CentOSX ~]# vim /usr/hadoop-2.6.0/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>CentOSC</value>
</property>
</configuration>
10) Start the Hadoop cluster (on the first start the order matters)
0. Start the ZooKeeper cluster
1. Start all JournalNodes (then wait about 10 seconds)
[root@CentOSX ~]# hadoop-daemon.sh start journalnode
2. Format the NameNode (CentOSA)
[root@CentOSA ~]# hdfs namenode -format
3. Start the NameNode (CentOSA)
[root@CentOSA ~]# hadoop-daemon.sh start namenode
4. Bootstrap the standby NameNode (CentOSB)
[root@CentOSB ~]# hdfs namenode -bootstrapStandby
5. Start the NameNode (CentOSB)
[root@CentOSB ~]# hadoop-daemon.sh start namenode
6. Register the NameNodes in ZK (run on either CentOSA or CentOSB)
[root@CentOSA|B ~]# hdfs zkfc -formatZK
7. Start zkfc on both CentOSA and CentOSB to monitor NameNode health
[root@CentOSA ~]# hadoop-daemon.sh start zkfc
[root@CentOSB ~]# hadoop-daemon.sh start zkfc
8. Start the datanode service on all nodes
[root@CentOSX ~]# hadoop-daemon.sh start datanode
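Since the order matters on the first start, it can help to have the sequence in one place as a checklist. The sketch below prints the steps above as host and command pairs and executes nothing. first_start_plan is a hypothetical name; it assumes ZooKeeper (step 0) is already running and picks CentOSA for the formatZK step, which the steps above allow on either NameNode host.

```shell
# first_start_plan: print the first-start sequence as "host<TAB>command" lines,
# in the order of steps 1 to 8. It only prints; nothing is executed.
first_start_plan() {
    nodes="CentOSA CentOSB CentOSC"
    for h in $nodes; do
        printf '%s\t%s\n' "$h" "hadoop-daemon.sh start journalnode"
    done
    # (wait about 10 seconds here before formatting)
    printf '%s\t%s\n' CentOSA "hdfs namenode -format"
    printf '%s\t%s\n' CentOSA "hadoop-daemon.sh start namenode"
    printf '%s\t%s\n' CentOSB "hdfs namenode -bootstrapStandby"
    printf '%s\t%s\n' CentOSB "hadoop-daemon.sh start namenode"
    printf '%s\t%s\n' CentOSA "hdfs zkfc -formatZK"
    printf '%s\t%s\n' CentOSA "hadoop-daemon.sh start zkfc"
    printf '%s\t%s\n' CentOSB "hadoop-daemon.sh start zkfc"
    for h in $nodes; do
        printf '%s\t%s\n' "$h" "hadoop-daemon.sh start datanode"
    done
}
```

The format and formatZK steps belong to the first start only; repeating the NameNode format on an existing cluster wipes its metadata, so the printed list is meant for review rather than blind replay.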
Start YARN (standalone ResourceManager); log in to CentOSC
[root@CentOSC ~]# start-yarn.sh
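After everything is up, the Java processes on each node should match the role table at the top. The sketch below compares jps output with a list of expected process names; missing_daemons is a hypothetical helper, and the names used in the usage note are the main class names that jps normally shows for these daemons.

```shell
# missing_daemons NAME...: read `jps` output ("PID ClassName" per line) on
# stdin and print every expected NAME that is not among the running classes.
missing_daemons() {
    running=$(awk '{ print $2 }')
    for name in "$@"; do
        # -x matches whole lines only, so NameNode does not match SecondaryNameNode
        if ! printf '%s\n' "$running" | grep -qxF "$name"; then
            echo "$name"
        fi
    done
}
```

On CentOSA or CentOSB: jps | missing_daemons QuorumPeerMain JournalNode NameNode DFSZKFailoverController DataNode NodeManager. On CentOSC: jps | missing_daemons QuorumPeerMain JournalNode DataNode NodeManager ResourceManager. No output means every expected daemon is running.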
Stop/start the cluster (from any node)
[root@CentOSX ~]# stop-dfs.sh
[root@CentOSX ~]# start-dfs.sh