Hadoop HA Environment Setup
Install the JDK
- Extract the archive
- Configure environment variables
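A minimal sketch of the JDK environment variables, appended to the user's ~/.bashrc; the install path below is an assumption, adjust it to wherever the JDK was extracted:

```shell
# Append to ~/.bashrc, then apply with: source ~/.bashrc
# /home/hadoop/soft/jdk is an assumed install path.
export JAVA_HOME=/home/hadoop/soft/jdk
export PATH=$PATH:$JAVA_HOME/bin
```

Verify afterwards with `java -version`.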
Configure a Static Network
-
Edit 【/etc/sysconfig/network-scripts/ifcfg-eth0】
$>su root
$>gedit /etc/sysconfig/network-scripts/ifcfg-eth0
Change:
BOOTPROTO="static"       #was "dhcp"
Add:
IPADDR="192.168.14.39"   #static IP
NETMASK="255.255.255.0"  #subnet mask
NETWORK="192.168.14.0"   #network address
GATEWAY="192.168.14.2"   #gateway
DNS1="192.168.14.2"      #DNS server (here the same as the gateway)
Apply this on every node, giving each node its own IPADDR.
-
Restart the network service
$>service network restart
Change the Hostname (Optional)
Without hostname mapping you can use raw IP addresses everywhere, but setting it up is strongly recommended.
1. Background:
[hadoop@localhost Desktop]$
The "hadoop" before the @ is the user name;
the "localhost" after the @ is the machine name, which the hosts mapping resolves to an IP address.
2. Edit the hosts file under /etc to map IP addresses to hostnames.
Append: 192.168.14.39 master
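For a three-node cluster, /etc/hosts needs one line per node. Only master's address appears above; the slave addresses below are assumptions for illustration, so substitute your nodes' real IPs:

```
192.168.14.39 master
192.168.14.40 slave01
192.168.14.41 slave02
```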
3. Edit the /etc/sysconfig/network file and set the HOSTNAME property to master:
HOSTNAME=master
Note: hostnames must not contain an underscore ("_").
4. Reboot for the change to take effect:
$>reboot
5. On the Windows side, edit the hosts file under C:\Windows\System32\drivers\etc
(create it manually if it does not exist).
Add: 192.168.14.39 master
cmd>ping master
Disable the Firewall
$>service iptables stop      #CentOS 6
$>systemctl stop firewalld   #CentOS 7; use whichever matches your system
Configure Passwordless SSH Login
$>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$>chmod 0600 ~/.ssh/authorized_keys
The public key must also reach every other node (the sshfence method and the scp steps below rely on it), e.g.:
$>ssh-copy-id slave01
$>ssh-copy-id slave02
Install Hadoop
-
Extract the archive
-
Configure environment variables
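A minimal sketch of the Hadoop environment variables, again appended to ~/.bashrc; the install path is an assumption, adjust it to your layout:

```shell
# Append to ~/.bashrc, then apply with: source ~/.bashrc
# /home/hadoop/soft/hadoop is an assumed install path.
export HADOOP_HOME=/home/hadoop/soft/hadoop
# bin/ holds client commands (hdfs, hadoop); sbin/ holds the start/stop scripts.
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

Verify afterwards with `hadoop version`.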
-
Edit the configuration files
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/data</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>128m</value>
  </property>
  <property>
    <name>dfs.namenode.fs-limits.min-block-size</name>
    <value>1048576</value>
  </property>
  <property>
    <name>dfs.bytes-per-checksum</name>
    <value>512</value>
  </property>
  <!-- HA -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>master:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>slave01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>slave01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://master:8485;slave01:8485;slave02:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods are tried in order; shell(/bin/true) is a fallback so that
       failover can still proceed when SSH fencing fails. -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/jinlihuan/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/jinlihuan/tmp/hadoop-jinlihuan</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/jinlihuan/tmp/hadoop-${user.name}</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master:2181,slave01:2181,slave02:2181</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
slaves
master
slave01
slave02
hadoop-env.sh
Set JAVA_HOME to the absolute path of the JDK install.
-
Distribute the modified files to the other nodes
$>cd $HADOOP_HOME/etc/
$>scp -r hadoop hyxy@slave01:~/soft/hadoop/etc/
$>scp -r hadoop hyxy@slave02:~/soft/hadoop/etc/
-
Start a JournalNode on every node
$>hadoop-daemon.sh start journalnode
-
Format the NameNode on master
$>hdfs namenode -format
-
On slave01, synchronize the NameNode metadata (or manually copy master's name directory over):
$>hdfs namenode -bootstrapStandby
Install ZooKeeper
-
Extract the archive
-
Configure environment variables
-
Edit 【{ZOOKEEPER_HOME}/conf/zoo.cfg】
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/tmp/zookeeper
clientPort=2181
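As written, zoo.cfg only configures a standalone server. A replicated three-node ensemble additionally needs one server.N entry per node, where N must match that node's myid (created below); the ports are ZooKeeper's conventional peer and leader-election ports:

```
server.1=master:2888:3888
server.2=slave01:2888:3888
server.3=slave02:2888:3888
```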
-
Distribute the configuration to all nodes
-
Create a myid file in the /home/hadoop/tmp/zookeeper directory (the dataDir) on each node
$>echo "1" >> myid    #on master
$>echo "2" >> myid    #on slave01
$>echo "3" >> myid    #on slave02
-
Start ZooKeeper, then format the ZKFC znode
$>zkServer.sh start    #run on every node first
$>hdfs zkfc -formatZK  #requires a running ZooKeeper quorum
Start HDFS and YARN
$>start-dfs.sh
$>start-yarn.sh
Verification
Open master:50070 and slave01:50070 in a browser (the NameNode web UI ports configured above; 9000 is the RPC port, not the web port). One NameNode should be in the active state and the other in standby. The state can also be queried from the command line:
$>hdfs haadmin -getServiceState nn1
$>hdfs haadmin -getServiceState nn2
Kill the active NameNode process, then refresh the other node's page: if it has switched to active, automatic failover is working.