Big Data Pseudo-Distributed Platform - HA Setup. This build is based on the original big data pseudo-distributed platform setup; along the way only the necessary configuration files are modified. The related big-data jar packages can be downloaded from https://www.siyang.site/portfolio/
-
ZooKeeper installation
Use sshfile to transfer the ZooKeeper package to node2's root directory, then extract it and move it to /opt/home/
tar xf zookeeper-3.4.6.tar.gz
mv zookeeper-3.4.6 /opt/home/
Append the following to the /etc/profile configuration file
export ZOOKEEPER_HOME=/opt/home/zookeeper-3.4.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Reload the configuration
source /etc/profile
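To confirm the environment variables took effect, a quick check (zkServer.sh will only resolve once the archive has been moved to /opt/home/ as above):
echo $ZOOKEEPER_HOME  # should print /opt/home/zookeeper-3.4.6
which zkServer.sh  # should print /opt/home/zookeeper-3.4.6/bin/zkServer.sh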
Modify the ZooKeeper configuration under /opt/home/zookeeper-3.4.6/conf
Copy zoo_sample.cfg to zoo.cfg
cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg
dataDir=/var/home/hadoop/zk
server.1=192.168.127.102:2888:3888
server.2=192.168.127.103:2888:3888
server.3=192.168.127.104:2888:3888
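For reference, the full zoo.cfg then looks roughly like the sketch below; tickTime, initLimit, syncLimit and clientPort are the defaults inherited from zoo_sample.cfg and are not changed by this guide:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/home/hadoop/zk
clientPort=2181
server.1=192.168.127.102:2888:3888
server.2=192.168.127.103:2888:3888
server.3=192.168.127.104:2888:3888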
Create the /var/home/hadoop/zk directory and write 1 (the id, matching server.1 above) into the myid file
mkdir /var/home/hadoop/zk
echo 1 > /var/home/hadoop/zk/myid
Copy node2's ZooKeeper installation to /opt/home/ on node3 and node4
scp -r ../../zookeeper-3.4.6/ root@node3:/opt/home/
scp -r ../../zookeeper-3.4.6/ root@node4:/opt/home/
On node3 and node4, create the /var/home/hadoop/zk directory and write 2 and 3 (their ids) into the respective myid files
mkdir /var/home/hadoop/zk
echo 2 > /var/home/hadoop/zk/myid  # on node4: echo 3 > /var/home/hadoop/zk/myid
Overwrite /etc/profile on node3 and node4 with node2's /etc/profile
scp /etc/profile root@node3:/etc/profile
scp /etc/profile root@node4:/etc/profile
and reload the configuration on each node
source /etc/profile
Start ZooKeeper with
zkServer.sh start
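After starting ZooKeeper on node2, node3 and node4, you can verify the ensemble; one node should report itself as leader and the other two as followers:
zkServer.sh status  # expect Mode: leader on one node and Mode: follower on the other two
jps  # expect a QuorumPeerMain process on each of the three nodes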
Modify the Hadoop configuration files for HA
You can first copy the previous distributed platform's configuration directory as a backup named hadoop-full (in /opt/home/hadoop-2.6.5/etc)
cp -r hadoop hadoop-full
Then modify the configuration on top of the original etc/hadoop files
-
Logical-to-physical name mapping
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>node1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>node2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>node1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>node2:50070</value>
</property>
-
JournalNode (JNN) location settings
hdfs-site.xml
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/var/home/hadoop/ha/jn</value>
</property>
-
Failover proxy and fencing
hdfs-site.xml
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_dsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>  <!-- location of the HDFS data directories -->
<value>/var/home/hadoop/full</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>node2:2181,node3:2181,node4:2181</value>
</property>
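Before distributing the files, you can have Hadoop echo the new values back, which catches typos in property names early (hdfs getconf only reads the local configuration, so no daemons need to be running):
hdfs getconf -confKey fs.defaultFS  # expect hdfs://mycluster
hdfs getconf -confKey dfs.nameservices  # expect mycluster
hdfs getconf -confKey dfs.ha.namenodes.mycluster  # expect nn1,nn2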
Copy hdfs-site.xml and core-site.xml to the same directory on node2, node3, and node4
scp hdfs-site.xml core-site.xml root@node2:/opt/home/hadoop-2.6.5/etc/hadoop/
scp hdfs-site.xml core-site.xml root@node3:/opt/home/hadoop-2.6.5/etc/hadoop/
scp hdfs-site.xml core-site.xml root@node4:/opt/home/hadoop-2.6.5/etc/hadoop/
-
Passwordless SSH login for node2
ssh-keygen -t dsa -P '' -f /root/.ssh/id_dsa
Append id_dsa.pub to authorized_keys
cat id_dsa.pub >>authorized_keys
Send node2's id_dsa.pub to node1, node3, and node4
scp id_dsa.pub root@node1:/root/.ssh/node2.pub
scp id_dsa.pub root@node3:/root/.ssh/node2.pub
scp id_dsa.pub root@node4:/root/.ssh/node2.pub
# On each node that received the file, append node2.pub to authorized_keys
cat node2.pub >> authorized_keys
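Before relying on it for sshfence and the start/stop scripts below, confirm that node2 can now log in to the other nodes without a password (each command should print the remote hostname without prompting):
ssh node1 hostname
ssh node3 hostname
ssh node4 hostname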
-
Startup
The number in parentheses after each command indicates the node(s) on which to run it
hadoop-daemon.sh start journalnode (1 2 3)
hdfs namenode -format (1)  # run only once, after the initial setup
hadoop-daemon.sh start namenode (1)  # start the namenode
hdfs namenode -bootstrapStandby (2)
hdfs zkfc -formatZK (1)  # run only once, after the initial setup
start-dfs.sh (1)
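Once start-dfs.sh has finished, you can check that ZKFC has elected an active NameNode; running jps on each node should show the expected daemons (NameNode and DFSZKFailoverController on node1/node2, JournalNode on node1-node3, DataNode on the nodes listed in the slaves file, QuorumPeerMain on node2-node4):
hdfs haadmin -getServiceState nn1  # reports active or standby
hdfs haadmin -getServiceState nn2  # reports the other of the two states
jps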
-
Creating scripts on the namenode to operate hadoop-ha remotely over SSH
Start script for hadoop-ha: hadoop-ha-start.sh
#!/bin/bash
for i in {2..4} ;  # start ZooKeeper on node2, node3, node4
do
ssh node$i "/opt/home/zookeeper-3.4.6/bin/zkServer.sh start" ;
done
start-dfs.sh  # start the Hadoop cluster
Stop script for hadoop-ha: hadoop-ha-stop.sh
#!/bin/bash
stop-dfs.sh  # stop the Hadoop cluster
for i in {2..4} ;  # stop ZooKeeper on node2, node3, node4
do
ssh node$i "/opt/home/zookeeper-3.4.6/bin/zkServer.sh stop" ;
done
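To invoke the two scripts by name as in the routine start/stop below, make them executable and place them somewhere on the PATH; /usr/local/bin is just one possible location:
chmod +x hadoop-ha-start.sh hadoop-ha-stop.sh
mv hadoop-ha-start.sh hadoop-ha-stop.sh /usr/local/bin/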
-
Routine startup
hadoop-ha-start.sh
-
Routine shutdown
hadoop-ha-stop.sh
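-
To confirm that failover actually works you can run a manual test; this is only a sketch, not part of the routine flow, and it assumes nn1 (node1) is currently the active NameNode:
jps  # on node1, note the NameNode pid
kill -9 <NameNode pid>  # simulate a NameNode crash on node1
hdfs haadmin -getServiceState nn2  # should now report active
hadoop-daemon.sh start namenode  # restart node1's NameNode; it comes back as standby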