Hadoop HA Cluster Installation and Deployment

This article draws on:

http://blog.csdn.net/carl810224/article/details/52160418

http://blog.csdn.net/Dr_Guo/article/details/50975851

1. Prerequisites

Operating system: CentOS Linux release 7.0.1406

JDK: Java(TM) SE Runtime Environment (build 1.8.0_73-b02)

Hadoop: hadoop-2.9.0.tar.gz

ZooKeeper: zookeeper-3.4.5-cdh5.7.6.tar.gz

 

 

 

2. Cluster Architecture

(Architecture diagram from the original post omitted; the node roles are listed in section 3.)

3. Server List

Hostname       | Operating system                                    | IP address    | Installed software   | JPS processes
hadoop-master1 | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.238 | JDK/Hadoop           | NameNode/ZKFC/ResourceManager/JobHistoryServer
hadoop-master2 | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.223 | JDK/Hadoop           | NameNode/ZKFC/ResourceManager/WebProxyServer
hadoop-slave1  | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.239 | JDK/Hadoop/ZooKeeper | DataNode/JournalNode/NodeManager/QuorumPeerMain
hadoop-slave2  | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.240 | JDK/Hadoop/ZooKeeper | DataNode/JournalNode/NodeManager/QuorumPeerMain
hadoop-slave3  | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.241 | JDK/Hadoop/ZooKeeper | DataNode/JournalNode/NodeManager/QuorumPeerMain

 

4. Linux Environment Preparation

Apply the following configuration on every node in the cluster.

4.1 Create a user and grant sudo privileges

// Switch to the root user

su root

// Create the founder group

groupadd founder

// Create the founder user in the founder group

useradd -g founder founder

// Set the password for the founder user

passwd founder

// Edit the sudoers file to grant the founder user sudo privileges

vi /etc/sudoers

founder   ALL=(ALL)       ALL

// Test that the privileges were granted successfully

exit

sudo ls /root

If the command lists the contents of /root (for example anaconda-ks.cfg), sudo access is working.

4.2 Set the hostname

// Switch to the root user

su root

// Set the hostname

hostnamectl set-hostname XXX  (replace XXX with the hostname)

Name the five machines hadoop-master1, hadoop-master2, hadoop-slave1, hadoop-slave2, and hadoop-slave3.

// Reboot the machine

reboot

// Check the hostname

hostname

 

4.3 Edit the hosts file

// Switch to the root user

su root

// Edit the hosts file

vi /etc/hosts

172.18.98.238   hadoop-master1

172.18.98.223   hadoop-master2

172.18.98.239   hadoop-slave1

172.18.98.240   hadoop-slave2

172.18.98.241   hadoop-slave3

// Reboot the machine

reboot
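After the reboot, a quick loop like the one below (a minimal sketch, run from any node) confirms that every hostname resolves and is reachable:

for h in hadoop-master1 hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
    # -c 1: send a single ICMP packet; -W 2: wait at most 2 seconds for a reply
    ping -c 1 -W 2 $h > /dev/null && echo "$h OK" || echo "$h UNREACHABLE"
done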

 

4.4 Disable the firewall

// Switch to the root user

su root

// Stop the firewalld service

systemctl stop firewalld.service

// Prevent firewalld from starting at boot

systemctl disable firewalld.service

4.5 Configure passwordless SSH login

Complete the preceding steps on all five machines before starting this one.

Generate the key pair as the founder user:

su founder

In /home/founder on hadoop-master1, run ssh-keygen -t rsa to generate the key pair, pressing Enter at every prompt to accept the defaults.

In /home/founder/.ssh on hadoop-master1, run the following commands:

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-master2

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave1

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave2

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave3

 

In /home/founder on hadoop-master2, run ssh-keygen -t rsa to generate the key pair.

In /home/founder/.ssh on hadoop-master2, run:

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-master1

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave1

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave2

ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave3

 

After the configuration is complete, run the following on hadoop-master1:

ssh hadoop-slave2

It should log you in to the founder account on hadoop-slave2 without asking for a password;

run exit to return.
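To check all the targets at once, a short loop like this (a minimal sketch, run as founder on hadoop-master1) should print each remote hostname without ever prompting for a password:

for h in hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
    # BatchMode=yes makes ssh fail instead of prompting, so a missing key shows up immediately
    ssh -o BatchMode=yes founder@$h hostname
done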

 

4.6 Install the JDK

// As the root user, create the installation directory

cd /opt

mkdir founder

chown -R founder:founder founder

// Copy the JDK directory to /opt/founder and configure the environment variables

vi /etc/profile  and append at the end of the file:

export JAVA_HOME=/opt/founder/jdk1.8.0_73

export PATH=$JAVA_HOME/bin:$PATH

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

Save and exit, then run:

source /etc/profile

// Make the JDK accessible to non-root users

Do this on all machines; the JDK version is 1.8.0_73.

As the root user, run: sudo chmod -R 755 /opt/founder

Then switch to the founder user and run: java -version

Without the chmod the command fails with a permission error; with it, the Java version information is printed.
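Once the JDK is in place everywhere, a quick loop (a sketch, assuming the passwordless SSH from section 4.5 and the same install path on every node) verifies that each node reports the same version:

for h in hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
    echo "== $h =="
    # java -version prints to stderr, so merge it into stdout
    ssh founder@$h "/opt/founder/jdk1.8.0_73/bin/java -version" 2>&1
done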

 

 

5 ZooKeeper Cluster Installation

ZooKeeper is an open-source distributed coordination service. Its leader/follower cluster structure effectively avoids single points of failure in distributed systems. It is commonly used for unified naming services, configuration management, lock services, and cluster management; in this deployment it is used for cluster coordination.

This cluster uses zookeeper-3.4.5-cdh5.7.6. First install ZooKeeper on the hadoop-slave1 node as follows:

// Use the founder user

su founder

// Extract the ZooKeeper archive

tar -xvf zookeeper-3.4.5-cdh5.7.6.tar.gz -C /opt/founder/

// Remove the archive

rm -rf zookeeper-3.4.5-cdh5.7.6.tar.gz

// Configure the founder user's environment variables

vi /home/founder/.bash_profile

export ZOOKEEPER_HOME=/opt/founder/zookeeper-3.4.5-cdh5.7.6

export PATH=$PATH:$ZOOKEEPER_HOME/bin

// Apply the changed environment variables

source /home/founder/.bash_profile

// Edit the ZooKeeper configuration file

cd /opt/founder/zookeeper-3.4.5-cdh5.7.6/conf/

cp zoo_sample.cfg zoo.cfg

vi zoo.cfg

# Basic time unit / heartbeat interval (milliseconds)

tickTime=2000

# Maximum time, in ticks, for followers to connect and sync to the leader (10 x 2000 ms = 20 s)

initLimit=10

# Maximum time, in ticks, a follower may lag behind the leader

syncLimit=5

# Data directory

dataDir=/opt/founder/zookeeper-3.4.5-cdh5.7.6/data

# Transaction log directory

dataLogDir=/opt/founder/zookeeper-3.4.5-cdh5.7.6/data/log

# Client port

clientPort=2181

# Cluster members and their quorum/election ports

server.1=hadoop-slave1:2888:3888

server.2=hadoop-slave2:2888:3888

server.3=hadoop-slave3:2888:3888

# The settings below are optional tuning

# Maximum number of client connections per IP; 0 means unlimited

maxClientCnxns=0

# Number of snapshots to retain

autopurge.snapRetainCount=3

# Purge interval in hours; 0 disables automatic purging

autopurge.purgeInterval=1

// Create the ZooKeeper data and log directories

cd /opt/founder/zookeeper-3.4.5-cdh5.7.6

mkdir -p data/log

// In the data directory, create a file named myid containing the value 1 (it must match the N of this host's server.N entry)

echo "1" >> data/myid

// Change the ZooKeeper log output path (if ZOO_LOG_DIR and ZOO_LOG4J_PROP already exist in zkEnv.sh, delete them first, then add the following)

vi libexec/zkEnv.sh

if [ "x${ZOO_LOG_DIR}" = "x" ]

then

   ZOO_LOG_DIR="$ZOOKEEPER_HOME/logs"

fi

if [ "x${ZOO_LOG4J_PROP}" = "x" ]

then

   ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

fi

// Edit the ZooKeeper log configuration file

vi conf/log4j.properties

zookeeper.root.logger=INFO,ROLLINGFILE

// Create the log directory

mkdir logs

Synchronize the ZooKeeper directory from hadoop-slave1 to hadoop-slave2 and hadoop-slave3, then change the myid file on each of those nodes. Do not forget to set the user environment variables there as well.

// On hadoop-slave1, copy the ZooKeeper directory to the other nodes

scp -r /opt/founder/zookeeper-3.4.5-cdh5.7.6 hadoop-slave2:/opt/founder

scp -r /opt/founder/zookeeper-3.4.5-cdh5.7.6 hadoop-slave3:/opt/founder

// On hadoop-slave2, change the myid file in the data directory

echo "2" > /opt/founder/zookeeper-3.4.5-cdh5.7.6/data/myid

// On hadoop-slave3, change the myid file in the data directory

echo "3" > /opt/founder/zookeeper-3.4.5-cdh5.7.6/data/myid

Finally, start ZooKeeper on each node where it was installed and check the node status.

Common ZooKeeper commands (run as needed):

// Start

zkServer.sh start

// Check status

zkServer.sh status

// Stop

zkServer.sh stop
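After starting all three servers, one node should report "leader" and the other two "follower". A small loop like the one below (a sketch, run as founder from hadoop-master1, assuming the install path above) checks them in one pass; the full path and explicit JAVA_HOME are used because a non-interactive ssh session does not source .bash_profile or /etc/profile:

for h in hadoop-slave1 hadoop-slave2 hadoop-slave3; do
    echo "== $h =="
    ssh founder@$h "JAVA_HOME=/opt/founder/jdk1.8.0_73 /opt/founder/zookeeper-3.4.5-cdh5.7.6/bin/zkServer.sh status"
done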

 

6 Hadoop HA Configuration

// Use the founder user

su founder

// On the hadoop-master1 node, extract the Hadoop archive

tar -xvf hadoop-2.9.0.tar.gz -C /opt/founder

// Remove the archive

rm hadoop-2.9.0.tar.gz

6.1 Configure hadoop-env.sh

cd /opt/founder/hadoop-2.9.0/etc/hadoop

vi hadoop-env.sh

export JAVA_HOME=/opt/founder/jdk1.8.0_73

 

6.2 Configure core-site.xml

vi core-site.xml

Replace the existing <configuration></configuration> element in the file with the following:

<configuration>

  <property>

    <name>fs.defaultFS</name>

    <value>hdfs://mycluster</value>

  </property>

  <property>

       <name>ipc.client.connect.max.retries</name>

       <value>100</value>

 </property>

 <property>

  <name>ipc.client.connect.retry.interval</name>

  <value>10000</value>

 </property>

  <property>

    <name>hadoop.tmp.dir</name>

    <value>/opt/founder/hadoop-2.9.0/data/tmp</value>

 </property>

  <property>

    <name>fs.trash.interval</name>

    <value>1440</value>

  </property>

  <property>

    <name>ha.zookeeper.quorum</name>

    <value>hadoop-slave1:2181,hadoop-slave2:2181,hadoop-slave3:2181</value>

  </property>

</configuration>
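Once the Hadoop binaries are on the PATH (set up in section 6.8), the effective values can be sanity-checked without starting any daemons, for example:

hdfs getconf -confKey fs.defaultFS          # should print hdfs://mycluster
hdfs getconf -confKey ha.zookeeper.quorum   # should list the three slave nodes on port 2181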

6.3 Configure hdfs-site.xml

vi hdfs-site.xml

Replace the existing <configuration></configuration> element in the file with the following:

<configuration>

  <property>

    <name>dfs.namenode.name.dir</name>

    <value>file:/opt/founder/hadoop-2.9.0/data/namenode</value>

  </property>

  <property>

    <name>dfs.datanode.data.dir</name>

    <value>file:/opt/founder/hadoop-2.9.0/data/datanode</value>

  </property>

  <property>

    <name>dfs.replication</name>

    <value>3</value>

  </property>

  <property>

    <name>dfs.permissions.enabled</name>

    <value>false</value>

  </property>

  <property>

    <name>dfs.webhdfs.enabled</name>

    <value>true</value>

  </property>

  <property>

    <name>dfs.nameservices</name>

    <value>mycluster</value>

  </property>

<property>

    <name>dfs.ha.namenodes.mycluster</name>

    <value>nn1,nn2</value>

  </property>

  <property>

    <name>dfs.namenode.rpc-address.mycluster.nn1</name>

    <value>hadoop-master1:8020</value>

  </property>

  <property>

    <name>dfs.namenode.rpc-address.mycluster.nn2</name>

    <value>hadoop-master2:8020</value>

  </property>

  <property>

    <name>dfs.namenode.http-address.mycluster.nn1</name>

    <value>hadoop-master1:50070</value>

  </property>

  <property>

    <name>dfs.namenode.http-address.mycluster.nn2</name>

    <value>hadoop-master2:50070</value>

  </property>

  <property>

    <name>dfs.namenode.shared.edits.dir</name>

    <value>qjournal://hadoop-slave1:8485;hadoop-slave2:8485;hadoop-slave3:8485/mycluster</value>

  </property>

  <property>

    <name>dfs.journalnode.edits.dir</name>

    <value>/opt/founder/hadoop-2.9.0/data/journal</value>

  </property>

<property>

    <name>dfs.client.failover.proxy.provider.mycluster</name>

    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

  </property>

  <property>

    <name>dfs.ha.fencing.methods</name>

    <value>sshfence</value>

  </property>

  <property>

    <name>dfs.ha.fencing.ssh.private-key-files</name>

    <value>/home/founder/.ssh/id_rsa</value>

  </property>

  <property>

    <name>dfs.ha.automatic-failover.enabled</name>

    <value>true</value>

  </property>

</configuration>

6.4 配置mapred-site.xml文件

cp/opt/founder/hadoop-2.9.0/etc/hadoop/mapred-site.xml.template/opt/founder/hadoop-2.9.0/etc/hadoop/mapred-site.xml

 

// 编辑

vi mapred-site.xml

 

删除文件中原有<configuration></configuration>节点,添加如下内容:

<configuration>

  <property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value>

  </property>

  <property>

    <name>mapreduce.jobhistory.address</name>

    <value>hadoop-master1:10020</value>

  </property>

  <property>

    <name>mapreduce.jobhistory.webapp.address</name>

    <value>hadoop-master1:19888</value>

  </property>

  <property>

    <name>mapreduce.job.ubertask.enable</name>

    <value>true</value>

  </property>

  <property>

    <name>mapreduce.job.ubertask.maxmaps</name>

    <value>9</value>

  </property>

  <property>

    <name>mapreduce.job.ubertask.maxreduces</name>

    <value>1</value>

  </property>

</configuration>

6.5 Configure yarn-site.xml

vi yarn-site.xml

Replace the existing <configuration></configuration> element in the file with the following:

<configuration>

  <property>

   <name>yarn.nodemanager.aux-services</name>

   <value>mapreduce_shuffle</value>

  </property>

  <property>

   <name>yarn.web-proxy.address</name>

    <value>hadoop-master2:8888</value>

  </property>

  <property>

   <name>yarn.log-aggregation-enable</name>

   <value>true</value>

  </property>

  <property>

   <name>yarn.log-aggregation.retain-seconds</name>

   <value>604800</value>

  </property>

  <property>

   <name>yarn.nodemanager.remote-app-log-dir</name>

   <value>/logs</value>

  </property>

  <property>

   <name>yarn.nodemanager.resource.memory-mb</name>

   <value>2048</value>

  </property>

  <property>

   <name>yarn.nodemanager.resource.cpu-vcores</name>

   <value>2</value>

  </property>

  <property>

   <name>yarn.resourcemanager.ha.enabled</name>

   <value>true</value>

  </property>

  <property>

   <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>

   <value>true</value>

  </property>

  <property>

   <name>yarn.resourcemanager.cluster-id</name>

   <value>yarncluster</value>

  </property>

  <property>

   <name>yarn.resourcemanager.ha.rm-ids</name>

   <value>rm1,rm2</value>

  </property>

  <property>

   <name>yarn.resourcemanager.hostname.rm1</name>

   <value>hadoop-master1</value>

  </property>

  <property>

   <name>yarn.resourcemanager.hostname.rm2</name>

   <value>hadoop-master2</value>

  </property>

  <property>

   <name>yarn.resourcemanager.webapp.address.rm1</name>

   <value>hadoop-master1:8088</value>

  </property>

  <property>

   <name>yarn.resourcemanager.webapp.address.rm2</name>

   <value>hadoop-master2:8088</value>

  </property>

  <property>

   <name>yarn.resourcemanager.zk-address</name>

    <value>hadoop-slave1:2181,hadoop-slave2:2181,hadoop-slave3:2181</value>

  </property>

  <property>

   <name>yarn.resourcemanager.zk-state-store.parent-path</name>

   <value>/rmstore</value>

  </property>

  <property>

   <name>yarn.resourcemanager.recovery.enabled</name>

   <value>true</value>

  </property>

  <property>

   <name>yarn.resourcemanager.store.class</name>

   <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>

  </property>

  <property>

   <name>yarn.nodemanager.recovery.enabled</name>

   <value>true</value>

  </property>

  <property>

   <name>yarn.nodemanager.address</name>

   <value>0.0.0.0:45454</value>

  </property>

</configuration>

6.6 Configure the slaves file

vi slaves

Remove the existing contents of the file and add the following:

hadoop-slave1

hadoop-slave2

hadoop-slave3

 

6.7 Create the directories referenced by the configuration files

cd /opt/founder/hadoop-2.9.0

mkdir -p data/tmp

mkdir -p data/journal

mkdir -p data/namenode

mkdir -p data/datanode

 

 

 

6.8 Synchronize the Hadoop configuration to the other cluster nodes

scp -r /opt/founder/hadoop-2.9.0/ hadoop-master2:/opt/founder/

scp -r /opt/founder/hadoop-2.9.0/ hadoop-slave1:/opt/founder/

scp -r /opt/founder/hadoop-2.9.0/ hadoop-slave2:/opt/founder/

scp -r /opt/founder/hadoop-2.9.0/ hadoop-slave3:/opt/founder/

// On every cluster node, edit the user environment variables

vi /home/founder/.bash_profile

export HADOOP_HOME=/opt/founder/hadoop-2.9.0/

export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

// Apply the changed environment variables

source ~/.bash_profile
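Rather than editing .bash_profile on each machine by hand, a loop like this (a sketch, run as founder on hadoop-master1, assuming the passwordless SSH from section 4.5) appends the same lines everywhere; the quoted here-document keeps the $HADOOP_HOME references from being expanded locally:

for h in hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
    ssh founder@$h 'cat >> ~/.bash_profile' <<'EOF'
export HADOOP_HOME=/opt/founder/hadoop-2.9.0/
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
done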

 

7 Hadoop Cluster Initialization

// Use the founder user

su founder

// Start the ZooKeeper cluster (run on slave1, slave2, and slave3)

zkServer.sh start

// Format ZKFC (run on master1)

hdfs zkfc -formatZK

// Start the JournalNodes (run on slave1, slave2, and slave3)

hadoop-daemon.sh start journalnode

// Format HDFS (run on master1)

hdfs namenode -format

// Copy the formatted NameNode metadata directory from master1's Hadoop working directory to master2

scp -r /opt/founder/hadoop-2.9.0/data/namenode/* hadoop-master2:/opt/founder/hadoop-2.9.0/data/namenode/

// After initialization the JournalNodes can be stopped (run on slave1, slave2, and slave3)

hadoop-daemon.sh stop journalnode
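To confirm that the formatZK step created the HA znode, you can query one of the ZooKeeper servers (a sketch; zkCli.sh in this build can usually execute a single command passed on the command line, otherwise run it interactively and type the ls command at the prompt):

zkCli.sh -server hadoop-slave1:2181 ls /hadoop-ha   # should list: [mycluster]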

 

8 Hadoop Cluster Startup

8.1 Cluster startup steps

// Start the ZooKeeper cluster (run on slave1, slave2, and slave3)

zkServer.sh start

// Start HDFS (run on master1)

start-dfs.sh

Note: this command starts the NameNode and ZKFC on master1 and master2, and the DataNode and JournalNode on slave1, slave2, and slave3.

// Start YARN (run on master2)

start-yarn.sh

Note: this command starts the ResourceManager on master2 and the NodeManager on slave1, slave2, and slave3.

 

// Start the second ResourceManager for failover (run on master1)

yarn-daemon.sh start resourcemanager

// Start the YARN web proxy server (run on master2)

yarn-daemon.sh start proxyserver

Note: the proxy server acts as a kind of firewall and improves the security of access to the cluster.

// Start the YARN job history service (run on master1)

mr-jobhistory-daemon.sh start historyserver
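With everything up, the HA state of both NameNodes and ResourceManagers can be queried, and a jps loop confirms the expected daemons on every node (a sketch, run as founder on hadoop-master1):

hdfs haadmin -getServiceState nn1    # prints active or standby
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2

jps                                  # local daemons on hadoop-master1
for h in hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
    echo "== $h =="
    # full jps path because a non-interactive ssh session does not source .bash_profile
    ssh founder@$h /opt/founder/jdk1.8.0_73/bin/jps
done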

 

 

8.2 Cluster startup verification

hadoop-master1 runs NameNode, ResourceManager, JobHistoryServer, and ZKFC (the jps screenshots from the original post are not reproduced here).

hadoop-master2 runs NameNode, ResourceManager, WebProxyServer, and ZKFC.

hadoop-slave1, hadoop-slave2, and hadoop-slave3 each run DataNode, JournalNode, NodeManager, and the ZooKeeper QuorumPeerMain process.
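A simple way to exercise the automatic failover (a sketch; adjust nn1/nn2 to whichever NameNode is currently active) is to stop the active NameNode and watch the standby take over:

hdfs haadmin -getServiceState nn1    # suppose this prints: active
hadoop-daemon.sh stop namenode       # run on hadoop-master1 (the active NameNode)
hdfs haadmin -getServiceState nn2    # after a few seconds this should print: active
hadoop-daemon.sh start namenode      # restart the stopped NameNode; it rejoins as standby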


