This article references:
http://blog.csdn.net/carl810224/article/details/52160418
http://blog.csdn.net/Dr_Guo/article/details/50975851
1. Prerequisite Files
Operating system: CentOS Linux release 7.0.1406
JDK: Java(TM) SE Runtime Environment (build 1.8.0_73-b02)
Hadoop: hadoop-2.9.0.tar.gz
ZooKeeper: zookeeper-3.4.5-cdh5.7.6.tar.gz
2. Cluster Architecture
3. Server List
Hostname | OS | IP Address | Installed Software | JPS Processes
hadoop-master1 | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.238 | JDK/Hadoop | NameNode/ZKFC/ResourceManager/JobHistoryServer
hadoop-master2 | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.223 | JDK/Hadoop | NameNode/ZKFC/ResourceManager/WebProxyServer
hadoop-slave1 | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.239 | JDK/Hadoop/ZooKeeper | DataNode/JournalNode/NodeManager/QuorumPeerMain
hadoop-slave2 | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.240 | JDK/Hadoop/ZooKeeper | DataNode/JournalNode/NodeManager/QuorumPeerMain
hadoop-slave3 | Red Hat Enterprise Linux Server release 7.2 (Maipo) | 172.18.98.241 | JDK/Hadoop/ZooKeeper | DataNode/JournalNode/NodeManager/QuorumPeerMain
4. Linux Environment Preparation
Apply the following configuration changes on every node in the cluster.
4.1 Create a user and grant sudo privileges
// Switch to the root user
su root
// Create the founder group
groupadd founder
// Create the founder user in the founder group
useradd -g founder founder
// Set the password for the founder user
passwd founder
// Edit the sudoers file to grant the founder user sudo privileges
vi /etc/sudoers
founder ALL=(ALL) ALL
// Test that the privileges were granted successfully
exit
sudo ls /root
If the command lists the contents of /root (for example anaconda-ks.cfg) without an error, the sudo privileges were granted successfully.
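As an additional check (a minimal sketch; these are standard sudo options on CentOS/RHEL 7), you can list the sudo rules that apply to the founder account:
// As the founder user, list the sudo privileges of the current account
sudo -l
// Or, as root, list the privileges of a named user
sudo -l -U founder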
4.2 Change the hostname
// Switch to the root user
su root
// Change the hostname
hostnamectl set-hostname XXX (where XXX is the hostname)
The five hosts are named hadoop-master1, hadoop-master2, hadoop-slave1, hadoop-slave2, and hadoop-slave3 respectively.
// Reboot the machine
reboot
// Check the hostname
hostname
4.3 Edit the hosts file
// Switch to the root user
su root
// Edit the hosts file and add the following entries
vi /etc/hosts
172.18.98.238 hadoop-master1
172.18.98.223 hadoop-master2
172.18.98.239 hadoop-slave1
172.18.98.240 hadoop-slave2
172.18.98.241 hadoop-slave3
// Reboot the machine
reboot
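Since all five nodes need the same host mappings, the file can be pushed from hadoop-master1 to the other machines in one loop (a hedged sketch; it assumes you run it as root and are willing to type each node's root password, since root key-based SSH has not been configured):
// Copy /etc/hosts from hadoop-master1 to the remaining nodes
for host in hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
  scp /etc/hosts root@${host}:/etc/hosts
done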
4.4 Disable the firewall
// Switch to the root user
su root
// Stop the firewalld service
systemctl stop firewalld.service
// Prevent firewalld from starting at boot
systemctl disable firewalld.service
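To confirm the firewall is stopped and will not come back after a reboot (standard firewalld/systemd commands):
// Should report "not running"
firewall-cmd --state
// Should report "disabled"
systemctl is-enabled firewalld.service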
4.5 Configure passwordless SSH login
(Complete the previous steps on all five machines before continuing.)
Generate the key pair as the founder user:
su founder
On hadoop-master1, run ssh-keygen -t rsa under /home/founder to generate the key pair, pressing Enter at every prompt. On success the output looks like the screenshot below:
Under /home/founder/.ssh on hadoop-master1, run the following commands:
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-master2
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave2
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave3
On hadoop-master2, run ssh-keygen -t rsa under /home/founder to generate a key pair.
Under /home/founder/.ssh on hadoop-master2, run:
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-master1
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave2
ssh-copy-id -i ~/.ssh/id_rsa.pub founder@hadoop-slave3
After the configuration is complete, run the following on hadoop-master1:
ssh hadoop-slave2
You should be logged in to the founder account on hadoop-slave2 without a password prompt; run exit to return.
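To quickly confirm passwordless login from hadoop-master1 to every other node, a small loop can print the remote hostname over SSH (a minimal sketch; each line should complete without a password prompt):
// Run on hadoop-master1 as the founder user
for host in hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
  ssh founder@${host} hostname
done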
4.6 Install the JDK
// As root, create the installation directory
cd /opt
mkdir founder
chown -R founder:founder founder
// Copy the JDK directory to /opt/founder and configure the environment variables
vi /etc/profile and append the following at the end of the file:
export JAVA_HOME=/opt/founder/jdk1.8.0_73
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Save and exit, then apply the changes:
source /etc/profile
// Make the JDK accessible to non-root users (on all machines; JDK version 1.8.0_73)
As root, run: sudo chmod -R 755 /opt/founder
Then switch to the founder user and run: java -version
The screenshots in the original article show java -version failing before the permission change and succeeding after it.
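The JDK and the /etc/profile additions are needed on all five machines. One way to distribute the JDK from hadoop-master1 is shown below (a hedged sketch; it assumes /opt/founder already exists and is owned by founder on every node, and that the passwordless SSH from section 4.5 is in place):
// Run on hadoop-master1 as the founder user
for host in hadoop-master2 hadoop-slave1 hadoop-slave2 hadoop-slave3; do
  scp -r /opt/founder/jdk1.8.0_73 founder@${host}:/opt/founder/
done
// /etc/profile must still be edited (as root) and sourced on each node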
5. ZooKeeper Cluster Installation
ZooKeeper is an open-source distributed coordination service. Its Leader-Follower cluster structure elegantly solves the distributed single-point-of-failure problem. It is mainly used for scenarios such as unified naming services, configuration management, lock services, and cluster management; big-data applications primarily use its cluster-management capability.
This cluster uses zookeeper-3.4.5-cdh5.7.6. First install ZooKeeper on the hadoop-slave1 node, as follows:
// Use the founder user
su founder
// Extract the ZooKeeper archive
tar -xvf zookeeper-3.4.5-cdh5.7.6.tar.gz -C /opt/founder/
// Remove the archive
rm -rf zookeeper-3.4.5-cdh5.7.6.tar.gz
// Configure the founder user's environment variables
vi /home/founder/.bash_profile
export ZOOKEEPER_HOME=/opt/founder/zookeeper-3.4.5-cdh5.7.6
export PATH=$PATH:$ZOOKEEPER_HOME/bin
// Apply the updated environment variables
source /home/founder/.bash_profile
// Edit the ZooKeeper configuration file
cd /opt/founder/zookeeper-3.4.5-cdh5.7.6/conf/
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
# Basic time unit / heartbeat interval (milliseconds)
tickTime=2000
# Maximum number of ticks allowed for the initial connection and sync with the leader
initLimit=10
# Maximum number of ticks a follower may fall behind the leader
syncLimit=5
# Data directory
dataDir=/opt/founder/zookeeper-3.4.5-cdh5.7.6/data
# Transaction log directory
dataLogDir=/opt/founder/zookeeper-3.4.5-cdh5.7.6/data/log
# Client port
clientPort=2181
# Cluster members and their quorum/election ports
server.1=hadoop-slave1:2888:3888
server.2=hadoop-slave2:2888:3888
server.3=hadoop-slave3:2888:3888
# The following are tuning settings
# Maximum number of client connections per server; 0 means unlimited
maxClientCnxns=0
# Number of snapshots to retain
autopurge.snapRetainCount=3
# Purge interval in hours; the default of 0 disables automatic purging
autopurge.purgeInterval=1
// Create the ZooKeeper data and transaction-log directories
cd /opt/founder/zookeeper-3.4.5-cdh5.7.6
mkdir -p data/log
// Create a file named myid in the data directory containing the value 1
echo "1" >> data/myid
// Change the ZooKeeper log output path (if ZOO_LOG_DIR and ZOO_LOG4J_PROP already exist in zkEnv.sh, remove them first and then add the following)
vi libexec/zkEnv.sh
if [ "x${ZOO_LOG_DIR}" = "x" ]
then
ZOO_LOG_DIR="$ZOOKEEPER_HOME/logs"
fi
if [ "x${ZOO_LOG4J_PROP}" = "x" ]
then
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
fi
// Edit the ZooKeeper log configuration file
vi conf/log4j.properties
zookeeper.root.logger=INFO,ROLLINGFILE
// Create the log directory
mkdir logs
Synchronize the ZooKeeper directory on hadoop-slave1 to hadoop-slave2 and hadoop-slave3, then update the myid data file on each of those nodes. Also remember to set the founder user's environment variables there.
// On hadoop-slave1, copy the ZooKeeper directory to the other nodes
scp -r /opt/founder/zookeeper-3.4.5-cdh5.7.6 hadoop-slave2:/opt/founder
scp -r /opt/founder/zookeeper-3.4.5-cdh5.7.6 hadoop-slave3:/opt/founder
// On hadoop-slave2, update the myid file in the data directory
echo "2" > /opt/founder/zookeeper-3.4.5-cdh5.7.6/data/myid
// On hadoop-slave3, update the myid file in the data directory
echo "3" > /opt/founder/zookeeper-3.4.5-cdh5.7.6/data/myid
Finally, start ZooKeeper on every node where it is installed and check each node's status.
The relevant ZooKeeper commands are listed below (run as needed):
// Start
zkServer.sh start
// Check status
zkServer.sh status
// Stop
zkServer.sh stop
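Once zkServer.sh start has been run on all three slave nodes, zkServer.sh status should report Mode: leader on exactly one node and Mode: follower on the other two. A quick liveness check uses ZooKeeper's built-in ruok four-letter command (a minimal sketch; it assumes nc is installed on the machine you run it from):
// Each node should answer "imok"
for host in hadoop-slave1 hadoop-slave2 hadoop-slave3; do
  echo ruok | nc ${host} 2181
done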
6. Hadoop HA Configuration
// Use the founder user
su founder
// On the hadoop-master1 node, extract the Hadoop archive
tar -xvf hadoop-2.9.0.tar.gz -C /opt/founder
// Remove the archive
rm hadoop-2.9.0.tar.gz
6.1 Configure hadoop-env.sh
cd /opt/founder/hadoop-2.9.0/etc/hadoop
vi hadoop-env.sh
export JAVA_HOME=/opt/founder/jdk1.8.0_73
6.2 Configure core-site.xml
vi core-site.xml
Replace the existing <configuration></configuration> element in the file with the following:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>ipc.client.connect.max.retries</name>
<value>100</value>
</property>
<property>
<name>ipc.client.connect.retry.interval</name>
<value>10000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/founder/hadoop-2.9.0/data/tmp</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-slave1:2181,hadoop-slave2:2181,hadoop-slave3:2181</value>
</property>
</configuration>
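Once the configuration has been distributed and $HADOOP_HOME/bin is on the PATH (see section 6.8), the effective values can be verified with the standard hdfs getconf command:
// Should print hdfs://mycluster
hdfs getconf -confKey fs.defaultFS
// Should print the ZooKeeper quorum used for HA
hdfs getconf -confKey ha.zookeeper.quorum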
6.3 Configure hdfs-site.xml
vi hdfs-site.xml
Replace the existing <configuration></configuration> element in the file with the following:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/founder/hadoop-2.9.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/founder/hadoop-2.9.0/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop-master1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop-master2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop-master1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop-master2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-slave1:8485;hadoop-slave2:8485;hadoop-slave3:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/founder/hadoop-2.9.0/data/journal</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/founder/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
</configuration>
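The sshfence fencing method requires that the node running ZKFC can SSH to the other NameNode without a password, using the key configured in dfs.ha.fencing.ssh.private-key-files; section 4.5 already set this up for the founder user. A quick check from each master (a minimal sketch):
// Run on hadoop-master1; should print "hadoop-master2" with no password prompt
ssh -i /home/founder/.ssh/id_rsa founder@hadoop-master2 hostname
// Run on hadoop-master2; should print "hadoop-master1"
ssh -i /home/founder/.ssh/id_rsa founder@hadoop-master1 hostname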
6.4 Configure mapred-site.xml
cp /opt/founder/hadoop-2.9.0/etc/hadoop/mapred-site.xml.template /opt/founder/hadoop-2.9.0/etc/hadoop/mapred-site.xml
// Edit the file
vi mapred-site.xml
Replace the existing <configuration></configuration> element in the file with the following:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-master1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-master1:19888</value>
</property>
<property>
<name>mapreduce.job.ubertask.enable</name>
<value>true</value>
</property>
<property>
<name>mapreduce.job.ubertask.maxmaps</name>
<value>9</value>
</property>
<property>
<name>mapreduce.job.ubertask.maxreduces</name>
<value>1</value>
</property>
</configuration>
6.5 Configure yarn-site.xml
vi yarn-site.xml
Replace the existing <configuration></configuration> element in the file with the following:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.web-proxy.address</name>
<value>hadoop-master2:8888</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/logs</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarncluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop-master1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop-master2</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>hadoop-master1:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>hadoop-master2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop-slave1:2181,hadoop-slave2:2181,hadoop-slave3:2181</value>
</property>
<property>
<name>yarn.resourcemanager.zk-state-store.parent-path</name>
<value>/rmstore</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.nodemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.address</name>
<value>0.0.0.0:45454</value>
</property>
</configuration>
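After YARN has been started (section 8), the HA state of the two ResourceManagers defined above (rm1 and rm2) can be checked with the standard yarn rmadmin command; one should be active and the other standby:
// Query the HA state of each ResourceManager
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2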
6.6 Configure the slaves file
vi slaves
Remove the existing contents of the file and add the following:
hadoop-slave1
hadoop-slave2
hadoop-slave3
6.7 Create the directories referenced in the configuration files
cd /opt/founder/hadoop-2.9.0
mkdir -p data/tmp
mkdir -p data/journal
mkdir -p data/namenode
mkdir -p data/datanode
6.8 Synchronize the Hadoop configuration to the other cluster nodes
scp -r /opt/founder/hadoop-2.9.0/ hadoop-master2:/opt/founder/
scp -r /opt/founder/hadoop-2.9.0/ hadoop-slave1:/opt/founder/
scp -r /opt/founder/hadoop-2.9.0/ hadoop-slave2:/opt/founder/
scp -r /opt/founder/hadoop-2.9.0/ hadoop-slave3:/opt/founder/
// On every node in the cluster, update the founder user's environment variables
vi /home/founder/.bash_profile
export HADOOP_HOME=/opt/founder/hadoop-2.9.0/
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
// Apply the updated environment variables
source /home/founder/.bash_profile
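A quick way to confirm the environment variables took effect on each node:
// Should print the Hadoop version (2.9.0) on every node
hadoop version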
7. Hadoop Cluster Initialization
// Use the founder user
su founder
// Start the ZooKeeper cluster (run on slave1, slave2 and slave3)
zkServer.sh start
// Format ZKFC (run on master1)
hdfs zkfc -formatZK
// Start the JournalNodes (run on slave1, slave2 and slave3)
hadoop-daemon.sh start journalnode
// Format HDFS (run on master1)
hdfs namenode -format
// After formatting, copy the NameNode metadata directory from master1's Hadoop working directory to master2
scp -r /opt/founder/hadoop-2.9.0/data/namenode/* hadoop-master2:/opt/founder/hadoop-2.9.0/data/namenode/
// After initialization the JournalNodes can be stopped (run on slave1, slave2 and slave3)
hadoop-daemon.sh stop journalnode
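To confirm that hdfs zkfc -formatZK created the HA znode in ZooKeeper, the zkCli.sh client shipped with ZooKeeper can be used (a minimal sketch):
// Connect to the ZooKeeper ensemble from any node
zkCli.sh -server hadoop-slave1:2181
// At the zk prompt, list the HA parent znode; it should contain a child named "mycluster"
ls /hadoop-ha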
8. Starting the Hadoop Cluster
8.1 Cluster startup steps
// Start the ZooKeeper cluster (run on slave1, slave2 and slave3)
zkServer.sh start
// Start HDFS (run on master1)
start-dfs.sh
Note: this command starts the NameNode and ZKFC on master1/master2, and the DataNode and JournalNode on slave1/slave2/slave3, as shown in the screenshot below.
// Start YARN (run on master2)
start-yarn.sh
Note: this command starts the ResourceManager on master2 and the NodeManager on slave1/slave2/slave3.
// Start the second YARN ResourceManager (run on master1, for failover)
yarn-daemon.sh start resourcemanager
// Start the YARN web proxy server (run on master2)
yarn-daemon.sh start proxyserver
Note: the proxy server acts as a kind of firewall and improves the security of access to the cluster.
// Start the YARN job history service (run on master1)
mr-jobhistory-daemon.sh start historyserver
8.2 Cluster startup screenshots
hadoop-master1 runs NameNode, ResourceManager, JobHistoryServer and ZKFC, as shown in the screenshot below:
hadoop-master2 runs NameNode, ResourceManager, ProxyServer and ZKFC, as shown in the screenshot below:
hadoop-slave1, hadoop-slave2 and hadoop-slave3 each run DataNode, JournalNode, NodeManager and ZooKeeper (QuorumPeerMain), as shown in the screenshot below:
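As a final sanity check, the HA state of the two NameNodes can be queried and a small test job submitted (a hedged sketch using standard Hadoop commands; the example jar path assumes the stock hadoop-2.9.0 distribution layout):
// One NameNode should report active and the other standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
// Print an overall HDFS report (live DataNodes, capacity, etc.)
hdfs dfsadmin -report
// Submit a small test job; when it finishes it should appear in the JobHistoryServer UI at http://hadoop-master1:19888
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar pi 2 10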