Hadoop Platform Setup
Role assignment:
Hostname | Running processes |
---|---|
master (VM 1) | NameNode DFSZKFailoverController ResourceManager JobHistoryServer JournalNode DataNode NodeManager QuorumPeerMain Jps |
slave1 (VM 2) | NameNode DFSZKFailoverController ResourceManager JournalNode DataNode NodeManager QuorumPeerMain Jps |
slave2 (VM 3) | JournalNode DataNode NodeManager QuorumPeerMain Jps |
Setting up Hadoop HA
Disable the firewall
If it is left running, ZooKeeper will fail to connect between nodes.
systemctl stop firewalld      # stop it for the current session only
systemctl disable firewalld   # keep it off across reboots
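To double-check that the firewall really is off (a quick sanity check, assuming firewalld is the only firewall service in use):
systemctl is-active firewalld    # should print "inactive"
systemctl is-enabled firewalld   # should print "disabled"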
Change the hostnames:
Check the current hostname: hostname
Set a new hostname: hostnamectl set-hostname master   (replace master with the name you want)
(do the same on slave1 and slave2; concrete commands below)
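Run the matching command on each VM, using the same hostnames as in the table above:
hostnamectl set-hostname master   # on the 1st VM
hostnamectl set-hostname slave1   # on the 2nd VM
hostnamectl set-hostname slave2   # on the 3rd VM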
Configure passwordless SSH:
First configure the host mappings: vim /etc/hosts
Add each node's IP address and hostname:
192.168.253.138 master
192.168.253.139 slave1
192.168.253.140 slave2
Generate a key pair:
ssh-keygen   # press Enter three times to accept the defaults
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2
Test:
ssh master
(then from master to slave1 and slave2)
Remember to log out with exit after each test
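A quick way to exercise every hop at once (a small sketch; run it on each node in turn):
for h in master slave1 slave2; do ssh $h hostname; done
# should print the three hostnames without ever asking for a password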
Xshell (optional; it just makes working with several sessions at once easier)
Install the JDK
Download a JDK (pick one that suits your setup): wget https://dl.cactifans.com/jdk/jdk-8u101-linux-x64.tar.gz
Create a directory to hold the unpacked software: mkdir /apps
Extract the downloaded archive into it: tar -zxvf jdk-8u101-linux-x64.tar.gz -C /apps
Rename the directory: mv /apps/jdk1.8.0_101 /apps/java
(Optional, but the shorter name makes the later steps easier.)
Configure environment variables: vim /etc/profile
Append the following to /etc/profile:
export JAVA_HOME=/apps/java   # JDK install location
export PATH=$JAVA_HOME/bin:$PATH
Reload the environment: source /etc/profile
Test Java:
java -version
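If java -version does not report 1.8.0_101, checking the variables usually finds the cause (paths as set above):
echo $JAVA_HOME   # expect /apps/java
which java        # expect /apps/java/bin/java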
Use scp to copy the files from master to slave1 and slave2:
scp -r /apps/java slave1:/apps
scp -r /apps/java slave2:/apps
scp /etc/profile slave1:/etc
scp /etc/profile slave2:/etc
Then reload the environment on each slave: source /etc/profile
Install ZooKeeper
(No download link for this one; I only have the tarball.)
Extract the downloaded package into /apps: tar -zxvf zookeeper-3.4.14.tar.gz -C /apps
Rename the directory: mv /apps/zookeeper-3.4.14 /apps/zookeeper
Add environment variables: vim /etc/profile
export ZOOKEEPER_HOME=/apps/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH
Reload the environment: source /etc/profile
ZooKeeper configuration:
cp /apps/zookeeper/conf/zoo_sample.cfg /apps/zookeeper/conf/zoo.cfg
# use your own ZooKeeper conf directory
vim /apps/zookeeper/conf/zoo.cfg
Edit the file:
Set dataDir=/apps/zookeeper/tmp (ZooKeeper looks for the myid file in dataDir), then append at the end of the file:
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
server. is the fixed prefix marking one machine in the ZooKeeper ensemble;
1/2/3… is the ID of the corresponding machine in the ensemble;
master/slave1/slave2… is the hostname of the corresponding node;
2888 is the port followers use to connect to the leader for data sync, and 3888 is the port used for leader election.
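For reference, a minimal zoo.cfg after these edits might look like this (the timing values are the zoo_sample.cfg defaults):
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/apps/zookeeper/tmp
clientPort=2181
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888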
Under /apps/zookeeper/, create a tmp directory; in it create an empty file myid and write the machine's ID into it:
mkdir /apps/zookeeper/tmp
vim /apps/zookeeper/tmp/myid
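Instead of vim, echo works just as well; the ID must match the server.N lines above:
echo 1 > /apps/zookeeper/tmp/myid   # on master; slave1 gets 2 and slave2 gets 3 (fix these after the scp below)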
Use scp to copy the ZooKeeper files from master to slave1 and slave2:
scp -r /apps/zookeeper slave1:/apps
scp -r /apps/zookeeper slave2:/apps
Edit the myid on slave1 and slave2 (set it to 2 and 3 respectively)
scp /etc/profile slave1:/etc
scp /etc/profile slave2:/etc
Reload the environment on each node: source /etc/profile
Starting ZooKeeper:
zkServer.sh start   (run on every node)
Check the status:
zkServer.sh status
You should see two followers and one leader.
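To check all three nodes from master in one shot (full script path because a non-interactive ssh shell does not source /etc/profile):
for h in master slave1 slave2; do ssh $h /apps/zookeeper/bin/zkServer.sh status; done
# expect "Mode: follower" on two nodes and "Mode: leader" on one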
Install Hadoop
Extract the downloaded package into /apps: tar -zxvf hadoop-2.7.5.tar.gz -C /apps
Rename the directory: mv /apps/hadoop-2.7.5 /apps/hadoop
Add environment variables: vim /etc/profile
export HADOOP_HOME=/apps/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Remember to reload the environment: source /etc/profile
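As with the JDK, a one-line check confirms the install and the PATH:
hadoop version   # the first line should report Hadoop 2.7.5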
Create a directory for temporary data:
mkdir -p /apps/hadoop-repo/tmp
Switch to the configuration directory: cd /apps/hadoop/etc/hadoop
(that is, etc/hadoop under your own Hadoop installation)
Edit hadoop-env.sh and set:
export JAVA_HOME=/apps/java
mapred-site.xml (copy it from the template first: cp mapred-site.xml.template mapred-site.xml; the properties here and in the files below go inside each file's <configuration> element)
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/history</value>
</property>
<property>
<name>mapreduce.map.log.level</name>
<value>INFO</value>
</property>
<property>
<name>mapreduce.reduce.log.level</name>
<value>INFO</value>
</property>
yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2,rm3</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>slave1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm3</name>
<value>slave2</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
hdfs-site.xml
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>master:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>master:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>slave1:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>slave1:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master:8485;slave1:8485;slave2:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/apps/hadoop-repo/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/apps/hadoop-repo/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
slaves (one host per line; these run the DataNode and NodeManager daemons)
master
slave1
slave2
Copy Hadoop to slave1 and slave2:
scp -r /apps/hadoop slave1:/apps
scp -r /apps/hadoop slave2:/apps
scp /etc/profile slave1:/etc
scp /etc/profile slave2:/etc
Reload the environment on each node: source /etc/profile
Initializing Hadoop HA
Switch to the script directory: cd /apps/hadoop/sbin
and run the following commands there.
Start the JournalNodes (master, slave1, slave2):
hadoop-daemons.sh start journalnode
Check the result with jps
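jps prints one line per running JVM as "PID ClassName", so each node should now show something like this (PIDs will differ):
2345 QuorumPeerMain
3102 JournalNode
3177 Jps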
Starting the history server:
cd /apps/hadoop/sbin
Run:
./mr-jobhistory-daemon.sh start historyserver
Format the NameNode: hadoop namenode -format
Formatting generates a dfs directory under the hadoop.tmp.dir set in core-site.xml; copy it to slave1 (the second NameNode): cd /apps/hadoop-repo/tmp, then scp -r dfs slave1:/apps/hadoop-repo/tmp
Format the HA state in ZooKeeper: hdfs zkfc -formatZK
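To verify the formatZK step, you can look for the HA znode in ZooKeeper (its name follows dfs.nameservices, ns1 here):
zkCli.sh -server master:2181
ls /hadoop-ha   # should list [ns1]
quit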
Starting/stopping Hadoop HA
Startup sequence:
Start ZooKeeper (all nodes): zkServer.sh start
Start HDFS: start-dfs.sh
Start YARN: start-yarn.sh
Shutdown sequence:
Stop YARN: stop-yarn.sh
Stop HDFS: stop-dfs.sh
Stop ZooKeeper: zkServer.sh stop
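Once everything is up, the failover controllers elect one active and one standby NameNode; this can be confirmed with the built-in admin tools (nn1/nn2 and rm1/rm2 are the IDs defined in hdfs-site.xml and yarn-site.xml above):
hdfs haadmin -getServiceState nn1   # prints active or standby
hdfs haadmin -getServiceState nn2   # prints the other state
yarn rmadmin -getServiceState rm1   # same check for a ResourceManager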