I. Installing the Hadoop environment
[1] Create the hadoop user and switch to it
[root@server1 ~]# useradd hadoop
[root@server1 ~]# id hadoop
uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)
[root@server1 ~]# su - hadoop
[2] Download Hadoop and the JDK, then unpack them
Note: both archives are placed in the hadoop user's home directory.
[hadoop@server1 ~]$ tar zxf hadoop-2.7.3.tar.gz
[hadoop@server1 ~]$ tar zxf jdk-7u79-linux-x64.tar.gz
Note: symbolic links are created here so that later configuration can use short, version-independent paths.
[hadoop@server1 ~]$ ln -s jdk1.7.0_79/ jdk
[hadoop@server1 ~]$ ln -s hadoop-2.7.3 hadoop
[3] Configure the Hadoop environment variables
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop/etc/hadoop
[hadoop@server1 hadoop]$ vim hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/home/hadoop/jdk
[hadoop@server1 ~]$ cat /etc/hosts
172.25.37.1 server1
Note: this entry is required, otherwise Hadoop reports an error at run time. Use the IP address of your own Hadoop server.
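The hosts entry can be checked before Hadoop is started. The following is a minimal sketch, assuming a POSIX shell with awk; the function name has_host_entry is made up for this example and is not part of Hadoop.

```shell
# has_host_entry FILE NAME
# Succeeds if FILE (in /etc/hosts format) maps some address to NAME.
# Hypothetical helper, not part of Hadoop.
has_host_entry() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example:
# has_host_entry /etc/hosts server1 || echo "server1 is missing from /etc/hosts"
```

Only the name columns (field 2 onward) are compared, so an IP address that happens to equal the name is not counted as a match.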
[hadoop@server1 ~]$ cat .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin:/home/hadoop/jdk/bin
export PATH
[4] Run Hadoop for the first time
[hadoop@server1 hadoop]$ cd /home/hadoop/hadoop/bin/
[hadoop@server1 bin]$ ./hadoop
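Note: running the script without arguments only prints its usage text. It serves as a quick check that the installation works and that no error is reported.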
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ mkdir input
[hadoop@server1 hadoop]$ cp etc/hadoop/*.xml input/
Run the MapReduce demo that ships with Hadoop
Note: MapReduce is a programming model for parallel computation over large data sets (larger than 1 TB).
[hadoop@server1 hadoop]$ bin/hadoop jar \
share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar \
grep input output 'dfs[a-z.]+'
View the output files
[hadoop@server1 hadoop]$ cat output/*
1 dfsadmin
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server1 hadoop]$ ls
bin etc include input lib libexec LICENSE.txt NOTICE.txt output README.txt sbin share
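What the demo grep job computes can be reproduced locally with ordinary shell tools, which also shows the map, shuffle and reduce phases in miniature. This is a sketch under assumptions: the function name count_matches is invented here, it needs a grep that supports -o and -E, and it only mimics the result of the job, not how Hadoop distributes the work.

```shell
# count_matches DIR REGEX
# Local stand-in for the example "grep" job:
#   grep -oE  extracts every match of REGEX from DIR/*.xml   ("map")
#   sort      brings equal matches next to each other        ("shuffle")
#   uniq -c   counts each group of equal matches             ("reduce")
# The final sort lists the most frequent matches first.
count_matches() {
    cat "$1"/*.xml | grep -oE "$2" | sort | uniq -c | sort -rn
}

# Example (from /home/hadoop/hadoop, while the local input directory exists):
# count_matches input 'dfs[a-z.]+'
```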
II. Building a pseudo-distributed setup
[1] Configure core-site.xml
[hadoop@server1 ~]$ cd hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ vim core-site.xml
<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://172.25.37.1:9000</value>
</property>
</configuration>
Note: fs.defaultFS sets the address of HDFS.
[2] Configure hdfs-site.xml
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Note: dfs.replication sets how many replicas HDFS keeps of each block. The pseudo-distributed setup has only one node, so it is set to 1.
[3] Configure passwordless SSH
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
In the file below, change localhost to this machine's IP address
[hadoop@server1 hadoop]$ vim slaves
172.25.37.1
[4] Format the file system
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
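Note: formatting initializes the NameNode's metadata directory. It creates an empty namespace image (fsimage) and a VERSION file with new identifiers; it does not partition or touch the DataNodes' disks.
[5] Start HDFS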
[hadoop@server1 hadoop]$ ./sbin/start-dfs.sh
Starting namenodes on [server1]
server1: namenode running as process 2496. Stop it first.
172.25.37.1: datanode running as process 2164. Stop it first.
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is 9b:19:24:43:5d:09:3a:12:97:94:99:f4:61:dc:3d:e2.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-secondarynamenode-server1.out
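[6] Check the processes: NameNode, DataNode and SecondaryNameNode should all be running (the fourth entry is jps itself)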
[hadoop@server1 hadoop]$ jps
3151 Jps
2164 DataNode
3042 SecondaryNameNode
2496 NameNode
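Reading the jps listing by eye works for one node; a small helper can do the same comparison. This is a sketch only, assuming a POSIX shell with awk; missing_daemons is a hypothetical name, not a Hadoop tool.

```shell
# missing_daemons JPS_OUTPUT NAME...
# Prints every NAME that does not appear as a process name in JPS_OUTPUT,
# which is expected to hold lines of the form "<pid> <name>".
missing_daemons() {
    jps_output=$1
    shift
    for name in "$@"; do
        printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit found ? 0 : 1 }' ||
            printf '%s\n' "$name"
    done
}

# Example:
# missing_daemons "$(jps)" NameNode DataNode SecondaryNameNode
```

An empty result means every listed daemon is running.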
Check the port:
[hadoop@server1 hadoop]$ netstat -antlp | grep 50070
tcp 0 0 0.0.0.0:50070
[7] Test in a browser
Open: http://172.25.37.1:50070
View the DataNodes:
The same information from the command line:
[hadoop@server1 hadoop]$ ./bin/hdfs dfsadmin -report
Configured Capacity: 14309232640 (13.33 GB)
Present Capacity: 11614142464 (10.82 GB)
DFS Remaining: 11614113792 (10.82 GB)
DFS Used: 28672 (28 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 172.25.37.1:50010 (server1)
Hostname: server1
Decommission Status : Normal
Configured Capacity: 14309232640 (13.33 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 2695090176 (2.51 GB)
DFS Remaining: 11614113792 (10.82 GB)
DFS Used%: 0.00%
DFS Remaining%: 81.17%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu May 31 22:22:11 CST 2018
Create a directory for uploading files:
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -mkdir /user
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -mkdir /user/hadoop
[hadoop@server1 hadoop]$ bin/hdfs dfs -put ./input /user/hadoop/
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2018-05-31 22:24 input
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -ls input
Found 29 items
-rw-r--r-- 1 hadoop supergroup 4436 2018-05-31 22:24 input/capacity-scheduler.xml
-rw-r--r-- 1 hadoop supergroup 1335 2018-05-31 22:24 input/configuration.xsl
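(only two entries are listed here for brevity)
Delete the local input and output directories and run MapReduce again, this time against the copy in HDFS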
[hadoop@server1 hadoop]$ rm -fr input/ output
[hadoop@server1 hadoop]$ ./bin/hadoop jar \
share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount input output
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -ls
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2018-05-31 22:24 input
drwxr-xr-x - hadoop supergroup 0 2018-05-31 22:30 output
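Browse to http://172.25.37.1:50070/explorer.html to see the directories just created.
[8] Read a file stored in HDFS
From the hadoop directory, run
./bin/hdfs dfs -cat <absolute HDFS path of the file to view>
[9] Download a file from HDFS to the local machine
From the hadoop directory, run
./bin/hdfs dfs -get <HDFS path of the file or directory to download>
(More to be added.)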
III. Building a fully distributed cluster
Environment:
Physical host: rhel7.3 172.25.37.250/24, used as the time source
Server1:rhel6.5 172.25.37.1/24
Server2:rhel6.5 172.25.37.2/24
Server3:rhel6.5 172.25.37.3/24
[1] Time synchronization
On the physical host, set the upstream time source (here an IP address belonging to Baidu)
[root@random etc]# vim /etc/chrony.conf
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
Note: put your upstream time source here
server 119.75.213.61 iburst
# Allow NTP client access from local network.
Note: allow the 172.25/16 network to synchronize time from this host
allow 172.25/16
Apply the following configuration on all three virtual machines:
[root@server1 ~]# yum install -y ntp
[root@server1 ~]# vim /etc/ntp.conf
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
Note: set the time source to the physical host
server 172.25.37.250 iburst
Start ntpd
[root@server1 ~]# /etc/init.d/ntpd start
Enable it at boot
[root@server1 ~]# chkconfig ntpd on
[2] Configure passwordless SSH
[root@server1 ~]# su - hadoop
[hadoop@server1 ~]$ cd .ssh/
[hadoop@server1 .ssh]$ ls
authorized_keys id_rsa id_rsa.pub known_hosts
[hadoop@server1 .ssh]$ ssh-copy-id 172.25.37.2
[hadoop@server1 .ssh]$ ssh-copy-id 172.25.37.3
To test, run the following on server1:
[hadoop@server1 ~]$ ssh 172.25.37.2
[hadoop@server1 ~]$ ssh 172.25.37.3
Note: on the first connection to each host, type yes and press Enter. If no password prompt follows, the setup works.
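The manual test can also be scripted. The sketch below relies on the standard OpenSSH options BatchMode and ConnectTimeout; the function name is invented for this example. Because BatchMode also suppresses the host-key question, it only succeeds after the host key has been accepted once by hand as described above.

```shell
# can_ssh_without_password HOST
# BatchMode=yes makes ssh fail instead of prompting, so a zero exit status
# means that key-based login to HOST works. Hypothetical helper name.
can_ssh_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example:
# for host in 172.25.37.2 172.25.37.3; do
#     can_ssh_without_password "$host" || echo "key login to $host failed"
# done
```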
[3] Create the hadoop user on all three virtual machines; the uid and gid must be identical on every node.
[hadoop@server1 ~]$ id hadoop
uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)
[root@server2 ~]# useradd hadoop
[root@server2 ~]# id hadoop
uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)
[root@server3 ~]# useradd hadoop
[root@server3 ~]# id hadoop
uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)
[4] Share the configuration over NFS
(1) Install the software
[root@server1 ~]# yum install -y nfs-utils
(2) Configure the shared directory
[root@server1 ~]# vim /etc/exports
/home/hadoop *(rw,anonuid=500,anongid=500)
(3) Re-export the shared directory and list what is exported
[root@server1 ~]# exportfs -rv
exporting *:/home/hadoop
(4) Start the services
[root@server1 ~]# /etc/init.d/rpcbind start ## start this one first
Starting rpcbind: [ OK ]
[root@server1 ~]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]
(5) On server2 and server3, install the software and mount the shared directory
[root@server2 ~]# yum install -y nfs-utils
[root@server2 ~]# /etc/init.d/rpcbind start
Starting rpcbind: [ OK ]
[root@server3 ~]# yum install -y nfs-utils
[root@server3 ~]# /etc/init.d/rpcbind start
Starting rpcbind: [ OK ]
[root@server2 ~]# mount 172.25.37.1:/home/hadoop/ /home/hadoop/
[root@server2 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root 13973860 922768 12341256 7% /
tmpfs 510200 0 510200 0% /dev/shm
/dev/vda1 495844 33457 436787 8% /boot
172.25.37.1:/home/hadoop/ 13974016 1932544 11331584 15% /home/hadoop
[root@server3 ~]# mount 172.25.37.1:/home/hadoop/ /home/hadoop/
[root@server3 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root 13973860 922768 12341256 7% /
tmpfs 510200 0 510200 0% /dev/shm
/dev/vda1 495844 33457 436787 8% /boot
172.25.37.1:/home/hadoop/ 13974016 1932544 11331584 15% /home/hadoop
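Instead of reading the df listing, the mount can be verified from the kernel's mount table. A minimal sketch, assuming a Linux /proc/mounts file (fields: device, mount point, file system type, ...); is_nfs_mounted is a hypothetical helper name.

```shell
# is_nfs_mounted MOUNTS_FILE MOUNT_POINT
# Succeeds if MOUNTS_FILE (in /proc/mounts format) lists MOUNT_POINT with an
# NFS file system type (nfs or nfs4).
is_nfs_mounted() {
    awk -v mp="$2" '
        $2 == mp && $3 ~ /^nfs/ { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example (on server2 or server3):
# is_nfs_mounted /proc/mounts /home/hadoop || echo "/home/hadoop is not an NFS mount"
```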
[4] /etc/hosts must resolve the same names on all three virtual machines. Mine looks like this:
[hadoop@server2 hadoop]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.25.37.1 server1
172.25.37.2 server2
172.25.37.3 server3
Clear the files left over from the earlier tests
[root@server1 ~]# rm -fr /tmp/*
[root@server2 ~]# rm -fr /tmp/*
[root@server3 ~]# rm -fr /tmp/*
[root@server1 ~]# su - hadoop
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ ls
bin etc include input lib libexec LICENSE.txt logs NOTICE.txt README.txt sbin share
Note: if the HDFS services from the earlier tests are still running, stop them now.
[hadoop@server1 hadoop]$ sbin/stop-dfs.sh
[5] Configure distributed mode; the same two files need to be changed
[hadoop@server1 ~]$ cd hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://172.25.37.1:9000</value>
</property>
</configuration>
Set the replication factor to 2 here
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Configure the slaves file as follows; the DataNodes are now server2 and server3
[hadoop@server1 hadoop]$ vim slaves
172.25.37.2
172.25.37.3
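HDFS places each replica of a block on a different DataNode, so a dfs.replication value larger than the number of DataNodes leaves blocks under-replicated. The sketch below compares the two numbers; both function names are invented for this example, and the replication value is passed in by hand rather than parsed from hdfs-site.xml.

```shell
# count_slaves FILE
# Prints the number of non-empty, non-comment lines in a slaves file.
count_slaves() {
    awk '{ sub(/#.*/, "") } NF { n++ } END { print n + 0 }' "$1"
}

# replication_fits FILE REPLICATION
# Succeeds if the slaves file lists at least REPLICATION DataNodes.
replication_fits() {
    [ "$(count_slaves "$1")" -ge "$2" ]
}

# Example (from /home/hadoop/hadoop):
# replication_fits etc/hadoop/slaves 2 || echo "fewer DataNodes than replicas"
```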
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop
[6] Format and start HDFS
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Starting namenodes on [server1]
server1: namenode running as process 3463. Stop it first.
172.25.37.3: datanode running as process 1202. Stop it first.
172.25.37.2: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: secondarynamenode running as process 3710. Stop it first.
[7] Test in a browser
Two DataNodes should be listed.
Check the processes
[hadoop@server1 ~]$ jps
3463 NameNode
3710 SecondaryNameNode
4515 Jps
[root@server2 hadoop]# su - hadoop
[hadoop@server2 ~]$ jps
1255 DataNode
1415 Jps
[root@server3 ~]# su - hadoop
[hadoop@server3 ~]$ jps
1202 DataNode
1464 Jps
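[8] Scaling out and in
Adjust the replication value in hdfs-site.xml and the IP addresses in the slaves file, then have the NameNode re-read its lists of permitted and excluded DataNodes (the files named by dfs.hosts and dfs.hosts.exclude):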
[hadoop@server1 hadoop]$ ./bin/hdfs dfsadmin -refreshNodes
[9] Configure YARN
[hadoop@server1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop/etc/hadoop
[hadoop@server1 hadoop]$ vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[hadoop@server1 hadoop]$ vim yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop
Start YARN:
[hadoop@server1 hadoop]$ ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.7.3/logs/yarn-hadoop-resourcemanager-server1.out
172.25.37.2: starting nodemanager, logging to /home/hadoop/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-server2.out
172.25.37.3: starting nodemanager, logging to /home/hadoop/hadoop-2.7.3/logs/yarn-hadoop-nodemanager-server3.out
Check the processes:
[hadoop@server1 hadoop]$ jps
3129 SecondaryNameNode
3375 ResourceManager
2939 NameNode
3632 Jps
[hadoop@server2 hadoop]$ jps
1687 DataNode
1832 NodeManager
1930 Jps
[hadoop@server3 hadoop]$ jps
1806 NodeManager
1904 Jps
1648 DataNode
Note: if NodeManager does not show up on server2 or server3, start it by hand on that virtual machine:
[hadoop@server2 hadoop]$ ./sbin/yarn-daemon.sh start nodemanager
Create a Hadoop archive:
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -mkdir /user
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -mkdir /user/hadoop
[hadoop@server1 hadoop]$ mkdir input
[hadoop@server1 hadoop]$ cp etc/hadoop/*.xml input/
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -put input
[hadoop@server1 hadoop]$ ./bin/hadoop archive -archiveName test.har -p /user/hadoop/ input/* input/
18/06/01 21:23:07 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/06/01 21:23:08 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/06/01 21:23:08 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/06/01 21:23:09 INFO mapreduce.JobSubmitter: number of splits:1
Browse to http://172.25.37.1:8088/cluster
IV. Hadoop high availability with ZooKeeper
Environment:
rhel6.5, iptables stopped and SELinux disabled
Server1 :172.25.37.1/24
Server2 :172.25.37.2/24
Server3 :172.25.37.3/24
Server4 :172.25.37.4/24
Server5 :172.25.37.5/24
server1 && server5 --> HA
server2, server3 and server4: storage nodes
[1] On server4 and server5, add the hadoop user and set up passwordless login
[root@server4 ~]# useradd hadoop
[root@server4 ~]# passwd hadoop
[root@server4 ~]# id hadoop
uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)
[root@server5 ~]# useradd hadoop
[root@server5 ~]# passwd hadoop
[root@server5 ~]# id hadoop
uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)
[hadoop@server1 ~]$ cd .ssh/
[hadoop@server1 .ssh]$ ls
authorized_keys id_rsa id_rsa.pub known_hosts
[hadoop@server1 .ssh]$ ssh-copy-id 172.25.37.4
[hadoop@server1 .ssh]$ ssh-copy-id 172.25.37.5
[2] Synchronize time on server4 and server5
[root@server4 ~]# yum install -y ntp
[root@server4 ~]# vim /etc/ntp.conf
server 172.25.37.250 iburst
[root@server4 ~]# /etc/init.d/ntpd start
Starting ntpd: [ OK ]
[root@server5 ~]# yum install -y ntp
[root@server5 ~]# vim /etc/ntp.conf
server 172.25.37.250 iburst
[root@server5 ~]# /etc/init.d/ntpd start
Starting ntpd: [ OK ]
[3] On server4 and server5, mount the Hadoop configuration shared over NFS
[root@server4 ~]# yum install -y nfs-utils
[root@server4 ~]# /etc/init.d/rpcbind status
rpcbind (pid 1092) is running...
[root@server4 ~]# mount 172.25.37.1:/home/hadoop/ /home/hadoop/
[root@server4 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root 13973860 922772 12341252 7% /
tmpfs 510200 0 510200 0% /dev/shm
/dev/vda1 495844 33457 436787 8% /boot
172.25.37.1:/home/hadoop/ 13974016 1954560 11309568 15% /home/hadoop
[root@server5 ~]# yum install -y nfs-utils
[root@server5 ~]# /etc/init.d/rpcbind status
rpcbind (pid 1092) is running...
[root@server5 ~]# mount 172.25.37.1:/home/hadoop/ /home/hadoop/
[root@server5 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root 13973860 922760 12341264 7% /
tmpfs 510200 0 510200 0% /dev/shm
/dev/vda1 495844 33457 436787 8% /boot
172.25.37.1:/home/hadoop/ 13974016 1954560 11309568 15% /home/hadoop
[4] Add the following name resolution on every virtual machine:
[hadoop@server1 hadoop]$ cat /etc/hosts
172.25.37.1 server1
172.25.37.2 server2
172.25.37.3 server3
172.25.37.4 server4
172.25.37.5 server5
[5] Remove the data produced by the earlier experiments
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ sbin/stop-yarn.sh
stopping yarn daemons
no resourcemanager to stop
172.25.37.3: no nodemanager to stop
172.25.37.2: no nodemanager to stop
no proxyserver to stop
[hadoop@server1 hadoop]$ sbin/stop-dfs.sh
Stopping namenodes on [server1]
server1: stopping namenode
172.25.37.3: stopping datanode
172.25.37.2: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
[hadoop@server1 hadoop]$
[hadoop@server1 ~]$ rm -fr /tmp/*
[hadoop@server2 ~]$ rm -fr /tmp/*
[hadoop@server3 ~]$ rm -fr /tmp/*
[hadoop@server4 ~]$ rm -fr /tmp/*
[hadoop@server5 ~]$ rm -fr /tmp/*
[5] Raise the replication factor to 3 and list three DataNodes
[hadoop@server1 ~]$ cd hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
[hadoop@server1 hadoop]$ vim slaves
172.25.37.2
172.25.37.3
172.25.37.4
[6] Configure the ZooKeeper cluster
(1) Unpack the archive
[hadoop@server1 ~]$ tar zxf zookeeper-3.4.9.tar.gz
[hadoop@server1 ~]$ cd zookeeper-3.4.9
[hadoop@server1 zookeeper-3.4.9]$ cd conf/
[hadoop@server1 conf]$ ls
configuration.xsl log4j.properties zoo_sample.cfg
(2) Copy the sample configuration file:
[hadoop@server1 conf]$ cp zoo_sample.cfg zoo.cfg
(3) The main settings are:
[hadoop@server1 conf]$ vim zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
server.1=172.25.37.2:2888:3888
server.2=172.25.37.3:2888:3888
server.3=172.25.37.4:2888:3888
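Every node uses the same configuration file. In addition, each node needs a myid file in /tmp/zookeeper containing a unique number between 1 and 255. The number must match the server.N entry for that node in zoo.cfg: for example, 172.25.37.2 is defined as server.1=172.25.37.2:2888:3888, so its myid file contains 1, and so on for the other nodes.
Configuration parameters in detail:
clientPort
The port on which the server accepts client connections, i.e. the public service port; 2181 is the usual choice.
dataDir
The directory where snapshot files are stored. By default the transaction log is written here as well. Setting dataLogDir too is recommended, because transaction-log write performance directly affects ZooKeeper performance.
tickTime
The basic time unit in ZooKeeper, in milliseconds. All ZooKeeper timing is based on it, including heartbeats and timeouts; for example, the minimum session timeout is 2*tickTime.
dataLogDir
The directory for the transaction log. Where possible, give the transaction log its own disk or mount point, which greatly improves ZooKeeper performance.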
[hadoop@server2 ~]$ mkdir /tmp/zookeeper
[hadoop@server2 ~]$ echo 1 > /tmp/zookeeper/myid
[hadoop@server3 ~]$ mkdir /tmp/zookeeper
[hadoop@server3 ~]$ echo 2 > /tmp/zookeeper/myid
[hadoop@server4 ~]$ mkdir /tmp/zookeeper
[hadoop@server4 ~]$ echo 3 > /tmp/zookeeper/myid
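The myid values written by hand above can also be derived from zoo.cfg, which avoids a mismatch between the two. This is a sketch under assumptions: myid_for is a made-up helper, and it expects server.N lines of the form used in this configuration (server.N=address:port:port).

```shell
# myid_for ZOO_CFG ADDRESS
# Prints N for the line "server.N=ADDRESS:..." in ZOO_CFG.
# Fails (non-zero exit status) if no server.N line names ADDRESS.
myid_for() {
    awk -F= -v addr="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, parts, ":")          # parts[1] is the host address
            if (parts[1] == addr) {
                id = $1
                sub(/^server\./, "", id)   # keep only the number N
                print id
                found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example (on server2):
# myid_for ~/zookeeper-3.4.9/conf/zoo.cfg 172.25.37.2 > /tmp/zookeeper/myid
```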
(4) Start the service
[hadoop@server2 ~]$ cd zookeeper-3.4.9
[hadoop@server2 zookeeper-3.4.9]$ cd bin/
[hadoop@server2 bin]$ ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server2 bin]$
[hadoop@server3 ~]$ cd zookeeper-3.4.9
[hadoop@server3 zookeeper-3.4.9]$ cd bin/
[hadoop@server3 bin]$ ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server3 bin]$
[hadoop@server4 ~]$ cd zookeeper-3.4.9
[hadoop@server4 zookeeper-3.4.9]$ cd bin/
[hadoop@server4 bin]$ ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@server4 bin]$
(4) Check the status of each node
[hadoop@server2 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
[hadoop@server2 bin]$
[hadoop@server3 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader
[hadoop@server3 bin]$
[hadoop@server4 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
[hadoop@server4 bin]$
Here server3 has been elected leader.
(5) Check the processes
[hadoop@server2 bin]$ jps
1658 QuorumPeerMain
1803 Jps
[hadoop@server3 bin]$ jps
1766 Jps
1704 QuorumPeerMain
[hadoop@server4 bin]$ jps
1245 QuorumPeerMain
1319 Jps
(6) The ZooKeeper interactive shell
[hadoop@server2 bin]$ pwd
/home/hadoop/zookeeper-3.4.9/bin
[hadoop@server2 bin]$ ls
README.txt zkCli.cmd zkEnv.cmd zkServer.cmd zookeeper.out
zkCleanup.sh zkCli.sh zkEnv.sh zkServer.sh
Run the script to enter the interactive shell
[hadoop@server2 bin]$ ./zkCli.sh
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]
[zk: localhost:2181(CONNECTED) 1] ls /zookeeper
[quota]
[zk: localhost:2181(CONNECTED) 2] ls /zookeeper/quota
[]
[zk: localhost:2181(CONNECTED) 3] get /zookeeper/quota
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0
[zk: localhost:2181(CONNECTED) 4] quit
Quitting...
2018-03-10 16:18:19,399 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x1620ef562a30002 closed
2018-03-10 16:18:19,400 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for session: 0x1620ef562a30002
[7] Deploy high availability
[hadoop@server1 ~]$ cd hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop/etc/hadoop
[hadoop@server1 hadoop]$ vim core-site.xml
<configuration>
<!-- Point the default file system at the nameservice masters (the name is arbitrary) -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://masters</value>
</property>
<!-- Addresses of the ZooKeeper cluster -->
<property>
<name>ha.zookeeper.quorum</name>
<value>172.25.37.2:2181,172.25.37.3:2181,172.25.37.4:2181</value>
</property>
</configuration>
The hdfs-site.xml configuration file:
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<configuration>
<!-- Set the HDFS nameservice to masters; must match the setting in core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>masters</value>
</property>
<!-- masters contains two NameNodes, h1 and h2 (the names are arbitrary) -->
<property>
<name>dfs.ha.namenodes.masters</name>
<value>h1,h2</value>
</property>
<!-- RPC address of h1 -->
<property>
<name>dfs.namenode.rpc-address.masters.h1</name>
<value>172.25.37.1:9000</value>
</property>
<!-- HTTP address of h1 -->
<property>
<name>dfs.namenode.http-address.masters.h1</name>
<value>172.25.37.1:50070</value>
</property>
<!-- RPC address of h2 -->
<property>
<name>dfs.namenode.rpc-address.masters.h2</name>
<value>172.25.37.5:9000</value>
</property>
<!-- HTTP address of h2 -->
<property>
<name>dfs.namenode.http-address.masters.h2</name>
<value>172.25.37.5:50070</value>
</property>
<!-- Where the NameNode metadata (shared edit log) is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://172.25.37.2:8485;172.25.37.3:8485;172.25.37.4:8485/masters</value>
</property>
<!-- Where each JournalNode keeps its data on local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/tmp/journaldata</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Implementation class used for automatic failover -->
<property>
<name>dfs.client.failover.proxy.provider.masters</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods, one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- sshfence requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<!-- Connection timeout for sshfence, in milliseconds -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
[8] Start a JournalNode on each of the three DataNodes in turn (the JournalNodes must be running before HDFS is started for the first time)
[hadoop@server2 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server2 hadoop]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server2.out
[hadoop@server2 hadoop]$ jps
1658 QuorumPeerMain
1877 Jps
1827 JournalNode
[hadoop@server3 ~]$ cd hadoop
[hadoop@server3 hadoop]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server3.out
[hadoop@server3 hadoop]$ jps
1790 JournalNode
1840 Jps
1704 QuorumPeerMain
[hadoop@server3 hadoop]$
[hadoop@server4 ~]$ cd hadoop
[hadoop@server4 hadoop]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server4.out
[hadoop@server4 hadoop]$ jps
1245 QuorumPeerMain
1344 JournalNode
1394 Jps
[hadoop@server4 hadoop]$
[9] Format the HDFS cluster
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
[10] Copy the /tmp/hadoop-hadoop directory to server5
[hadoop@server1 hadoop]$ scp -r /tmp/hadoop-hadoop 172.25.37.5:/tmp
seen_txid 100% 2 0.0KB/s 00:00
VERSION 100% 202 0.2KB/s 00:00
fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
fsimage_0000000000000000000 100% 352 0.3KB/s 00:00
[11] Initialize the HA state in ZooKeeper
[hadoop@server1 hadoop]$ bin/hdfs zkfc -formatZK
[12] Create directories for testing
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -mkdir /user
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -mkdir /user/hadoop
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -put etc/hadoop/ /user/hadoop/input
[hadoop@server1 hadoop]$ ./bin/hdfs dfs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2018-06-02 14:13 input
[13] Start the HDFS cluster
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Starting namenodes on [server1 server5]
server1: starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-namenode-server1.out
server5: starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-namenode-server5.out
172.25.37.3: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server3.out
172.25.37.2: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server2.out
172.25.37.4: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server4.out
Starting journal nodes [172.25.37.2 172.25.37.3 172.25.37.4]
172.25.37.4: starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server4.out
172.25.37.3: starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server3.out
172.25.37.2: starting journalnode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-journalnode-server2.out
Starting ZK Failover Controllers on NN hosts [server1 server5]
server1: starting zkfc, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-zkfc-server1.out
server5: starting zkfc, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-zkfc-server5.out
Check the processes
[hadoop@server1 hadoop]$ jps
6209 Jps
5847 NameNode
6141 DFSZKFailoverController
[hadoop@server5 ~]$ jps
1548 Jps
1416 DFSZKFailoverController
1319 NameNode
[hadoop@server2 hadoop]$ jps
1661 JournalNode
1726 Jps
1224 QuorumPeerMain
1568 DataNode
[hadoop@server3 hadoop]$ jps
1776 Jps
1616 DataNode
1709 JournalNode
1213 QuorumPeerMain
[hadoop@server4 hadoop]$ jps
1204 QuorumPeerMain
1562 DataNode
1655 JournalNode
1723 Jps
[14] Test in a browser
server1 should be shown as active and server5 as standby.
[15] Test failover:
Kill the NameNode that is currently active, server1 in my case:
[hadoop@server1 hadoop]$ jps
2611 DFSZKFailoverController
2314 NameNode
3671 Jps
[hadoop@server1 hadoop]$ kill -9 2314
In the browser, server5 is now shown as active. Restart the NameNode on server1 and it comes back as standby:
[hadoop@server1 hadoop]$ ./sbin/hadoop-daemon.sh start namenode
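The NameNode states can also be read from the command line with hdfs haadmin -getServiceState, which takes a NameNode ID from hdfs-site.xml (h1 and h2 in this setup). The wrapper below is only a sketch; the function name namenode_states and the "unreachable" label are assumptions made for this example.

```shell
# namenode_states HDFS_CMD ID...
# Prints "ID: state" for each NameNode ID by calling
# "HDFS_CMD haadmin -getServiceState ID". A NameNode that cannot be
# queried is reported as "unreachable" (label chosen for this sketch).
namenode_states() {
    hdfs_cmd=$1
    shift
    for id in "$@"; do
        state=$("$hdfs_cmd" haadmin -getServiceState "$id" 2>/dev/null) || state=unreachable
        printf '%s: %s\n' "$id" "$state"
    done
}

# Example (from /home/hadoop/hadoop):
# namenode_states ./bin/hdfs h1 h2
```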
V. Configuring YARN high availability
Again two configuration files are involved:
[1] Edit mapred-site.xml
<configuration>
<!-- Use YARN as the MapReduce framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[2] Edit yarn-site.xml
<configuration>
<!-- Allow MapReduce programs to run on the NodeManagers -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable RM high availability -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Cluster id of the RM pair -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>RM_CLUSTER</value>
</property>
<!-- Define the RM node ids -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Address of RM1 -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>172.25.37.1</value>
</property>
<!-- Address of RM2 -->
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>172.25.37.5</value>
</property>
<!-- Enable RM recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- How RM state is stored: MemStore or ZKStore -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<!-- Address of the ZooKeeper cluster, used when state is stored in ZooKeeper -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>172.25.37.2:2181,172.25.37.3:2181,172.25.37.4:2181</value>
</property>
</configuration>
[3] Start the YARN services
[hadoop@server1 hadoop]$ ./sbin/start-yarn.sh
[hadoop@server1 hadoop]$ jps
6559 Jps
2163 NameNode
1739 DFSZKFailoverController
5127 ResourceManager
The ResourceManager on RM2 (server5) has to be started by hand
[hadoop@server5 hadoop]$ sbin/yarn-daemon.sh start resourcemanager
[hadoop@server5 hadoop]$ jps
1191 NameNode
3298 Jps
1293 DFSZKFailoverController
2757 ResourceManager
It is best to run the RM and the NN on separate machines, which helps keep both performing well.
[4] Test in a browser
server5 should be shown as standby and server1 as active.
[5] Test YARN failover
[hadoop@server1 hadoop]$ jps
5918 Jps
2163 NameNode
1739 DFSZKFailoverController
5127 ResourceManager
[hadoop@server1 hadoop]$ kill -9 5127
[hadoop@server1 hadoop]$ ./sbin/yarn-daemon.sh start resourcemanager
In the browser you can see that server1 and server5 have swapped states.