1. Configure the operating system on the new node (see "Building a three-node hadoop 2.6.0 cluster, Part 1")
(1) Change the hostname
(2) Configure the IP address
(3) Configure /etc/hosts
(4) Install the JDK
(5) Disable the firewall
(6) Disable SELinux
(7) Set the vm.swappiness kernel parameter
(8) Create the hadoop user and its directories
(9) Set up passwordless SSH login
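The checklist above can be sketched as a short root-run script. The hostname, IP, and values below are examples only and must match your environment; `hostnamectl`/`systemctl` assume a systemd-based distro (older CentOS 6 systems use `/etc/sysconfig/network` and `service iptables stop` instead):

```shell
# Run as root on the new node (hadoop04). All values are examples.
hostnamectl set-hostname hadoop04                          # (1) hostname
echo "192.168.123.13 hadoop04" >> /etc/hosts               # (3) hosts entry (add all nodes)
systemctl stop firewalld && systemctl disable firewalld    # (5) firewall off
setenforce 0                                               # (6) SELinux off now; also edit /etc/selinux/config
echo "vm.swappiness = 0" >> /etc/sysctl.conf && sysctl -p  # (7) swappiness
useradd hadoop && passwd hadoop                            # (8) hadoop user
```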
2. Update /etc/hosts on all nodes
[root@master training]# scp /etc/hosts slave1:/etc/
[root@master training]# scp /etc/hosts slave2:/etc/
[root@master training]# scp /etc/hosts hadoop04:/etc/
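If the node list keeps growing, the three scp commands above can be collapsed into a loop (node names here match this cluster):

```shell
# Push the updated hosts file to every other node in one pass.
for node in slave1 slave2 hadoop04; do
  scp /etc/hosts "${node}:/etc/"
done
```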
3. Passwordless SSH login (trust relationship between nodes)
On hadoop04
su - hadoop
ssh-keygen -t rsa -P ''
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
scp .ssh/id_rsa.pub hadoop@master:/home/hadoop/id_rsa_04.pub
On master
su - hadoop
cat id_rsa_04.pub >> .ssh/authorized_keys
scp .ssh/authorized_keys hadoop@hadoop04:/home/hadoop/.ssh/authorized_keys
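A quick way to confirm the trust relationship is to run a remote command against every node; `BatchMode=yes` makes ssh fail instead of prompting, so any failure means the authorized_keys exchange above is incomplete (a sketch, assuming the same node names):

```shell
# Each command should print the remote hostname without a password prompt.
su - hadoop
for node in master slave1 slave2 hadoop04; do
  ssh -o BatchMode=yes "$node" hostname
done
```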
4. Change the permissions on /opt (later, the hadoop user will copy the Hadoop software from the master node to hadoop04)
[root@hadoop04 training]# chmod 775 /opt
5. Copy the Hadoop home directory to hadoop04
On master
[hadoop@master opt]$ scp -r hadoop-2.6.0 hadoop04:/opt
Create a symlink
On hadoop04
[hadoop@hadoop04 opt]$ ln -s hadoop-2.6.0 hadoop
6. Update the hadoop user's environment variables on hadoop04
On master
[hadoop@master ~]$ scp /home/hadoop/.bash_profile hadoop04:/home/hadoop/
7. Edit the slaves file on the master node
[hadoop@master hadoop]$ vi slaves
slave1
slave2
hadoop04
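The slaves file is only read by start-dfs.sh / start-yarn.sh on the node where they are launched, so strictly speaking only the master's copy matters; keeping the workers' copies in sync is still good practice. A sketch, assuming the /opt/hadoop symlink layout used above:

```shell
# Optional: keep every node's slaves file identical to the master's.
for node in slave1 slave2 hadoop04; do
  scp /opt/hadoop/etc/hadoop/slaves "${node}:/opt/hadoop/etc/hadoop/"
done
```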
8. Start the DataNode and NodeManager services on hadoop04
[hadoop@hadoop04 opt]$ hadoop-daemon.sh start datanode
[hadoop@hadoop04 opt]$ jps
2631 DataNode
2691 Jps
[hadoop@hadoop04 opt]$ yarn-daemon.sh start nodemanager
[hadoop@hadoop04 opt]$ jps
2631 DataNode
2851 Jps
2755 NodeManager
[hadoop@hadoop04 opt]$
9. Check the cluster status
[hadoop@master hadoop]$ hdfs dfsadmin -report
Configured Capacity: 75593183232 (70.40 GB)
Present Capacity: 55412535296 (51.61 GB)
DFS Remaining: 55412412416 (51.61 GB)
DFS Used: 122880 (120 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (3): <===== three live datanodes, so the new node was added successfully
Name: 192.168.123.11:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 25197727744 (23.47 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 6726680576 (6.26 GB)
DFS Remaining: 18470998016 (17.20 GB)
DFS Used%: 0.00%
DFS Remaining%: 73.30%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 16 23:23:20 EDT 2017
Name: 192.168.123.13:50010 (hadoop04) <==== the newly added node
Hostname: hadoop04
Decommission Status : Normal
Configured Capacity: 25197727744 (23.47 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 6727045120 (6.27 GB)
DFS Remaining: 18470658048 (17.20 GB)
DFS Used%: 0.00%
DFS Remaining%: 73.30%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 16 23:23:20 EDT 2017
Name: 192.168.123.12:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 25197727744 (23.47 GB)
DFS Used: 49152 (48 KB)
Non DFS Used: 6726922240 (6.26 GB)
DFS Remaining: 18470756352 (17.20 GB)
DFS Used%: 0.00%
DFS Remaining%: 73.30%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 16 23:23:20 EDT 2017
Refresh the node lists
On master
[hadoop@master hadoop]$ yarn rmadmin -refreshNodes
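`yarn rmadmin -refreshNodes` makes the ResourceManager re-read its include/exclude node lists; HDFS has a matching command for the NameNode side. Note that both are only meaningful if the corresponding list paths (yarn.resourcemanager.nodes.include-path / dfs.hosts and their exclude counterparts) are configured:

```shell
# YARN side: re-read the ResourceManager's include/exclude node lists
yarn rmadmin -refreshNodes
# HDFS side: re-read dfs.hosts / dfs.hosts.exclude on the NameNode
hdfs dfsadmin -refreshNodes
```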
Load balancing
[hadoop@master hadoop]$ hdfs balancer --help
Usage: java Balancer
[-policy <policy>] the balancing policy: datanode or blockpool
[-threshold <threshold>] Percentage of disk capacity
[-exclude [-f <hosts-file> | comma-separated list of hosts]] Excludes the specified datanodes.
[-include [-f <hosts-file> | comma-separated list of hosts]] Includes only the specified datanodes.
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
[hadoop@master hadoop]$ start-balancer.sh -threshold 5
The default threshold is 10%; the smaller the value, the more evenly data ends up distributed across the cluster.
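The threshold is the allowed gap between each DataNode's DFS Used% and the cluster-wide average: the balancer moves blocks until every node is within that band. The toy script below mimics that check with made-up utilization numbers (not taken from this cluster):

```shell
# Toy illustration of -threshold semantics; all percentages are hypothetical.
avg=40        # pretend cluster-average DFS Used%
threshold=5   # same meaning as: start-balancer.sh -threshold 5
for used in 33 41 47; do            # pretend per-datanode DFS Used%
  diff=$((used - avg))
  [ "$diff" -lt 0 ] && diff=$((-diff))   # absolute gap from the average
  if [ "$diff" -gt "$threshold" ]; then
    echo "node at ${used}% is out of balance (gap ${diff}% > ${threshold}%)"
  fi
done
```

With these numbers the 33% and 47% nodes fall outside the 40% ± 5% band, while the 41% node is already balanced.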
---------------------
Author: forever19870418
Source: CSDN
Original post: https://blog.csdn.net/forever19870418/article/details/62887072
Copyright notice: this is the author's original article; please include a link to the original post when reposting.