4. Commissioning a New DataNode & Decommissioning an Old DataNode

Environment Preparation

  • As the business grows and data volume increases, the capacity of the existing datanodes can no longer meet storage demand, so new datanodes must be added dynamically on top of the existing cluster.
  • (1) Clone a virtual machine (clone cslave1 as cslave2).
  • (2) Change the IP address and hostname (hostname: cslave2; IP: 192.168.1.104).
  • Files that need to be edited:
vi /etc/udev/rules.d/70-persistent-net.rules
# Comment out the eth0 line, rename eth1 to eth0, and copy its physical (MAC) address

vi /etc/sysconfig/network-scripts/ifcfg-eth0 
# Change IPADDR, and set HWADDR to the MAC address just copied

vi /etc/sysconfig/network
# Change HOSTNAME to cslave2

vi /etc/hosts
# Add the ip-to-hostname mappings
192.168.1.101 cmaster0
192.168.1.102 cslave0
192.168.1.103 cslave1
192.168.1.104 cslave2
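The per-file edits above can also be scripted. Below is a minimal sketch, assuming the CentOS 6-style file layout shown in these steps; the function name `set_node_identity` and its file-path arguments are illustrative, and the udev/HWADDR step is left manual because the MAC value has to be read out of the udev rules file first:

```shell
#!/bin/sh
# Sketch: rewrite the cloned node's identity in its config files.
# The paths are passed in as arguments, so the real /etc files are
# only touched when you call it with them (as root on the clone).
set_node_identity() {
  new_ip=$1; new_host=$2; ifcfg=$3; netfile=$4; hosts=$5
  # Point the interface config at the new address
  sed -i "s/^IPADDR=.*/IPADDR=$new_ip/" "$ifcfg"
  # Set the new hostname
  sed -i "s/^HOSTNAME=.*/HOSTNAME=$new_host/" "$netfile"
  # Append the ip/hostname mapping unless it is already present
  grep -q " $new_host\$" "$hosts" || echo "$new_ip $new_host" >> "$hosts"
}

# On the real cslave2 this would be run as:
# set_node_identity 192.168.1.104 cslave2 \
#   /etc/sysconfig/network-scripts/ifcfg-eth0 /etc/sysconfig/network /etc/hosts
```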
  • (3) On cslave2, delete the files left over from the original HDFS filesystem (since it was cloned from cslave1, the whole /opt/module directory already exists).
[hadoop@cslave2 tmp]$ pwd
/opt/module/hadoop-2.7.2/data/tmp
[hadoop@cslave2 tmp]$ rm -rf *
Configure passwordless SSH
[hadoop@cslave2 ~]$ cd
[hadoop@cslave2 ~]$ rm -rf .ssh/
[hadoop@cslave2 ~]$ ssh-keygen -t rsa
[hadoop@cslave2 .ssh]$ cd ~/.ssh
[hadoop@cslave2 .ssh]$ cp id_rsa.pub authorized_keys
## Distribute the SSH public key (this step is required on every machine in the cluster)
## Append each node's authorized_keys content into every other node's authorized_keys; after that, the nodes can ssh into each other without a password
Test SSH (on every node)
# ssh cmaster0 date
# ssh cslave0 date
# ssh cslave1 date
# ssh cslave2 date
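The mutual key exchange described above can be scripted with the stock OpenSSH tool `ssh-copy-id`, run once on each node. A sketch (the `DRY_RUN` switch is an illustrative addition that only prints the commands, used here for testing without live SSH):

```shell
#!/bin/sh
# Sketch: push this node's public key to every cluster node with
# ssh-copy-id, which appends ~/.ssh/id_rsa.pub to the remote
# authorized_keys. DRY_RUN=1 just prints the commands instead.
NODES="cmaster0 cslave0 cslave1 cslave2"

distribute_keys() {
  for host in $NODES; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@$host"
    else
      ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@$host"
    fi
  done
}

# Run on each of the four nodes, then verify with `ssh <host> date` as above.
```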
Cluster deployment plan
|      | cmaster0    | cslave0         | cslave1           | cslave2     |
|------|-------------|-----------------|-------------------|-------------|
| HDFS | DataNode    | DataNode        | DataNode          | DataNode    |
| HDFS | NameNode    | /               | SecondaryNameNode | /           |
| YARN | NodeManager | NodeManager     | NodeManager       | NodeManager |
| YARN | /           | ResourceManager | /                 | /           |

Commissioning a New DataNode

| Node | Last contact | Admin State | Capacity | Used | Non DFS Used | Remaining | Blocks | Block pool used | Failed Volumes | Version |
|------|--------------|-------------|----------|------|--------------|-----------|--------|-----------------|----------------|---------|
| cmaster0:50010 (192.168.1.101:50010) | 2 | In Service | 27.01 GB | 256 KB | 5.72 GB | 21.29 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave0:50010 (192.168.1.102:50010) | 2 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave1:50010 (192.168.1.103:50010) | 2 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
  • (1) On the namenode, create a dfs.hosts file under /opt/module/hadoop-2.7.2/etc/hadoop containing every node allowed in the cluster:
[hadoop@cmaster0 hadoop]$ pwd
/opt/module/hadoop-2.7.2/etc/hadoop
[hadoop@cmaster0 hadoop]$ vi dfs.hosts 
cmaster0
cslave0
cslave1
cslave2
  • (2) Add the dfs.hosts property to hdfs-site.xml on the namenode:
<property>
    <name>dfs.hosts</name>
    <value>/opt/module/hadoop-2.7.2/etc/hadoop/dfs.hosts</value>
</property>
  • (3) Refresh the namenode:
[hadoop@cmaster0 hadoop]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
| Node | Last contact | Admin State | Capacity | Used | Non DFS Used | Remaining | Blocks | Block pool used | Failed Volumes | Version |
|------|--------------|-------------|----------|------|--------------|-----------|--------|-----------------|----------------|---------|
| cmaster0:50010 (192.168.1.101:50010) | 2 | In Service | 27.01 GB | 256 KB | 5.72 GB | 21.29 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave0:50010 (192.168.1.102:50010) | 2 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave1:50010 (192.168.1.103:50010) | 2 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave2:50010 (192.168.1.104:50010) | Wed Dec 26 19:32:45 UTC+0800 2018 | Dead | - | - | - | - | - | - | - | - |
  • (4) Update the resourcemanager's node list:
[hadoop@cmaster0 hadoop]$ yarn rmadmin -refreshNodes
18/12/27 04:47:16 INFO client.RMProxy: Connecting to ResourceManager at cslave0/192.168.1.102:8033
  • (5) Add the new hostname to the slaves file on the namenode (no need to distribute the file).
  • The start scripts will only pick up the new datanode cslave2 the next time the cluster is started.
[hadoop@cmaster0 hadoop]$ vi slaves 
cmaster0
cslave0
cslave1
cslave2
  • (6) Start the new datanode and nodemanager individually with the daemon scripts:
[hadoop@cslave2 hadoop]$ hadoop-daemon.sh start datanode
starting datanode, logging to /opt/module/hadoop-2.7.2/logs/hadoop-hadoop-datanode-cslave2.out
[hadoop@cslave2 hadoop]$ yarn-daemon.sh start nodemanager
starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-cslave2.out
[hadoop@cslave2 hadoop]$ jps
3035 Jps
3003 NodeManager
2907 DataNode
| Node | Last contact | Admin State | Capacity | Used | Non DFS Used | Remaining | Blocks | Block pool used | Failed Volumes | Version |
|------|--------------|-------------|----------|------|--------------|-----------|--------|-----------------|----------------|---------|
| cmaster0:50010 (192.168.1.101:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.72 GB | 21.29 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave0:50010 (192.168.1.102:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave1:50010 (192.168.1.103:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 9 | 256 KB (0%) | 0 | 2.7.2 |
| cslave2:50010 (192.168.1.104:50010) | 1 | In Service | 27.01 GB | 24 KB | 5.38 GB | 21.63 GB | 0 | 24 KB (0%) | 0 | 2.7.2 |
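Besides the web UI, registration can be confirmed from the shell. Hadoop 2.x's `hdfs dfsadmin -report` prints a "Live datanodes (N):" summary line; the sketch below extracts N from a saved report dump, so the function itself needs no live cluster:

```shell
#!/bin/sh
# Sketch: extract N from the "Live datanodes (N):" line of a saved
# `hdfs dfsadmin -report` output file.
live_datanodes() {
  grep -o 'Live datanodes ([0-9][0-9]*)' "$1" | grep -o '[0-9][0-9]*'
}

# Live usage:
# hdfs dfsadmin -report > report.txt && live_datanodes report.txt
```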
  • (7) If data is unevenly distributed, rebalance the cluster with the balancer:
[hadoop@cslave2 hadoop-2.7.2]$ pwd
/opt/module/hadoop-2.7.2
[hadoop@cslave2 hadoop-2.7.2]$ ./sbin/start-balancer.sh 
starting balancer, logging to /opt/module/hadoop-2.7.2/logs/hadoop-hadoop-balancer-cslave2.out
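start-balancer.sh stops once every datanode's utilization is within a threshold (in percentage points, default 10) of the cluster average, and the threshold can be passed explicitly, e.g. `./sbin/start-balancer.sh -threshold 5`. A deliberately simplified sketch of that per-node stopping condition, using integer percentages (the real balancer works on DFS-used ratios):

```shell
#!/bin/sh
# Simplified sketch of the balancer's per-node stopping condition:
# a node counts as balanced when |node_used% - cluster_avg%| <= threshold.
is_balanced() {
  node_pct=$1; avg_pct=$2; threshold=$3
  diff=$((node_pct - avg_pct))
  # take the absolute value
  [ "$diff" -lt 0 ] && diff=$((0 - diff))
  [ "$diff" -le "$threshold" ]
}
```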

Decommissioning an Old DataNode

  • 1) On the namenode, create a dfs.hosts.exclude file under /opt/module/hadoop-2.7.2/etc/hadoop containing only the nodes to decommission:
[hadoop@cmaster0 hadoop]$ vi dfs.hosts.exclude 
cslave2
  • 2) Add the dfs.hosts.exclude property to hdfs-site.xml on the namenode:
[hadoop@cmaster0 hadoop]$ vi hdfs-site.xml
<property>
    <name>dfs.hosts.exclude</name>
    <value>/opt/module/hadoop-2.7.2/etc/hadoop/dfs.hosts.exclude</value>
</property>
  • 3) Refresh the namenode and the resourcemanager:
[hadoop@cmaster0 hadoop]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
[hadoop@cmaster0 hadoop]$ yarn rmadmin -refreshNodes
18/12/27 05:24:08 INFO client.RMProxy: Connecting to ResourceManager at cslave0/192.168.1.102:8033
| Node | Last contact | Admin State | Capacity | Used | Non DFS Used | Remaining | Blocks | Block pool used | Failed Volumes | Version |
|------|--------------|-------------|----------|------|--------------|-----------|--------|-----------------|----------------|---------|
| cmaster0:50010 (192.168.1.101:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.72 GB | 21.29 GB | 10 | 256 KB (0%) | 0 | 2.7.2 |
| cslave0:50010 (192.168.1.102:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 10 | 256 KB (0%) | 0 | 2.7.2 |
| cslave1:50010 (192.168.1.103:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 10 | 256 KB (0%) | 0 | 2.7.2 |
| cslave2:50010 (192.168.1.104:50010) | 1 | Decommission In Progress | 27.01 GB | 24 KB | 5.38 GB | 21.63 GB | 1 | 24 KB (0%) | 0 | 2.7.2 |
  • Check the web UI: the node being retired shows the state "Decommission In Progress", meaning the datanode is still replicating its blocks to other nodes.

  • Wait until the node's state becomes "Decommissioned" (all blocks have been replicated elsewhere), then stop the node and its nodemanager. Note: if the replication factor is 3, decommissioning cannot succeed while the number of in-service nodes is 3 or fewer; lower the replication factor first.
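The admin state can also be polled from the command line instead of the web UI. The sketch below pulls the "Decommission Status" line for one datanode out of `hdfs dfsadmin -report` output (run here against a saved dump; the stanza format "Name: ip:port" / "Decommission Status : …" is how Hadoop 2.x prints it):

```shell
#!/bin/sh
# Sketch: print a datanode's Decommission Status (Normal /
# Decommission in progress / Decommissioned) from a saved
# `hdfs dfsadmin -report` dump.
node_state() {
  node=$1; report=$2
  awk -v n="$node" '
    $1 == "Name:"                { cur = (index($2, n) == 1) }
    cur && /Decommission Status/ { sub(/.*: /, ""); print; exit }
  ' "$report"
}

# Live usage: hdfs dfsadmin -report > report.txt
#             node_state 192.168.1.104 report.txt
```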

  • 4) Stop the datanode and the nodemanager on that node:

[hadoop@cslave2 ~]$ jps
3003 NodeManager
2907 DataNode
3344 Jps
[hadoop@cslave2 ~]$ hadoop-daemon.sh stop datanode
stopping datanode
[hadoop@cslave2 ~]$ yarn-daemon.sh stop nodemanager
stopping nodemanager
[hadoop@cslave2 ~]$ jps
3450 Jps
  • 5) Remove the retired node from the include file, then rerun the refresh commands
  • (1) Remove the retired node cslave2 from the namenode's dfs.hosts file:
[hadoop@cmaster0 hadoop]$ vi dfs.hosts
cmaster0
cslave0
cslave1
  • (2) Refresh the namenode and the resourcemanager:
[hadoop@cmaster0 hadoop]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
[hadoop@cmaster0 hadoop]$ yarn rmadmin -refreshNodes
18/12/27 05:37:21 INFO client.RMProxy: Connecting to ResourceManager at cslave0/192.168.1.102:8033
| Node | Last contact | Admin State | Capacity | Used | Non DFS Used | Remaining | Blocks | Block pool used | Failed Volumes | Version |
|------|--------------|-------------|----------|------|--------------|-----------|--------|-----------------|----------------|---------|
| cmaster0:50010 (192.168.1.101:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.72 GB | 21.29 GB | 10 | 256 KB (0%) | 0 | 2.7.2 |
| cslave0:50010 (192.168.1.102:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 10 | 256 KB (0%) | 0 | 2.7.2 |
| cslave1:50010 (192.168.1.103:50010) | 1 | In Service | 27.01 GB | 256 KB | 5.38 GB | 21.63 GB | 10 | 256 KB (0%) | 0 | 2.7.2 |
  • 6) Remove the retired node cslave2 from the namenode's slaves file:
[hadoop@cmaster0 hadoop]$ vi slaves 
cmaster0
cslave0
cslave1
  • 7) If data is unevenly distributed, rebalance the cluster with the balancer:
[hadoop@cslave2 hadoop-2.7.2]$ pwd
/opt/module/hadoop-2.7.2
[hadoop@cslave2 hadoop-2.7.2]$ ./sbin/start-balancer.sh 
starting balancer, logging to /opt/module/hadoop-2.7.2/logs/hadoop-hadoop-balancer-cslave2.out