Adding and Removing Nodes in a Hadoop Cluster


For the initial Hadoop cluster deployment, see the previous article:
https://blog.csdn.net/oToyix/article/details/118520585

I. Scaling Out the Hadoop Cluster

As the business grows and data volume keeps increasing, the Hadoop cluster eventually needs to be expanded. Here we dynamically add a DataNode and a NodeManager on the node4 server to the existing 3-server Hadoop cluster. The steps are as follows:

1. Hosts and firewall settings

Apply the following configuration on node1, node2, node3, and node4:

cat >/etc/hosts<<EOF
127.0.0.1  localhost localhost.localdomain
192.168.0.47 node1
192.168.0.32 node2
192.168.0.33 node3
192.168.0.37 node4
EOF
# Disable SELinux now and on subsequent boots
sed -i '/SELINUX/s/enforcing/disabled/g' /etc/sysconfig/selinux
setenforce 0
# Stop and disable the firewall
systemctl stop firewalld.service
systemctl disable firewalld.service
# Install helper tools and sync the clock
yum install ntpdate rsync lrzsz -y
ntpdate pool.ntp.org
# Derive this node's hostname from /etc/hosts based on its local IP, then re-login
hostname `cat /etc/hosts|grep $(ifconfig|grep broadcast|awk '{print $2}')|awk '{print $2}'`;su
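The final one-liner above looks up the node's own hostname in /etc/hosts by matching its local IP address. A minimal sketch of that lookup, run against a temporary hosts file with the same entries (the IP is an illustrative stand-in for what `ifconfig` would report):

```shell
# Sketch: resolve a node's hostname from a hosts file by IP (temp file for illustration).
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
127.0.0.1  localhost localhost.localdomain
192.168.0.47 node1
192.168.0.37 node4
EOF

# Pretend this is the local IP that ifconfig reported.
local_ip="192.168.0.37"

# Same lookup the hostname one-liner performs: match the IP, take the 2nd field.
node_name=$(grep -w "$local_ip" "$hosts_file" | awk '{print $2}')
echo "$node_name"   # node4

rm -f "$hosts_file"
```

In production, matching with `grep -w` (as here) is slightly safer than a bare `grep`, which could also match a longer IP that merely contains the local one as a prefix.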
2. Configure passwordless SSH login

node1 acts as the master/control node. Create the public/private key pair on node1, then copy the public key to the remaining nodes:

ssh-copy-id -i /root/.ssh/id_rsa.pub root@node4
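If the key pair was not already created during the initial deployment, it can be generated non-interactively before running `ssh-copy-id`. A sketch (written against a temp directory so it does not overwrite a real key; on node1 the path would be the default /root/.ssh/id_rsa):

```shell
# Sketch: generate an RSA key pair non-interactively into a scratch directory.
keydir=$(mktemp -d)
ssh-keygen -t rsa -b 2048 -N '' -q -f "$keydir/id_rsa"
ls "$keydir"
# On node1 you would then run, as in the article:
#   ssh-copy-id -i /root/.ssh/id_rsa.pub root@node4
```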
3. Configure the Java environment on the new node

# Extract the JDK tarball
tar -xf jdk-8u191-linux-x64.tar.gz
# Create the JDK installation directory
mkdir -p /usr/java/
\mv jdk1.8.0_191 /usr/java/
# Set the environment variables
cat>>/etc/profile<<EOF
export JAVA_HOME=/usr/java/jdk1.8.0_191/
export HADOOP_HOME=/data/hadoop/
export JAVA_LIBRARY_PATH=/data/hadoop/lib/native/
export PATH=\$PATH:\$HADOOP_HOME/bin/:\$JAVA_HOME/bin
EOF
# Apply the new environment variables and verify
source /etc/profile
java -version
4. Deploy the Hadoop files

Sync all Hadoop files and directories from the fully deployed node1 to the new node4 (note the ssh options must come before the hostname, otherwise they are passed as part of the remote command):

for i in node4;do ssh root@$i "mkdir -p /data/hadoop/" ;done
for i in node4;do rsync -aP --delete /data/hadoop/ root@$i:/data/hadoop/ ;done
for i in node4;do ssh root@$i "rm -rf /data/hadoop/data*" ;done
5. Add the new Hadoop node

1) Before dynamically adding the DataNode and NodeManager, check the current status of the HDFS nodes:

[root@node1 ~]# hdfs dfsadmin -report
Configured Capacity: 493629759488 (459.73 GB)
Present Capacity: 465156300800 (433.21 GB)
DFS Remaining: 465156218880 (433.21 GB)
DFS Used: 81920 (80 KB)
DFS Used%: 0.00%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.0.32:9866 (node2)
Hostname: node2
Decommission Status : Normal
Configured Capacity: 165258731520 (153.91 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 9225895936 (8.59 GB)
DFS Remaining: 156032802816 (145.32 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 06 17:04:48 CST 2021
Last Block Report: Tue Jul 06 15:53:18 CST 2021
Num of Blocks: 2

Name: 192.168.0.33:9866 (node3)
Hostname: node3
Decommission Status : Normal
Configured Capacity: 165258731520 (153.91 GB)
DFS Used: 16384 (16 KB)
Non DFS Used: 9225396224 (8.59 GB)
DFS Remaining: 156033318912 (145.32 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 06 17:04:48 CST 2021
Last Block Report: Tue Jul 06 16:07:00 CST 2021
Num of Blocks: 0

Name: 192.168.0.47:9866 (node1)
Hostname: node1
Decommission Status : Normal
Configured Capacity: 163112296448 (151.91 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 10022166528 (9.33 GB)
DFS Remaining: 153090097152 (142.58 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.86%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 06 17:04:48 CST 2021
Last Block Report: Tue Jul 06 12:13:35 CST 2021
Num of Blocks: 2

As shown, before adding the new DataNode there are three DataNodes in total, on node1, node2, and node3.

2) Check the status of the YARN nodes:

[root@node1 ~]# yarn node -list
2021-07-06 17:06:03,997 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Total Nodes:1
         Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers
     node1:42465                RUNNING        node1:8042                                  0

As shown, before the new NodeManager is added, node4 does not appear in the list (the report above shows only node1's NodeManager registered with the ResourceManager).

3) Add the DataNode and NodeManager: append node4 to the Hadoop workers file on every server in the cluster:

echo node4 >>/data/hadoop/etc/hadoop/workers ;cat /data/hadoop/etc/hadoop/workers
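Appending blindly with `>>` adds a duplicate entry if the command is run twice. A small guard keeps the workers file clean; a sketch against a temporary copy (the `add_worker` helper is illustrative, not part of Hadoop):

```shell
# Sketch: append node4 to a workers file only if it is not already listed.
workers=$(mktemp)
printf 'node1\nnode2\nnode3\n' > "$workers"

add_worker() {
    # $1 = hostname, $2 = workers file; -qx matches the whole line exactly
    grep -qx "$1" "$2" || echo "$1" >> "$2"
}

add_worker node4 "$workers"
add_worker node4 "$workers"   # second call is a no-op

cat "$workers"
```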

4) On the new node4 server, start the DataNode and NodeManager services:

hdfs --daemon start datanode
yarn --daemon start nodemanager

5) On node1, run the following to refresh the cluster node list and start the balancer:

hdfs dfsadmin -refreshNodes
/data/hadoop/sbin/start-balancer.sh

6) Check the HDFS node status again:

[root@node1 ~]# hdfs dfsadmin -report
Configured Capacity: 658888491008 (613.64 GB)
Present Capacity: 621189627904 (578.53 GB)
DFS Remaining: 621189537792 (578.53 GB)
DFS Used: 90112 (88 KB)
DFS Used%: 0.00%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (4):

Name: 192.168.0.32:9866 (node2)
Hostname: node2
Decommission Status : Normal
Configured Capacity: 165258731520 (153.91 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 9225781248 (8.59 GB)
DFS Remaining: 156032917504 (145.32 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 06 17:08:15 CST 2021
Last Block Report: Tue Jul 06 15:53:18 CST 2021
Num of Blocks: 2

Name: 192.168.0.33:9866 (node3)
Hostname: node3
Decommission Status : Normal
Configured Capacity: 165258731520 (153.91 GB)
DFS Used: 16384 (16 KB)
Non DFS Used: 9225297920 (8.59 GB)
DFS Remaining: 156033417216 (145.32 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 06 17:08:15 CST 2021
Last Block Report: Tue Jul 06 16:07:00 CST 2021
Num of Blocks: 0

Name: 192.168.0.37:9866 (node4)
Hostname: node4
Decommission Status : Normal
Configured Capacity: 165258731520 (153.91 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 9225617408 (8.59 GB)
DFS Remaining: 156033105920 (145.32 GB)
DFS Used%: 0.00%
DFS Remaining%: 94.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 06 17:08:16 CST 2021
Last Block Report: Tue Jul 06 17:07:31 CST 2021
Num of Blocks: 0

Name: 192.168.0.47:9866 (node1)
Hostname: node1
Decommission Status : Normal
Configured Capacity: 163112296448 (151.91 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 10022166528 (9.33 GB)
DFS Remaining: 153090097152 (142.58 GB)
DFS Used%: 0.00%
DFS Remaining%: 93.86%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Jul 06 17:08:15 CST 2021
Last Block Report: Tue Jul 06 12:13:35 CST 2021
Num of Blocks: 2

As shown, after the operation the report includes the DataNode on node4, which confirms the node was added successfully.
You can also run `hadoop fs -ls /` on node4 to browse the cluster data.
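To verify the new node programmatically rather than by eyeballing the report, the `hdfs dfsadmin -report` output can be grepped. A sketch over a saved sample that mirrors the output above (only the relevant lines are reproduced):

```shell
# Sketch: count live DataNodes in saved `hdfs dfsadmin -report` output
# and confirm node4 is among them.
report=$(mktemp)
cat > "$report" <<'EOF'
Live datanodes (4):

Name: 192.168.0.32:9866 (node2)
Name: 192.168.0.33:9866 (node3)
Name: 192.168.0.37:9866 (node4)
Name: 192.168.0.47:9866 (node1)
EOF

# Extract the count from the "Live datanodes (N):" header line...
live=$(sed -n 's/^Live datanodes (\([0-9]*\)):.*/\1/p' "$report")
# ...and check that node4 appears as one of them.
grep -q '(node4)' "$report" && echo "node4 is live ($live live datanodes)"
```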

II. Removing Node node4 - Method 1

As the business evolves and Hadoop servers reach the end of their service life, some inevitably have to be retired, so the cluster must be scaled in. Here we dynamically remove the DataNode and NodeManager on node4 from the existing 4-server Hadoop cluster. The steps are as follows:

1. Remove the DataNode and NodeManager on node4 by stopping their processes:
hdfs --daemon stop datanode
yarn --daemon stop nodemanager
# Force-kill any remaining Hadoop processes
ps -ef|grep hadoop|grep -v grep|awk '{print $2}'|xargs kill -9
2. Remove node4 from the Hadoop workers file on every server:
sed -i '/^node4$/d' /data/hadoop/etc/hadoop/workers ;cat /data/hadoop/etc/hadoop/workers
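The same `sed` deletion can be exercised and verified on a temporary copy before touching the real files; a sketch (temp file stands in for /data/hadoop/etc/hadoop/workers):

```shell
# Sketch: remove node4 from a workers file and verify it is gone.
workers=$(mktemp)
printf 'node1\nnode2\nnode3\nnode4\n' > "$workers"

sed -i '/^node4$/d' "$workers"    # same deletion as on the real file

if grep -qx 'node4' "$workers"; then
    echo "node4 still present"
else
    echo "node4 removed"
fi
```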
3. On node1, run the following to refresh the cluster node list and start the balancer:
hdfs dfsadmin -refreshNodes
/data/hadoop/sbin/start-balancer.sh
4. Check the HDFS node status:
hdfs dfsadmin -report

If the DataNode on node4 no longer appears in the output, the removal succeeded.

III. Removing Node node4 - Method 2

There is another way to dynamically remove DataNode and NodeManager nodes; it does not require removing the node4 entry from the workers file.

1. Remove the DataNode and NodeManager on node4 by stopping their processes:
hdfs --daemon stop datanode
yarn --daemon stop nodemanager
# Force-kill any remaining Hadoop processes
ps -ef|grep hadoop|grep -v grep|awk '{print $2}'|xargs kill -9
2. On node1, edit hdfs-site.xml: reduce the dfs.replication factor as appropriate, and add a dfs.hosts.exclude property as shown below:

vim /data/hadoop/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/data/hadoop/data_name1,/data/hadoop/data_name2</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/data/hadoop/data_1,/data/hadoop/data_2</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>                                          <!-- add this line -->
    <name>dfs.hosts.exclude</name>                  <!-- add this line -->
    <value>/data/hadoop/etc/hadoop/excludes</value> <!-- add this line -->
</property>                                         <!-- add this line -->
</configuration>
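A quick sanity check that `dfs.hosts.exclude` points at an existing file helps avoid surprises when refreshing nodes later. A sketch that pulls the value out of the config with grep/sed (run here against a temp copy of the relevant fragment; in practice you would point it at the real hdfs-site.xml):

```shell
# Sketch: extract the dfs.hosts.exclude value and check the file exists.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<property>
    <name>dfs.hosts.exclude</name>
    <value>/data/hadoop/etc/hadoop/excludes</value>
</property>
EOF

# Grab the <value> on the line following the dfs.hosts.exclude <name>.
excludes=$(grep -A1 'dfs.hosts.exclude' "$conf" | sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p')
echo "excludes file: $excludes"
[ -f "$excludes" ] || echo "warning: $excludes does not exist yet"
```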
3. In the /data/hadoop/etc/hadoop/ directory on node1, create an excludes file and put the hostname or IP address of the node to be removed (node4) in it:
vim /data/hadoop/etc/hadoop/excludes
node4


Note: the owner of /data/hadoop/etc/hadoop/excludes must match the other files in that directory. In my setup they are owned by the docker user rather than hadoop; check the ownership of the sibling files and adjust accordingly:
chown docker:docker /data/hadoop/etc/hadoop/excludes
4. Refresh the nodes: on node1, run the following to refresh the cluster node list and start the balancer.
hdfs dfsadmin -refreshNodes
/data/hadoop/sbin/start-balancer.sh
5. Check the HDFS node status:
hdfs dfsadmin -report

The removal succeeded once node4 either no longer appears in the report or is listed as decommissioned / under Dead datanodes.
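With the excludes-file method, the report typically shows node4 first as "Decommission in progress" and then "Decommissioned". That state can be read out of a saved report; a sketch over a sample that mirrors the report format shown earlier:

```shell
# Sketch: read node4's decommission state from saved report output.
report=$(mktemp)
cat > "$report" <<'EOF'
Name: 192.168.0.37:9866 (node4)
Hostname: node4
Decommission Status : Decommissioned
EOF

# Take the two lines after node4's Name line and pull out the status.
state=$(grep -A2 '(node4)' "$report" | sed -n 's/^Decommission Status : //p')
echo "node4: $state"
```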

This method also achieves dynamic removal of DataNode and NodeManager nodes.

---------------------------end
