Hadoop大数据应用:HDFS 集群节点扩容

本文详细描述了在HDFS集群扩容过程中,从环境配置、节点添加、IP调整、安全设置、rsync同步文件以及遇到的rsync同步报错及其原因分析和解决步骤,包括主机名修正和带宽平衡的设置。
摘要由CSDN通过智能技术生成

目录

 一、实验

1.环境

2.HDFS 集群节点扩容

二、问题

1.rsync 同步报错


 一、实验

1.环境

(1)主机

表1  主机

主机架构软件版本IP备注
hadoop

NameNode (已部署)

SecondaryNameNode (已部署)

ResourceManager(已部署)

hadoop

2.7.7192.168.204.50

node01

DataNode(已部署)

NodeManager(已部署)

hadoop

2.7.7192.168.204.51
node02

DataNode(已部署)

NodeManager(已部署)

hadoop

2.7.7192.168.204.52
node03

DataNode(已部署)

NodeManager(已部署)

hadoop

2.7.7192.168.204.53
node04

DataNode

hadoop

2.7.7192.168.204.54

(2)查看jps

hadoop节点

[root@hadoop hadoop]# jps

node01节点

node02节点

node03节点

(3) 查看节点

[root@hadoop hadoop]# ./bin/yarn node -list
24/03/14 13:40:21 INFO client.RMProxy: Connecting to ResourceManager at hadoop/192.168.204.50:8032
Total Nodes:3
         Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers
    node01:40551                RUNNING       node01:8042                                  0
    node02:46073                RUNNING       node02:8042                                  0
    node03:40601                RUNNING       node03:8042                                  0

2.HDFS 集群节点扩容

(1)查看IP

地址为192.168.204.54

[root@localhost ~]# ip addr

 (2)安全机制

查看

[root@localhost ~]# sestatus

关闭

[root@localhost ~]# vim /etc/selinux/config
……
SELINUX=disabled
……

再次查看(需要reboot重启)

[root@localhost ~]# sestatus

(3)防火墙

关闭

[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl mask firewalld

(4)安装java

[root@localhost ~]# yum install -y java-1.8.0-openjdk-devel.x86_64

查看

[root@localhost ~]# jps

 (5)修改主机名

[root@localhost ~]# hostnamectl set-hostname node04
[root@localhost ~]# bash

(6)添加免密登录

[root@hadoop ~]# cd /root/.ssh/
[root@hadoop .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
[root@hadoop .ssh]# ssh-copy-id -i id_rsa.pub 192.168.204.54

验证:

[root@hadoop .ssh]# ssh 192.168.204.54

 (7)域名主机名(hadoop节点)

[root@hadoop ~]# vim /etc/hosts
……
192.168.205.50 hadoop
192.168.205.51 node01
192.168.205.52 node02
192.168.205.53 node03
192.168.204.54 node04

(8)同步域名配置文件

[root@hadoop ~]# rsync -av /etc/hosts node01:/etc/
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  266.67 bytes/sec
total size is 269  speedup is 0.67
[root@hadoop ~]# rsync -av /etc/hosts node02:/etc/
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  800.00 bytes/sec
total size is 269  speedup is 0.67
[root@hadoop ~]# rsync -av /etc/hosts node03:/etc/
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  800.00 bytes/sec
total size is 269  speedup is 0.67
[root@hadoop ~]# rsync -av /etc/hosts node04:/etc/
Warning: Permanently added 'node04' (ECDSA) to the list of known hosts.
sending incremental file list
hosts

sent 359 bytes  received 41 bytes  266.67 bytes/sec
total size is 269  speedup is 0.67

(9)同步Hadoop文件

[root@hadoop ~]# rsync -aXSH --delete /usr/local/hadoop node04:/usr/local/

(10) 清除日志(node04节点)

[root@node04 ~]# cd /usr/local/hadoop/
[root@node04 hadoop]# ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share
[root@node04 hadoop]# cd logs/
[root@node04 logs]# ls
hadoop-root-namenode-hadoop.log    hadoop-root-secondarynamenode-hadoop.log    SecurityAuth-root.audit
hadoop-root-namenode-hadoop.out    hadoop-root-secondarynamenode-hadoop.out    yarn-root-resourcemanager-hadoop.log
hadoop-root-namenode-hadoop.out.1  hadoop-root-secondarynamenode-hadoop.out.1  yarn-root-resourcemanager-hadoop.out
[root@node04 logs]# rm -f *
[root@node04 logs]# ls

(11)查看slaves  (hadoop节点)

[root@hadoop ~]# cd /usr/local/hadoop/etc/hadoop/
[root@hadoop hadoop]# cat slaves

(12)添加slaves

 [root@hadoop hadoop]# vim slaves
  node01
  node02
  node03
  node04

(13)同步配置到所有主机

[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node01:/usr/local/hadoop/
[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node02:/usr/local/hadoop/
[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node03:/usr/local/hadoop/
[root@hadoop hadoop]# rsync -aXSH --delete /usr/local/hadoop/etc node04:/usr/local/hadoop/

(14)启动服务 (node04节点)

[root@node04 hadoop]# ./sbin/hadoop-daemon.sh start datanode

查看jps

(15) 验证 (hadoop节点)

查看报告,Live datanodes 显示节点为4个。

[root@hadoop hadoop]# ./bin/hdfs dfsadmin -report
Configured Capacity: 822126559232 (765.67 GB)
Present Capacity: 798787727360 (743.93 GB)
DFS Remaining: 798786990080 (743.93 GB)
DFS Used: 737280 (720 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (4):

Name: 192.168.204.54:50010 (node04)
Hostname: node04
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 5658746880 (5.27 GB)
DFS Remaining: 199872888832 (186.15 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:23 CST 2024


Name: 192.168.204.53:50010 (node03)
Hostname: node03
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 266240 (260 KB)
Non DFS Used: 5621547008 (5.24 GB)
DFS Remaining: 199909826560 (186.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.26%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:24 CST 2024


Name: 192.168.204.51:50010 (node01)
Hostname: node01
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 180224 (176 KB)
Non DFS Used: 6029209600 (5.62 GB)
DFS Remaining: 199502249984 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:22 CST 2024


Name: 192.168.204.52:50010 (node02)
Hostname: node02
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 286720 (280 KB)
Non DFS Used: 6029328384 (5.62 GB)
DFS Remaining: 199502024704 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:00:25 CST 2024

(16)查看命令

设置带宽命令为 -setBalancerBandwidth

[root@hadoop hadoop]# ./bin/hdfs dfsadmin
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
        [-report [-live] [-dead] [-decommissioning]]
        [-safemode <enter | leave | get | wait>]
        [-saveNamespace]
        [-rollEdits]
        [-restoreFailedStorage true|false|check]
        [-refreshNodes]
        [-setQuota <quota> <dirname>...<dirname>]
        [-clrQuota <dirname>...<dirname>]
        [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
        [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
        [-finalizeUpgrade]
        [-rollingUpgrade [<query|prepare|finalize>]]
        [-refreshServiceAcl]
        [-refreshUserToGroupsMappings]
        [-refreshSuperUserGroupsConfiguration]
        [-refreshCallQueue]
        [-refresh <host:ipc_port> <key> [arg1..argn]
        [-reconfig <datanode|...> <host:ipc_port> <start|status>]
        [-printTopology]
        [-refreshNamenodes datanode_host:ipc_port]
        [-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]
        [-setBalancerBandwidth <bandwidth in bytes per second>]
        [-fetchImage <local directory>]
        [-allowSnapshot <snapshotDir>]
        [-disallowSnapshot <snapshotDir>]
        [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
        [-getDatanodeInfo <datanode_host:ipc_port>]
        [-metasave filename]
        [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
        [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

(17)设置带宽平衡数据

000为KB,000000为MB,

500+000000 为500MB

[root@hadoop hadoop]# ./bin/hdfs dfsadmin -setBalancerBandwidth 500000000

执行脚本

[root@hadoop hadoop]# ./sbin/start-balancer.sh

(18)查看状态

DFS Used 为使用情况

[root@hadoop hadoop]# ./bin/hdfs dfsadmin -report
Configured Capacity: 822126559232 (765.67 GB)
Present Capacity: 798788423680 (743.93 GB)
DFS Remaining: 798787682304 (743.93 GB)
DFS Used: 741376 (724 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (4):

Name: 192.168.204.54:50010 (node04)
Hostname: node04
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 5658730496 (5.27 GB)
DFS Remaining: 199872901120 (186.15 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:33 CST 2024


Name: 192.168.204.53:50010 (node03)
Hostname: node03
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 266240 (260 KB)
Non DFS Used: 5620936704 (5.23 GB)
DFS Remaining: 199910436864 (186.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.27%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:33 CST 2024


Name: 192.168.204.51:50010 (node01)
Hostname: node01
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 180224 (176 KB)
Non DFS Used: 6029176832 (5.62 GB)
DFS Remaining: 199502282752 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:34 CST 2024


Name: 192.168.204.52:50010 (node02)
Hostname: node02
Decommission Status : Normal
Configured Capacity: 205531639808 (191.42 GB)
DFS Used: 286720 (280 KB)
Non DFS Used: 6029291520 (5.62 GB)
DFS Remaining: 199502061568 (185.80 GB)
DFS Used%: 0.00%
DFS Remaining%: 97.07%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 14 15:16:34 CST 2024

二、问题

1.rsync 同步报错

(1)报错

(2)原因分析

同步主机名称错误。

(3)解决方法

修改同步主机名称。

[root@hadoop ~]# rsync -av /etc/hosts node01:/etc/

  • 32
    点赞
  • 23
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值