Changing server IPs in a big data cluster

【Background】

Next week the servers of the big data open platform will be relocated to a new machine room. The platform has 90 physical machines, 24 of which were added later during an expansion; their IP segment, 19.126.66.*, is shared with another cluster. Since the machine-room deployment plan moves machines in batches by network segment, the IPs of these 24 servers must be changed before the relocation.

The IP change will be implemented this Thursday, so today we validated the plan in the test environment by changing the IP of one compute node. Source IP: 146.32.19.25; target IP: 146.32.18.100.

【0. Stop the node's roles in CM】

【1. Modify the IP configuration file】

Move the old IP configuration file to /tmp:

d0305001:/etc/sysconfig/network # cat ifcfg-vlan119

BOOTPROTO='static'

BROADCAST=''

ETHERDEVICE='bond0'

ETHTOOL_OPTIONS=''

IPADDR='146.32.19.25/24'

MTU=''

NAME=''

NETWORK=''

REMOTE_IPADDR=''

STARTMODE='auto'

USERCONTROL='no'

VLAN_ID='119'

d0305001:/etc/sysconfig/network # mv ifcfg-vlan119 /tmp/

Create the new IP configuration file ifcfg-vlan118:

d0305001:/etc/sysconfig/network # cat ifcfg-vlan118

BOOTPROTO='static'

BROADCAST=''

ETHERDEVICE='bond0'

ETHTOOL_OPTIONS=''

IPADDR='146.32.18.100/24'

MTU=''

NAME=''

NETWORK=''

REMOTE_IPADDR=''

STARTMODE='auto'

USERCONTROL='no'

VLAN_ID='118'
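Since the new file differs from the old one only in IPADDR and VLAN_ID, it can be derived with sed instead of retyped. A minimal sketch, run against a scratch copy rather than the live /etc/sysconfig/network directory:

```shell
# Work on a copy first; the real files live in /etc/sysconfig/network.
workdir=$(mktemp -d)
cat > "$workdir/ifcfg-vlan119" <<'EOF'
BOOTPROTO='static'
ETHERDEVICE='bond0'
IPADDR='146.32.19.25/24'
STARTMODE='auto'
VLAN_ID='119'
EOF

# Derive the new config: swap the IP and the VLAN ID, leave everything else.
sed -e "s|^IPADDR=.*|IPADDR='146.32.18.100/24'|" \
    -e "s|^VLAN_ID=.*|VLAN_ID='118'|" \
    "$workdir/ifcfg-vlan119" > "$workdir/ifcfg-vlan118"

grep '^IPADDR' "$workdir/ifcfg-vlan118"   # IPADDR='146.32.18.100/24'
```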

【2. Modify the route configuration file】

d0305001:/etc/sysconfig/network # vi routes

default 146.32.19.254 - -

Change it to:

default 146.32.18.254 - -
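The one-line gateway change can also be scripted. A sketch on a scratch copy (the real file is /etc/sysconfig/network/routes):

```shell
# Scratch copy standing in for /etc/sysconfig/network/routes.
routes=$(mktemp)
echo 'default 146.32.19.254 - -' > "$routes"

# Point the default route at the gateway of the new segment.
sed -i 's/^default 146\.32\.19\.254/default 146.32.18.254/' "$routes"

cat "$routes"   # default 146.32.18.254 - -
```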

【3. Restart the network service】

service network restart

d0305001:/etc/sysconfig/network # ip a|grep global

    inet 146.32.18.100/24 brd 146.32.18.255 scope global vlan118

    inet 146.33.18.100/24 brd 146.33.18.255 scope global vlan218
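The check above can be scripted so a batch change over 24 hosts fails fast if an address did not come up. This sketch parses a captured sample of the output; on the host you would pipe `ip a` directly instead of the sample string:

```shell
# Sample line captured from `ip a | grep global` above.
sample='    inet 146.32.18.100/24 brd 146.32.18.255 scope global vlan118'

# Extract the address field and confirm it matches the target IP.
addr=$(echo "$sample" | awk '/scope global/ {print $2}')
echo "$addr"   # 146.32.18.100/24
```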

【4. Update the NTP configuration】

In /etc/ntp.conf, change the old gateway address 146.32.19.254 to 146.32.18.254, the gateway for the new IP, then restart the NTP service:

d0305001:/etc/sysconfig/network # service ntp restart

Shutting down network time protocol daemon (NTPD)                                                                        done

Starting network time protocol daemon (NTPD)                                                                            done
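The ntp.conf edit is another one-line sed. A sketch on a hypothetical minimal file (the real one is /etc/ntp.conf, followed by the `service ntp restart` shown above):

```shell
# Hypothetical one-line stand-in for /etc/ntp.conf.
ntpconf=$(mktemp)
echo 'server 146.32.19.254' > "$ntpconf"

# Swap the old gateway for the new one, with the dots escaped.
sed -i 's/146\.32\.19\.254/146.32.18.254/g' "$ntpconf"

cat "$ntpconf"   # server 146.32.18.254
```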

【5. Update /etc/hosts on all cluster nodes and clients】

cp /etc/hosts /etc/hosts.0107

sed -i 's/146\.32\.19\.25\b/146.32.18.100/g' /etc/hosts
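Because this edit has to reach every node and client, it is worth wrapping in a small function. A sketch, demonstrated on a scratch file; `update_hosts` and `all_nodes.txt` are illustrative names, not part of the original change plan:

```shell
# Edit one hosts file in place, keeping a dated backup as in the plan above.
update_hosts() {            # $1 = path to a hosts file
  cp "$1" "$1.0107"
  sed -i 's/146\.32\.19\.25\b/146.32.18.100/' "$1"
}

# Demo on a scratch copy instead of the real /etc/hosts:
hosts=$(mktemp)
printf '146.32.19.25 d0305001\n146.32.19.26 d0305002\n' > "$hosts"
update_hosts "$hosts"
head -1 "$hosts"   # 146.32.18.100 d0305001

# Across the cluster this would be pushed over ssh, e.g.:
#   for h in $(cat all_nodes.txt); do ssh "$h" "cp /etc/hosts /etc/hosts.0107 \
#     && sed -i 's/146\.32\.19\.25\b/146.32.18.100/' /etc/hosts"; done
```

Note the anchored, dot-escaped pattern: a bare `s/19.25/18.100/g` would also rewrite any other entry containing that substring (for example 146.32.19.250).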

【6. Restart the agent service on this node】

service cloudera-scm-agent restart

Then start the node's roles in CM.

【7. Verification】

After the node's roles started, they reported losing the connection to the NameNode.

The log shows: Datanode denied communication with namenode because the host is not in the include-list: DatanodeRegistration(146.32.18.100…

d0305001:/var/log/hadoop-hdfs # tail -100 hadoop-cmf-hdfs-DATANODE-d0305001.log.out

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1714)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)

2019-01-07 10:50:21,141 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1060838331-146.249.31.13-1489136106065 (Datanode Uuid 5538360a-f138-42f2-b219-2b4993c6de2a) service to d0305004/146.32.19.28:8022 beginning handshake with NN

2019-01-07 10:50:21,143 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool BP-1060838331-146.249.31.13-1489136106065 (Datanode Uuid 5538360a-f138-42f2-b219-2b4993c6de2a) service to d0305004/146.32.19.28:8022 Datanode denied communication with namenode because the host is not in the include-list: DatanodeRegistration(146.32.18.100, datanodeUuid=5538360a-f138-42f2-b219-2b4993c6de2a, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=cluster14;nsid=314642609;c=0)

        at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:915)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5143)

        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1162)

        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:100)

        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:29184)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1714)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)

2019-01-07 10:50:21,151 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1060838331-146.249.31.13-1489136106065 (Datanode Uuid 5538360a-f138-42f2-b219-2b4993c6de2a) service to d0305005/146.32.19.29:8022 beginning handshake with NN

2019-01-07 10:50:21,152 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool BP-1060838331-146.249.31.13-1489136106065 (Datanode Uuid 5538360a-f138-42f2-b219-2b4993c6de2a) service to d0305005/146.32.19.29:8022 Datanode denied communication with namenode because the host is not in the include-list: DatanodeRegistration(146.32.18.100, datanodeUuid=5538360a-f138-42f2-b219-2b4993c6de2a, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=cluster14;nsid=314642609;c=0)

        at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:915)

        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:5143)

        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1162)

        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:100)

        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:29184)

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)

        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)

        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1714)

        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)

Root cause: this kind of failure is almost always due to a problem with the host IP. The error above shows that the active NameNode refused the connection.

Log in to the NameNode host and look for the allow-list files:

d0305004:~ # find / -name *allow.txt

find: `/proc/29004': No such file or directory

/var/run/cloudera-scm-agent/process/6938-yarn-RESOURCEMANAGER-refresh/nodes_allow.txt

/var/run/cloudera-scm-agent/process/6930-namenodes-failover/dfs_hosts_allow.txt

/var/run/cloudera-scm-agent/process/6929-hdfs-NAMENODE-safemode-wait/dfs_hosts_allow.txt

/var/run/cloudera-scm-agent/process/6927-hdfs-NAMENODE-nnRpcWait/dfs_hosts_allow.txt

/var/run/cloudera-scm-agent/process/6926-hdfs-NAMENODE/dfs_hosts_allow.txt

/var/run/cloudera-scm-agent/process/6924-hdfs-NAMENODE-jnSyncWait/dfs_hosts_allow.txt

/var/run/cloudera-scm-agent/process/6920-hdfs-NAMENODE-jnSyncWait/dfs_hosts_allow.txt

/var/run/cloudera-scm-agent/process/6916-hdfs-NAMENODE-jnSyncWait/dfs_hosts_allow.txt

/var/run/cloudera-scm-agent/process/6416-yarn-RESOURCEMANAGER/nodes_allow.txt

/var/run/cloudera-scm-agent/process/6368-hdfs-NAMENODE/dfs_hosts_allow.txt

d0305004:~ # cat /var/run/cloudera-scm-agent/process/6368-hdfs-NAMENODE/dfs_hosts_allow.txt

146.33.19.13

146.32.19.14

146.32.19.15

146.32.19.16

146.32.19.17

146.32.19.18

146.32.19.20

146.32.19.22

146.32.19.23

146.32.19.24

146.32.19.25

146.32.19.26

146.32.19.27

146.32.19.28

146.32.19.30

Since the error indicates that the active NameNode refused the connection, manually refresh the host list on the NameNode:

hadoop dfsadmin -fs hdfs://146.32.19.28:8020 -refreshNodes   # 146.32.19.28 is the IP of the active NameNode

After the refresh, the newly added address is visible on the active NameNode as well:

d0305004:~ # cd /var/run/cloudera-scm-agent/process/

d0305004:/var/run/cloudera-scm-agent/process # ls -ltr

total 0

drwxr-x--x 3 zookeeper zookeeper 280 Nov  1 15:58 6346-zookeeper-server

drwxr-x--x 3 hdfs      hdfs      360 Nov  1 15:59 6353-hdfs-DATANODE

drwxr-x--x 3 hbase    hbase    360 Nov  1 16:00 6372-hbase-MASTER

drwxr-x--x 3 yarn      hadoop    440 Nov  1 16:00 6407-yarn-NODEMANAGER

drwxr-x--x 3 yarn      hadoop    500 Nov  1 16:00 6416-yarn-RESOURCEMANAGER

drwxr-x--x 5 solr      solr      280 Nov  1 16:00 6400-solr-SOLR_SERVER

drwxr-x--x 4 hive      hive      340 Nov  1 16:00 6417-hive-HIVESERVER2

drwxr-x--x 4 hive      hive      300 Nov  1 16:00 6418-hive-HIVEMETASTORE

drwxr-xr-x 4 root      root      100 Nov 12 11:01 ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_6266238222486433408

drwxr-xr-x 4 root      root      100 Nov 12 11:02 ccdeploy_hive-conf_etchiveconf.cloudera.hive_-1465732137655581486

drwxr-x--x 3 root      root      140 Nov 14 12:11 6533-host-inspector

drwxr-x--x 4 root      root      140 Nov 14 12:11 6511-collect-host-statistics

drwxr-x--x 3 root      root      140 Nov 21 12:12 6581-host-inspector

drwxr-x--x 4 root      root      140 Nov 21 12:12 6559-collect-host-statistics

drwxr-x--x 3 root      root      140 Nov 28 12:13 6629-host-inspector

drwxr-x--x 4 root      root      140 Nov 28 12:13 6607-collect-host-statistics

drwxr-x--x 3 root      root      140 Dec  5 12:14 6677-host-inspector

drwxr-x--x 4 root      root      140 Dec  5 12:14 6655-collect-host-statistics

drwxr-x--x 3 root      root      140 Dec 12 12:15 6726-host-inspector

drwxr-x--x 4 root      root      140 Dec 12 12:15 6704-collect-host-statistics

drwxr-x--x 3 root      root      140 Dec 19 12:16 6774-host-inspector

drwxr-x--x 4 root      root      140 Dec 19 12:16 6752-collect-host-statistics

drwxr-x--x 3 root      root      140 Dec 26 12:17 6822-host-inspector

drwxr-x--x 4 root      root      140 Dec 26 12:17 6800-collect-host-statistics

drwxr-x--x 3 root      root      140 Jan  2 12:18 6870-host-inspector

drwxr-x--x 4 root      root      140 Jan  2 12:18 6848-collect-host-statistics

drwxr-x--x 3 hdfs      hdfs      340 Jan  7 10:54 6355-hdfs-JOURNALNODE

drwxr-x--x 3 hdfs      hdfs      320 Jan  7 10:54 6917-hdfs-JOURNALNODE

drwxr-x--x 3 hdfs      hdfs      500 Jan  7 10:54 6916-hdfs-NAMENODE-jnSyncWait

drwxr-x--x 3 hdfs      hdfs      500 Jan  7 10:55 6920-hdfs-NAMENODE-jnSyncWait

drwxr-x--x 3 hdfs      hdfs      500 Jan  7 10:55 6924-hdfs-NAMENODE-jnSyncWait

drwxr-x--x 3 hdfs      hdfs      500 Jan  7 10:55 6368-hdfs-NAMENODE

drwxr-x--x 3 hdfs      hdfs      480 Jan  7 10:55 6926-hdfs-NAMENODE

drwxr-x--x 3 hdfs      hdfs      480 Jan  7 10:56 6927-hdfs-NAMENODE-nnRpcWait

drwxr-x--x 3 hdfs      hdfs      380 Jan  7 10:56 6362-hdfs-FAILOVERCONTROLLER

drwxr-x--x 3 hdfs      hdfs      360 Jan  7 10:56 6928-hdfs-FAILOVERCONTROLLER

drwxr-x--x 3 hdfs      hdfs      480 Jan  7 10:56 6929-hdfs-NAMENODE-safemode-wait

drwxr-x--x 3 hdfs      hdfs      480 Jan  7 10:57 6930-namenodes-failover

drwxr-xr-x 4 root      root      100 Jan  7 10:57 ccdeploy_hadoop-conf_etchadoopconf.cloudera.hdfs_1239954674294922633

drwxr-xr-x 4 root      root      120 Jan  7 10:57 ccdeploy_hadoop-conf_etchadoopconf.cloudera.hdfs_2490906708984413108

drwxr-x--x 3 yarn      hadoop    500 Jan  7 11:00 6938-yarn-RESOURCEMANAGER-refresh

d0305004:/var/run/cloudera-scm-agent/process # cd 6926-hdfs-NAMENODE

d0305004:/var/run/cloudera-scm-agent/process/6926-hdfs-NAMENODE # ls

cloudera-monitor.properties                  dfs_hosts_exclude.txt      http-auth-signature-secret  ssl-server.xml

cloudera-stack-monitor.properties            event-filter-rules.json    log4j.properties            supervisor.conf

cloudera_manager_agent_fencer.py              hadoop-metrics2.properties  logs                        topology.map

cloudera_manager_agent_fencer_secret_key.txt  hadoop-policy.xml          navigator.client.properties  topology.py

core-site.xml                                hdfs-site.xml              redaction-rules.json

dfs_hosts_allow.txt                          hdfs.keytab                ssl-client.xml

d0305004:/var/run/cloudera-scm-agent/process/6926-hdfs-NAMENODE # cat dfs_hosts_allow.txt

146.33.19.13

146.32.19.14

146.32.19.15

146.32.19.16

146.32.19.17

146.32.19.18

146.32.19.20

146.32.19.22

146.32.19.23

146.32.19.24

146.32.18.100

146.32.19.26

146.32.19.27

146.32.19.28

146.32.19.30

d0305004:/var/run/cloudera-scm-agent/process/6926-hdfs-NAMENODE #

As the final step, run a cluster refresh from the CM web UI and the change is complete.

Of course, there is an even simpler option: a rolling restart of the HDFS service, as long as no jobs are running.
