cdh集群mysql数据库宕机_那些安装CDH集群过程中踩过的坑......(比较全)

一、登录Cloudera Manager (http://192.168.201.128:7180/cmf/login)时,无法访问web页面

针对此问题网上有较多的解决方案(e.g. https://www.cnblogs.com/zlslch/p/7078119.html), 如果还不能解决你的问题,请看下面的解决方案。

登录MySQL数据库(或利用Navicat),会发现有一个mysql数据库(下图所示),在mysql数据库中有一个user表,将User="root"的两条记录进行删除

select * from user;delete from user where User='root';

43fb6eede9c198a4c817bc7da1f15ea3.png

8d147141d7eefe5af856d77a2e2db23d.png

再次登录http://192.168.201.128:7180/cmf/login,发现登录成功!

0663f6a90b41c16bf2f0d91d54aa8d46.png

二、利用Navicat连接MySql数据库时,错误信息:Can't connect to MySQL server on 'xxxxx'(10038)

0cf79643aa8684e0418fd924e9da3cae.png

解决方案:

查看网络的端口信息:netstat -ntpl,下图状态为正常状态(不是请进行如下操作),如果没有netstat,在CentOS 7下利用yum -y install net-tools进行安装。

5ca66baa2a09d7e187965de1fb145284.png

查看防火墙的状态,发现3306的端口是丢弃状态:

iptables -vnL

这里要清除防火墙中链中的规则

iptables -F

再次连接MySql数据库,发现连接成功!

f2e3ed5a8a31f7df86c7b7e1bd610662.png

三、无法启动NameNode,查看日志发现如下错误...

Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot delete /tmp/hadoop-yarn/staging/hadoop/.staging/job_1490689337938_0001. Name node is in safe mode.

The reported blocks48 needs additional 5 blocks to reach the threshold 0.9990 of total blocks 53.

The number of live datanodes2 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1327)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3713)

at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:953)

at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:611)

at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)

什么是安全模式?

安全模式是HDFS所处的一种特殊状态,在这种状态下,文件系统只接受读数据请求,而不接受删除、修改等变更请求。在NameNode主节点启动时,HDFS首先进入安全模式,DataNode在启动的时候会向namenode汇报可用的block等状态,当整个系统达到安全标准时,HDFS自动离开安全模式。如果HDFS出于安全模式下,则文件block不能进行任何的副本复制操作,因此达到最小的副本数量要求是基于datanode启动时的状态来判定的,启动时不会再做任何复制(从而达到最小副本数量要求)原博文:https://blog.csdn.net/bingduanlbd/article/details/51900512。

1、集群升级维护时手动进入安全模式

hadoop dfsadmin -safemode enter

2、退出安全模式:

hadoop dfsadmin -safemode leave3、返回安全模式是否开启的信息

hadoop dfsadmin -safemode get

因此,当发现namenode处于安全模式,无法启动时,可以使用hadoop dfsadmin -safemode leave退出安全模式,重启namenode解决问题!

四、INFO hdfs.DFSClient: Exception in createBlockOutputStream  java.net.NoRouteToHostException: No route to host

16/07/27 01:29:26INFO hdfs.DFSClient: Exception in createBlockOutputStream

java.net.NoRouteToHostException: No route to host

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)

at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)

at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)

at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1537)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1313)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)16/07/27 01:29:26 INFO hdfs.DFSClient: Abandoning BP-555863411-172.16.95.100-1469590594354:blk_1073741825_100116/07/27 01:29:26 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[172.16.95.101:50010,DS-ee00e1f8-5143-4f06-9ef8-b0f862fce649,DISK]16/07/27 01:29:26INFO hdfs.DFSClient: Exception in createBlockOutputStream

java.net.NoRouteToHostException: No route to host

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)

at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)

at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)

at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1537)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1313)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)16/07/27 01:29:26 INFO hdfs.DFSClient: Abandoning BP-555863411-172.16.95.100-1469590594354:blk_1073741826_100216/07/27 01:29:26 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[172.16.95.102:50010,DS-eea51eda-0a07-4583-9eee-acd7fc645859,DISK]16/07/27 01:29:26WARN hdfs.DFSClient: DataStreamer Exception

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File/wc/mytemp/123._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in thisoperation.

at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)

at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)

at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)

at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

at org.apache.hadoop.ipc.Client.call(Client.java:1475)

at org.apache.hadoop.ipc.Client.call(Client.java:1412)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)

at com.sun.proxy.$Proxy9.addBlock(Unknown Source)

at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

at com.sun.proxy.$Proxy10.addBlock(Unknown Source)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)

put: File/wc/mytemp/123._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in thisoperation.

[hadoop@master bin]$ service firewall

The service command supports only basic LSB actions (start, stop, restart,try-restart, reload, force-reload, status). For other actions, please try to use systemctl.

检查防火墙是否关闭,如果没有关闭将所有节点的防火墙进行关闭。在CentOS 6中命令 service iptables stop,在CentOS 7中命令 service firewalld stop

检查所有的主机,/etc/selinux/config下的SELINUX,设置SELINUX=disabled。

再检测上述问题即可解决。为了防止防火墙开机重启,执行命令systemctl disable firewalld.service

五、hadoop 运行出错,发现是ClusterId不一致问题

进入/dfs/nn/current,利用cat VERSION,查看确认各节点的clusterID是否一致

如果不一致,将主节点的clusterID进行拷贝,并修改各不一致的子节点的clusterID,保存退出,即可解决问题!

74872d89a8ed06cd3e6e032aad1ed2d0.png

六、解决SecureCRT等软件连接Linux速度缓慢问题,(有时出现 The semaphore timeout period has expired)

编辑sshd_config文件 ---->vi /etc/ssh/sshd_config

在文件ssh_config中添加如下代码,并保存退出,重启service sshd restart 或重启 reboot 即可。

两种情况类似一块儿处理了.......,避免再出现问题

UseDns no

ClientAliveInterval 60

7799f19710ff7f40ac46eeca0f23efdd.png

七、登录MySQL5.7时出现ERROR 1045 (28000): Access denied for user 'root'@'localhost'

使用vi /etc/my.cnf,打开mysql配置文件,在文件中[mysqld]加入代码 skip-grant-tables, 退出并保存。

使用service mysql restart, 重启MySQL服务

然后再次进入到终端当中,敲入mysql -u root -p命令然后回车,当需要输入密码时,直接按enter键,便可以不用密码登录到数据库当中

update mysql.user set authentication_string=password('123456') where user='root';

再次进入到之前的配置文件中,将代码skip-grant-tables进行删除即可。

1829e6bb6efe3f174c982c5d6c13cd68.png

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值