Reproducing the problem:
/opt/cloudera/cm/schema/scm_prepare_database.sh -hhadoop01 --scm-host hadoop01 mysql scm scm 123456
After it ran, the scm database contained no tables and no data at all.
Root cause analysis
At first the script reported this error, and I figured simply granting the scm user access from hadoop01.xx.com would fix it.
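The grant I added looked roughly like the following sketch (MySQL 5.6 GRANT ... IDENTIFIED BY syntax; the host string and password are the ones used in this post, so adjust them for your setup). Writing the statements to a file first makes them easy to review before applying with mysql -uroot -p < grant_scm.sql:

```shell
# Sketch only: grant the scm user access from the CM host, then reload grants.
# Host and password values are assumptions carried over from this post.
cat > grant_scm.sql <<'SQL'
GRANT ALL PRIVILEGES ON scm.* TO 'scm'@'hadoop01.xx.com' IDENTIFIED BY '123456';
FLUSH PRIVILEGES;
SQL
```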
Then I ran it again:
/opt/cloudera/cm/schema/scm_prepare_database.sh -hhadoop01 --scm-host hadoop01 mysql scm scm 123456
This time it finished without reporting any error, but the scm database still had not been initialized with a single table; it was completely empty.
Ruled-out cause 1: MySQL connection or version problems
This puzzled me. The script's syntax was correct, and if the database were unreachable the script should have reported an error, yet it didn't. I checked anyway: mysql -uroot -p123456 got me into MySQL just fine, which ruled out a connection problem. What else, then? The version? The official docs list MySQL 5.6 as supported, and 5.6.46 is the most stable of the 5.6 releases I found on the MySQL download site, so I ruled out this cause as well.
Ruled-out cause 2: checking the official documentation
I kept turning it over: what could make the script finish without an exception yet leave the database empty? In my experience the next step for this kind of problem is the official documentation, and I have to say the Cloudera docs are very well written:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/prepare_cm_database.html
The parameters I used matched the documentation exactly, so the command itself checked out.
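For reference, here is how the command from this post maps onto the documented form; this is my reading of the linked page, and the leading echo makes it a dry run:

```shell
# Documented form: scm_prepare_database.sh [options] databaseType databaseName username [password]
#   -hhadoop01            option: host the database server runs on
#   --scm-host hadoop01   option: host the Cloudera Manager Server runs on
#   mysql scm scm 123456  positional: databaseType databaseName username password
# Dry run: echo prints the command instead of executing it.
echo /opt/cloudera/cm/schema/scm_prepare_database.sh -hhadoop01 --scm-host hadoop01 mysql scm scm 123456
```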
Ruled-out cause 3: script logs and Linux system-level logs
By now I was a bit stumped: everything checked out, so what on earth was the cause?
One thought kept nagging at me: the error output was simply too sparse, and I wanted more log information to work with. Then it occurred to me that the system is CentOS 7, which keeps a systemd journal, so I dumped it:
[root@hadoop01 ~]# journalctl -xe
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.122.1 on virbr0.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for virbr0.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::44a6:91bd:cc0a:c335 on ens192.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::3c7e:ac7c:5452:edc6 on ens192.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Host name conflict, retrying with hadoop01-8040
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::44a6:91bd:cc0a:c335 on ens192.*.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::3c7e:ac7c:5452:edc6 on ens192.*.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::cc18:552d:c34f:830b on ens192.*.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.12.101 on ens192.IPv4.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering HINFO record with values 'X86_64'/'LINUX'.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.122.1 on virbr0.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::44a6:91bd:cc0a:c335 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::3c7e:ac7c:5452:edc6 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::44a6:91bd:cc0a:c335 on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::cc18:552d:c34f:830b on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Server startup complete. Host name is hadoop01-8040.local. Local service cookie is 1979383840
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::cc18:552d:c34f:830b on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.12.101 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for lo.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for virbr0-nic.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.122.1 on virbr0.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for virbr0.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::44a6:91bd:cc0a:c335 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Host name conflict, retrying with hadoop01-8041
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::44a6:91bd:cc0a:c335 on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::3c7e:ac7c:5452:edc6 on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::cc18:552d:c34f:830b on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.12.101 on ens192.IPv4.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering HINFO record with values 'X86_64'/'LINUX'.
Oct 28 09:20:01 hadoop01 systemd[1]: Started Session 482 of user root.
-- Subject: Unit session-482.scope has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit session-482.scope has finished starting up.
--
-- The start-up result is done.
Oct 28 09:20:01 hadoop01 CROND[56675]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 28 09:20:16 hadoop01 CommAmqpListene[56536]: [CCafException] AmqpComm@[56536]: CommAmqpListener: [CCafException] AmqpCommon::validateSt
Suddenly I perked up, because this line spells it out:
Host name conflict, retrying with hadoop01-8040
A host name conflict. My script had been initializing the database by hostname, so as a first test I switched to the IP address.
(I also raised the issue with our ops engineer. He was not very forthcoming and would not say anything when asked, but when I ran history I could see he had never used the relevant commands, so my guess is that when he virtualized the host, the VM's IP configuration and hostname mapping were set up incorrectly.)
Sure enough, after re-running with the IP, the scm database finally had tables and data. I had to laugh: it had been a host-environment problem all along.
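The IP-based retry looked roughly like this; 192.168.12.101 is the ens192 address taken from the journal output above, so substitute your own host's IP (shown as a dry run with echo):

```shell
DB_HOST=192.168.12.101   # assumption: hadoop01's real address, per the journal above
# Dry run: drop the leading `echo` to actually execute the script.
echo /opt/cloudera/cm/schema/scm_prepare_database.sh -h"$DB_HOST" --scm-host "$DB_HOST" mysql scm scm 123456
```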
With that, I went back and re-checked the entire environment.
Once ops had resolved the host name conflict, I re-ran the script with the hostname, and this time everything worked.
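A quick sanity check along these lines (a sketch; the hostname is the one from this post) can confirm the conflict is really gone before re-running with the hostname:

```shell
# Each check should agree on a single name/IP pair; a hadoop01-804x suffix
# anywhere means avahi is still renaming the host due to a conflict.
hostname -f                                            # expect the real FQDN
grep -w hadoop01 /etc/hosts || echo "no /etc/hosts entry for hadoop01"
journalctl -u avahi-daemon 2>/dev/null | grep -c "Host name conflict" || echo "no conflicts logged"
```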
Summary:
Before the next cluster install, verify the host environment thoroughly first; any lingering problem there turns a simple task into twice the work.
In this line of work, stay open-minded: share what you know, and everyone improves together.