Cloudera SCM Server startup failure: analyzing and locating pam_unix(sshd:session) session closed for user root

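Yesterday I installed CDH Hadoop in a customer environment. The installation itself went fairly smoothly, but both the Cloudera SCM Server and Agent services failed to start.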

[root@YXnode01 ~]# service cloudera-scm-server restart
Restarting cloudera-scm-server (via systemctl):  Job for cloudera-scm-server.service failed because the control process exited with error code. See "systemctl status cloudera-scm-server.service" and "journalctl -xe" for details.
                                                           [FAILED]

Following the hint above, we ran "systemctl status cloudera-scm-server.service" to see the detailed error:

[root@YXnode01 ~]# systemctl status cloudera-scm-server.service
● cloudera-scm-server.service - LSB: Cloudera SCM Server
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-11-05 09:25:49 CST; 3min 32s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 15982 ExecStart=/etc/rc.d/init.d/cloudera-scm-server start (code=exited, status=1/FAILURE)

Nov 05 09:25:44 YXnode01.esgyn.cn systemd[1]: Starting LSB: Cloudera SCM Server...
Nov 05 09:25:44 YXnode01.esgyn.cn su[16015]: pam_unix(su:auth): auth could not identify password for [cloudera-scm]
Nov 05 09:25:44 YXnode01.esgyn.cn su[16015]: pam_succeed_if(su:auth): requirement "uid >= 1000" not met by user "cloudera-scm"
Nov 05 09:25:46 YXnode01.esgyn.cn su[16015]: FAILED SU (to cloudera-scm) root on none
Nov 05 09:25:49 YXnode01.esgyn.cn cloudera-scm-server[15982]: Starting cloudera-scm-server: [FAILED]
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service: control process exited, code=exited status=1
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: Failed to start LSB: Cloudera SCM Server.
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: Unit cloudera-scm-server.service entered failed state.
Nov 05 09:25:49 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service failed.
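
When the status output is long, it can help to pull out only the PAM-related lines. The following is a minimal sketch, assuming the unit's messages are held in the systemd journal; the helper name pam_lines is hypothetical and not part of any Cloudera tooling.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention a PAM module call
# such as "pam_unix(su:auth)" or the su failure summary "FAILED SU".
pam_lines() {
    grep -E 'pam_[a-z_]+\([^)]*\)|FAILED SU'
}

# Usage (assumes journald holds the unit's log):
#   journalctl -u cloudera-scm-server.service --no-pager | pam_lines
```

Applied to the status output above, this would leave the pam_unix, pam_succeed_if, and FAILED SU lines, which are the ones that matter for this problem.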

We also checked the Cloudera SCM Server log, which contained the following:

[root@YXnode01 ~]# tail -10f /var/log/cloudera-scm-server/cloudera-scm-server.out 
Password: su: Error in service module

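SELinux, the firewall, and ssh on the Hadoop nodes all checked out fine. Based on the specific error 'pam_succeed_if(su:auth): requirement "uid >= 1000" not met by user "cloudera-scm"', we suspected that the Linux system had some special security policy in place. A web search turned up an Alibaba Cloud article: https://help.aliyun.com/knowledge_detail/41491.html?spm=a2c6h.13066369.0.0.2edd1479fTjQLg
Following that article, we searched the /etc/pam.d directory for 'uid >= 1000' and found the following configuration files.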

[root@YXnode01 pam.d]# grep 'uid >= 1000' *
password-auth:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
password-auth-ac:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
system-auth:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
system-auth-ac:auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
[root@YXnode01 pam.d]# pwd
/etc/pam.d
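
A plain grep like the one above also matches lines that have already been commented out, which makes it hard to tell whether a rule is still in force after editing. The sketch below lists only the uncommented pam_succeed_if uid rules; the helper name active_uid_rules and its directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list uncommented pam_succeed_if rules that test the uid
# in every file of the given PAM directory (defaults to /etc/pam.d).
active_uid_rules() {
    dir=${1:-/etc/pam.d}
    # The first non-blank character must not be '#', so commented rules are skipped.
    grep -HE '^[[:space:]]*[^#[:space:]].*pam_succeed_if\.so.*uid[[:space:]]*[<>=]' "$dir"/* 2>/dev/null
}

# Usage:
#   active_uid_rules /etc/pam.d
#   id -u cloudera-scm    # compare the account's uid with the threshold in the rule
```

If id -u cloudera-scm prints a value below the threshold, the rule rejects the account, which matches the "requirement not met" message in the log.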

We commented out the lines above and tried starting the SCM Server service again. It still failed, but the error changed slightly: the earlier pam_succeed_if(su:auth): requirement "uid >= 1000" not met by user "cloudera-scm" message was gone, leaving FAILED SU (to cloudera-scm) root on none.

[root@YXnode01 ~]# systemctl status cloudera-scm-server.service
● cloudera-scm-server.service - LSB: Cloudera SCM Server
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-11-05 09:59:37 CST; 17s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 17469 ExecStart=/etc/rc.d/init.d/cloudera-scm-server start (code=exited, status=1/FAILURE)

Nov 05 09:59:32 YXnode01.esgyn.cn systemd[1]: Starting LSB: Cloudera SCM Server...
Nov 05 09:59:32 YXnode01.esgyn.cn su[17502]: pam_unix(su:auth): auth could not identify password for [cloudera-scm]
Nov 05 09:59:34 YXnode01.esgyn.cn su[17502]: FAILED SU (to cloudera-scm) root on none
Nov 05 09:59:37 YXnode01.esgyn.cn cloudera-scm-server[17469]: Starting cloudera-scm-server: [FAILED]
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service: control process exited, code=exited status=1
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: Failed to start LSB: Cloudera SCM Server.
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: Unit cloudera-scm-server.service entered failed state.
Nov 05 09:59:37 YXnode01.esgyn.cn systemd[1]: cloudera-scm-server.service failed.

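It turns out that when root runs service cloudera-scm-server start, the init script first switches to the cloudera-scm user to launch the server; that is, it runs su cloudera-scm during startup.
So we tested switching from root to the cloudera-scm user, and ran the same test in another, healthy environment. In this environment, running su cloudera-scm as root prompted for a password; in the healthy environment it did not.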

[root@YXnode01 ~]# su cloudera-scm
Password: 

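With this information, further searching pointed us to the /etc/pam.d/su file, so we compared /etc/pam.d/su in this environment with the one in a healthy environment. The difference is shown in the figure below.
[Figure: comparison of /etc/pam.d/su in this environment and in a healthy environment]
In this environment the file contained one extra line. We commented that line out to match the healthy environment's configuration and restarted the SCM Server service, which now started normally.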
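
The comparison can also be done on the command line instead of by eye. The following is a rough sketch that compares only the active lines of two PAM files; the helper names active_lines and pam_diff, and the path of the reference copy in the usage line, are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print only the active (non-comment, non-blank) lines
# of a PAM file, with runs of spaces and tabs squeezed to single spaces.
active_lines() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | tr -s ' \t' ' '
}

# Hypothetical helper: show active lines that differ between two PAM files.
# Exit status follows diff: 0 when identical, 1 when they differ.
pam_diff() {
    a=$(mktemp) && b=$(mktemp) || return 2
    active_lines "$1" > "$a"
    active_lines "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Usage (assumes /tmp/su.good is a copy of /etc/pam.d/su from a healthy host):
#   pam_diff /tmp/su.good /etc/pam.d/su
```

Lines prefixed with ">" in the output exist only in the second file, so an extra rule on the failing host shows up directly.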

[root@YXnode01 ~]# service cloudera-scm-server status
● cloudera-scm-server.service - LSB: Cloudera SCM Server
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-server; bad; vendor preset: disabled)
   Active: active (exited) since Tue 2019-11-05 11:29:54 CST; 15s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 19790 ExecStart=/etc/rc.d/init.d/cloudera-scm-server start (code=exited, status=0/SUCCESS)

Nov 05 11:29:49 YXnode01.esgyn.cn systemd[1]: Starting LSB: Cloudera SCM Server...
Nov 05 11:29:49 YXnode01.esgyn.cn su[19823]: (to cloudera-scm) root on none
Nov 05 11:29:54 YXnode01.esgyn.cn cloudera-scm-server[19790]: Starting cloudera-scm-server: [  OK  ]
Nov 05 11:29:54 YXnode01.esgyn.cn systemd[1]: Started LSB: Cloudera SCM Server.

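Let's look at this a bit further.
The line auth required pam_wheel.so group=wheel restricts su to members of the wheel group; it is typically used to keep users outside wheel from switching to root.
To harden a Linux system further, it is common to create an administrators group and allow only its members to run "su -" to become root; users in other groups cannot become root even if they run "su -" and enter the correct root password. On UNIX and Linux this group is usually named "wheel", and the restriction is configured in /etc/pam.d/su. Adding this line to the su file is therefore what prevented the su switch between root and cloudera-scm. The alternative to commenting the line out would be to satisfy the wheel check; we assumed this means adding cloudera-scm to the wheel group, although the pam_wheel documentation describes the check as applying to the user who invokes su, so this should be verified before relying on it.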
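
To check a host for this condition before starting the service, something like the following sketch may help. It tests whether an uncommented pam_wheel auth rule is present and whether a given account is in a given group; both helper names are hypothetical, and the check is only a rough text match, not a full PAM stack evaluation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given PAM file (default /etc/pam.d/su)
# contains an uncommented auth line that loads pam_wheel.so.
wheel_rule_active() {
    grep -Eq '^[[:space:]]*auth[[:space:]].*pam_wheel\.so' "${1:-/etc/pam.d/su}"
}

# Hypothetical helper: succeed if the named user belongs to the given group.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Usage:
#   wheel_rule_active && echo "pam_wheel is enforced for su"
#   in_group root wheel || echo "root is not in wheel"
```

Because pam_wheel is documented as checking the user who invokes su, the usage example tests root rather than cloudera-scm; adjust the account name to whichever one applies on your system.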
