Problem description
While deploying components with Ansible today, I ran into the following error:
Failed to connect to the host via ssh: ssh_exchange_identification: read: Connection reset by peer
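Troubleshooting
My first instinct was to ping the target machine, and it responded. I then tried to ssh to it directly and got the same error. So basic network connectivity was fine, and the problem was most likely at the application layer. Next I checked the configuration on the target machine.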
vi /etc/hosts.deny
#
# hosts.deny This file contains access rules which are used to
# deny connections to network services that either use
# the tcp_wrappers library or that have been
# started through a tcp_wrappers-enabled xinetd.
#
# The rules in this file can also be set up in
# /etc/hosts.allow with a 'deny' option instead.
#
# See 'man 5 hosts_options' and 'man 5 hosts_access'
# for information on rule syntax.
# See 'man tcpd' for information on tcp_wrappers
#
hosts.deny contained no rules, so I added an explicit allow rule to hosts.allow and tried again.
vi /etc/hosts.allow
#
# hosts.allow This file contains access rules which are used to
# allow or deny connections to network services that
# either use the tcp_wrappers library or that have been
# started through a tcp_wrappers-enabled xinetd.
#
# See 'man 5 hosts_options' and 'man 5 hosts_access'
# for information on rule syntax.
# See 'man tcpd' for information on tcp_wrappers
#
sshd: ALL
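Both files are mostly comments, so it is easy to miss whether any rule is actually in effect. A minimal sketch for printing only the active rules follows; the function name `active_rules` is made up for illustration. Note also that these files only affect sshd when the binary is linked against libwrap, which can be checked with `ldd /usr/sbin/sshd | grep libwrap`.

```shell
# Print only the active rules of a tcp_wrappers file,
# i.e. drop comment lines and blank lines.
# (active_rules is a hypothetical helper name, not part of any tool.)
active_rules() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

For example, `active_rules /etc/hosts.deny` prints nothing for the file shown above, and `active_rules /etc/hosts.allow` prints only `sshd: ALL`.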
After restarting the service the connection still failed, so I tried some debugging.
ssh -v 10.0.223.115
OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 56: Applying options for *
debug1: Connecting to 10.0.223.115 [10.0.223.115] port 22.
debug1: Connection established.
debug1: permanently_set_uid: 0/0
debug1: identity file /root/.ssh/id_rsa type 1
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1
ssh_exchange_identification: read: Connection reset by peer
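The client-side debug output shows nothing beyond the reset, so on the target machine I stopped the service and ran sshd in debug mode.
service sshd stop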
/usr/sbin/sshd -d
debug1: sshd version OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013
debug1: key_parse_private2: missing begin marker
debug1: read PEM private key done: type RSA
debug1: private host key: #0 type 1 RSA
debug1: key_parse_private2: missing begin marker
debug1: read PEM private key done: type ECDSA
debug1: private host key: #1 type 3 ECDSA
debug1: private host key: #2 type 4 ED25519
debug1: rexec_argv[0]='/usr/sbin/sshd'
debug1: rexec_argv[1]='-d'
Set /proc/self/oom_score_adj from 0 to -1000
debug1: Bind to port 22 on 0.0.0.0.
Server listening on 0.0.0.0 port 22.
debug1: Bind to port 22 on ::.
Server listening on :: port 22.
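Still nothing useful. I decided to start the service again and rerun the test to look at the error, but this time sshd would not start at all.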
service sshd start
Redirecting to /bin/systemctl start sshd.service
Job for sshd.service failed because the control process exited with error code. See "systemctl status sshd.service" and "journalctl -xe" for details.
systemctl status sshd.service
● sshd.service - OpenSSH server daemon
Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since 五 2022-01-07 11:09:53 CST; 16s ago
Docs: man:sshd(8)
man:sshd_config(5)
Process: 11151 ExecStart=/usr/sbin/sshd -D $OPTIONS (code=exited, status=1/FAILURE)
Main PID: 11151 (code=exited, status=1/FAILURE)
1月 07 11:09:53 oraclef systemd[1]: Failed to start OpenSSH server daemon.
1月 07 11:09:53 oraclef systemd[1]: Unit sshd.service entered failed state.
1月 07 11:09:53 oraclef systemd[1]: sshd.service failed.
journalctl -ex
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit sshd.service has begun shutting down.
1月 07 11:09:13 oraclef systemd[1]: Starting OpenSSH server daemon...
-- Subject: Unit sshd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit sshd.service has begun starting up.
1月 07 11:09:13 oraclef sshd[11116]: Couldn't open /dev/null: Permission denied
1月 07 11:09:13 oraclef systemd[1]: sshd.service: main process exited, code=exited, status=1/FAILURE
1月 07 11:09:13 oraclef systemd[1]: Failed to start OpenSSH server daemon.
-- Subject: Unit sshd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit sshd.service has failed.
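Most of the journal output above is systemd boilerplate; the one line that matters comes from the sshd process itself. A small sketch of a filter that keeps only those lines (the helper name `sshd_lines` is an assumption made for this example):

```shell
# Keep only journal lines written by the sshd process itself,
# i.e. lines tagged "sshd[<pid>]:", and drop the systemd messages.
# (sshd_lines is a hypothetical helper name.)
sshd_lines() {
    grep -E ' sshd\[[0-9]+\]: '
}
```

It can be used as `journalctl -u sshd.service --no-pager | sshd_lines`, which on this machine would leave the "Couldn't open /dev/null: Permission denied" line.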
It turns out sshd has no permission to open /dev/null.
cd /dev/
ll null
crw-r--r--. 1 root root 1, 3 1月 7 10:30 null
chmod 666 null
ll null
crw-rw-rw-. 1 root root 1, 3 1月 7 10:30 null
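On a normal system /dev/null is a character device that everyone can read and write. The check done by eye above can be sketched as a small function; the name `is_world_rw_chardev` is made up for this example, and it only inspects the first ten characters of the `ls -ld` output, so any SELinux or ACL marker is ignored.

```shell
# Return 0 only if the path is a character device whose mode is rw-rw-rw-.
# (is_world_rw_chardev is a hypothetical helper name.)
is_world_rw_chardev() {
    # -c is true only for character special files
    [ -c "$1" ] || return 1
    # First 10 characters of `ls -ld` are the file type plus mode bits
    perms=$(ls -ld "$1" | cut -c1-10)
    [ "$perms" = "crw-rw-rw-" ]
}
```

For example, `is_world_rw_chardev /dev/null || echo "/dev/null looks wrong"`.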
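With the permissions fixed, sshd started, but the original problem was still there. I then noticed the dot after the permission bits, which made me suspect SELinux, so I switched SELinux to permissive mode.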
getenforce
Enforcing
setenforce 0
getenforce
Permissive
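The dot after the permission bits is how GNU ls marks a file that carries an SELinux security context. The helper below is only an illustrative sketch (the name `has_selinux_marker` is an assumption) that checks an `ls -l` line for that marker; the context itself can be viewed with `ls -Z /dev/null`, and on a healthy system its type is typically `null_device_t`.

```shell
# Return 0 if an `ls -l` output line has a "." right after the
# ten file-type/mode characters, which GNU ls prints for files
# that have an SELinux security context.
# (has_selinux_marker is a hypothetical helper name.)
has_selinux_marker() {
    case "$1" in
        ??????????.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `has_selinux_marker "$(ls -ld /dev/null)" && echo "SELinux context present"`.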
Connecting again at this point made the problem obvious.
ssh root@10.0.223.115
The authenticity of host '10.0.223.115 (10.0.223.115)' can't be established.
ECDSA key fingerprint is 2b:d8:c9:ae:c7:0e:41:5e:e0:8b:05:11:c2:eb:18:5b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.223.115' (ECDSA) to the list of known hosts.
root@10.0.223.115's password:
PTY allocation request failed on channel 0
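Root cause
The problem now is that PTY allocation fails. Earlier, a disk had been mounted on this machine, and the mount point was under /dev/. At the time I only backed up the files in that directory and did not pay attention to what depended on them, which is what led to this. To fix the current ssh problem, it is enough to remount /dev/pts, as follows: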
umount /dev/pts
mount devpts /dev/pts -t devpts
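After remounting, it is worth confirming that /dev/pts really is a devpts filesystem. A minimal sketch, assuming the Linux `/proc/mounts` layout (device, mount point, filesystem type, options); the function name `devpts_mounted` and its optional file argument are made up for this example.

```shell
# Return 0 if a mounts table lists /dev/pts with filesystem type devpts.
# Reads /proc/mounts by default; another file can be passed for testing.
# (devpts_mounted is a hypothetical helper name.)
devpts_mounted() {
    mounts_file=${1:-/proc/mounts}
    awk '$2 == "/dev/pts" && $3 == "devpts" { found = 1 }
         END { exit !found }' "$mounts_file"
}
```

For example, `devpts_mounted && echo "devpts is mounted on /dev/pts"`.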
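My situation is more complicated, though: I need to repartition the space under /dev/, so I have to redo some of the storage setup on this machine to keep it usable in the long run. I won't go into those details here.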