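This is a record of tracking down a development error. I ran into this problem quite a while ago and chose to sidestep it at the time, using paramiko instead of the remoto module. Today I hit it again and, for the sake of learning, decided to study it properly. Reading through the source code showed that it came down to my own carelessness: I had overlooked a parameter. I am writing this down as a lesson to myself!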
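1. Problem background
While working with the ceph-deploy tool I came across a handy module for running commands remotely: remoto. When I later used the module in my own code, however, I ran into a problem, which is described in detail below.
Environment: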
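Three servers: R10-P01-DN-001.gd.cn, R10-P01-DN-002.gd.cn, and R10-P01-DN-003.gd.cn. Node 001 is the master node and has passwordless ssh access to all three nodes (001 through 003). At the moment python3 is installed on node 001, while nodes 002 and 003 only have python2.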
Reproducing the problem:
# Run under python3, connecting to node 001: works
[store@R10-P01-DN-001 redis-agent]$ python3
Python 3.6.10 (default, Jun 19 2020, 10:51:42)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import remoto
>>> from remoto.process import check
>>> conn = remoto.Connection('R10-P01-DN-001.gd.cn')
>>> check(conn, ['hostname'])
INFO:R10-P01-DN-001.gd.cn:Running command: hostname
(['R10-P01-DN-001.gd.cn'], [], 0)
# Run under python3, connecting to node 002 or 003: fails
>>> conn = remoto.Connection('R10-P01-DN-002.gd.cn')
bash: python3: command not found
ERROR:R10-P01-DN-001.gd.cn:Can't communicate with remote host, possibly because python3 is not installed there
Traceback (most recent call last):
  File "/opt/python3.6/lib/python3.6/site-packages/execnet/gateway_base.py", line 997, in _send
    message.to_io(self._io)
  File "/opt/python3.6/lib/python3.6/site-packages/execnet/gateway_base.py", line 443, in to_io
    io.write(header + self.data)
  File "/opt/python3.6/lib/python3.6/site-packages/execnet/gateway_base.py", line 410, in write
    self.outfile.flush()
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python3.6/lib/python3.6/site-packages/remoto/backends/__init__.py", line 35, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/opt/python3.6/lib/python3.6/site-packages/remoto/backends/__init__.py", line 48, in _make_gateway
    gateway.reconfigure(py2str_as_py3str=False, py3str_as_py2str=False)
  File "/opt/python3.6/lib/python3.6/site-packages/execnet/gateway.py", line 72, in reconfigure
    self._send(Message.RECONFIGURE, data=data)
  File "/opt/python3.6/lib/python3.6/site-packages/execnet/gateway_base.py", line 1003, in _send
    raise IOError("cannot send (already closed?)")
OSError: cannot send (already closed?)
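The `bash: python3: command not found` line in the output above suggests that the remote side was asked to start an interpreter named after the local Python major version. The sketch below only illustrates that naming rule and how an explicit override would take precedence. The helper names `default_remote_interpreter` and `pick_interpreter` are hypothetical and are not part of remoto; whether remoto derives the name in exactly this way is an assumption based on the error message.

```python
import sys


def default_remote_interpreter(version_info=sys.version_info):
    """Hypothetical helper: name the remote interpreter after the local major version."""
    # Under a local Python 3.6 this yields "python3"; under 2.7 it yields "python2".
    return "python%d" % version_info[0]


def pick_interpreter(explicit=None, version_info=sys.version_info):
    """Prefer an explicitly requested interpreter, else fall back to the default name."""
    return explicit or default_remote_interpreter(version_info)


if __name__ == "__main__":
    # With no override, the name follows whichever Python runs this script.
    print(pick_interpreter())
    # An explicit name wins, e.g. plain "python" for hosts that only ship python2.
    print(pick_interpreter(explicit="python"))
```

If the guess is right, a host that only ships python2 cannot satisfy a request for `python3`, which would match the broken pipe seen above.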
# Works under python2, because the remote host also has python2
[store@R10-P01-DN-001 redis-agent]$ python
Python 2.7.5 (default, Apr 2 2020, 01:29:16)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import remoto
>>> from remoto.process import check
>>> conn = remoto.Connection('R10-P01-DN-002.gd.cn')
>>> check(conn