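Our company recently needed to set up a Greenplum (GP) database as a data warehouse for statistical analysis. This was my first time installing GP, and on my own virtual machine (where the default SSH port had not been changed) the yum installation went through without a single problem. The company, however, develops on an internal network: the servers have no Internet access, and SSH's default port 22 has been changed to another port. I assume this is for security reasons; either way, port 22 cannot be used for SSH.
A GP installation normally includes the following two steps:
1. Because GP is a distributed database, it is mostly installed and run as a cluster. The all_hosts_file file lists the host names of all hosts in the cluster. The gpssh-exkeys command exchanges SSH keys among the hosts so that they can log in to one another without a password. It relies on the Linux ssh service, which by default connects on port 22.
You might think of changing the SSH port setting, that is, the Port parameter in /etc/ssh/sshd_config, but that did not help. I then tried port forwarding and host aliases in turn, and neither worked.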
gpssh-exkeys -f all_hosts_file
After running this command, you may see output like the following:
[STEP 1 of 5] create local ID and authorize on local host
... /home/gpdata/.ssh/id_rsa file exists ... key generation skipped
[ERROR china-vm-0000000955] authentication check failed:
ssh: connect to host china-chen port 22: Connection refused
[ERROR] cannot establish ssh access into the local host
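Before patching anything, it may be worth confirming which port sshd actually answers on for every host in the cluster. The following is a minimal Python sketch under my own assumptions: the hosts file holds one host name per line, and read_hosts, port_is_open, and report are illustrative helper names, not part of Greenplum.

```python
import socket


def read_hosts(path):
    """Return the non-empty, non-comment host names from a hosts file."""
    hosts = []
    with open(path) as f:
        for line in f:
            name = line.strip()
            if name and not name.startswith("#"):
                hosts.append(name)
    return hosts


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def report(hosts, ports=(22, 33822)):
    """Map each host to the subset of `ports` that accept connections."""
    return {h: [p for p in ports if port_is_open(h, p)] for h in hosts}
```

Calling report(read_hosts("all_hosts_file")) returns a dictionary keyed by host name. A host that lists 33822 but not 22 matches the situation described here. An open port only shows that something is listening; it does not prove that password-less login works.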
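This happens because the SSH port on your servers has been changed from the default. At this point you need to modify the gpssh-exkeys script that ships with GP.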
def testAccess(hostname):
    '''
    Ensure the proper password-less access to the remote host.
    Using ssh here also allows discovery of remote host keys *not*
    reported by ssh-keyscan.
    '''
    errfile = os.path.join(tempDir, 'sshcheck.err')
    cmd = 'ssh -p 33822 -o "BatchMode=yes" -o "StrictHostKeyChecking=no" %s true 2>%s' % (hostname, errfile)
    if GV.opt['-v']: print '[INFO %s]: %s' % (hostname, cmd)
    rc = os.system(cmd)
    if rc != 0:
        print >> sys.stderr, '[ERROR %s] authentication check failed:' % hostname
        with open(errfile) as efile:
            for line in efile:
                print >> sys.stderr, ' ', line.rstrip()
        return False
    return True
######################
# step 0
#
# Ensure the local host can password-less ssh into each remote host
for remoteHost in GV.allHosts:
    cmd = ['ssh', '-p', '33822', 'gpadmin@' + remoteHost.host(), '-o', 'BatchMode=yes', '-o', 'StrictHostKeyChecking=yes', 'true']
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = p.communicate()
    if p.returncode:
        print >> sys.stderr, '[ERROR]: Failed to ssh to %s. %s' % (remoteHost.host(), stderr)
        print >> sys.stderr, '[ERROR]: Expected passwordless ssh to host %s' % remoteHost.host()
        sys.exit(1)
# (excerpt: body of a loop over hosts `h`; the loop header is not shown here)
    cmd = ('scp -P 33822 -q -o "BatchMode yes" -o "NumberOfPasswordPrompts 0" ' +
           '%s %s %s %s %s:.ssh/ 2>&1'
           % (remoteAuthKeysFile,
              remoteKnownHostsFile,
              remoteIdentity,
              remoteIdentityPub,
              canonicalize(h.host())))
    h.popen(cmd)

for h in GV.newHosts:
    cmd = ('scp -P 33822 -q -o "BatchMode yes" -o "NumberOfPasswordPrompts 0" ' +
           '%s %s %s %s %s:.ssh/ 2>&1'
           % (GV.authorized_keys_fname,
              GV.known_hosts_fname,
              GV.id_rsa_fname,
              GV.id_rsa_pub_fname,
              canonicalize(h.host())))
    h.popen(cmd)
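I added the port option in these four places (-p 33822 for ssh, -P 33822 for scp), then saved and exited. Running the command again succeeded.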
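If the same edit has to be repeated on several machines, it could be scripted rather than done by hand. Below is a rough Python sketch of such a text rewrite. It is my own illustration, not a Greenplum tool: add_port_flags is a hypothetical helper, and it only recognises the two shapes of command seen in this post (a quoted shell string beginning with ssh or scp, and an argument list beginning with 'ssh'). Back up the script first and inspect the result with diff before using it.

```python
import re


def add_port_flags(source, port):
    """Return `source` with an explicit port added to ssh/scp invocations.

    Handles two shapes of command:
      * shell strings such as 'ssh -o ...' or 'scp -q ...'
      * argument lists such as ['ssh', 'gpadmin@' + host, ...]
    Invocations that already carry -p / -P are left alone.
    """
    port = str(int(port))  # reject anything that is not a plain integer
    # Shell-string form: the command name directly follows an opening quote.
    source = re.sub(r"""(['"])ssh (?!-p )""", r"\g<1>ssh -p %s " % port, source)
    source = re.sub(r"""(['"])scp (?!-P )""", r"\g<1>scp -P %s " % port, source)
    # List form: ['ssh', ...] becomes ['ssh', '-p', '<port>', ...]
    source = re.sub(r"""\[\s*'ssh'\s*,(?!\s*'-p')""",
                    "['ssh', '-p', '%s'," % port, source)
    return source
```

The function takes the script text as a string and returns the rewritten text, so reading and writing the file is left to the caller. Note that ssh takes a lowercase -p for the port while scp takes an uppercase -P. Running the function twice on the same text should leave it unchanged after the first pass.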
2. Initialize the GP database with the command below. When you run it, you will hit a problem similar to the one in the previous step.
gpinitsystem -c initgp_config -h seg_hosts_file
Modify the script /usr/local/greenplum-db-6.6.0/lib/python/gppylib/commands/base.py. Adding -p 33822 after ssh is enough.
def execute(self, cmd):
    # prepend env. variables from ExcecutionContext.propagate_env_map
    # e.g. Given {'FOO': 1, 'BAR': 2}, we'll produce "FOO=1 BAR=2 ..."
    self.__class__.trail.add(self.targetHost)

    # also propagate env from command instance specific map
    keys = sorted(cmd.propagate_env_map.keys(), reverse=True)
    for k in keys:
        cmd.cmdStr = "%s=%s && %s" % (k, cmd.propagate_env_map[k], cmd.cmdStr)

    # Escape " for remote execution otherwise it interferes with ssh
    cmd.cmdStr = cmd.cmdStr.replace('"', '\\"')
    cmd.cmdStr = "ssh -p 33822 -o StrictHostKeyChecking=no -o ServerAliveInterval=60 " \
                 "{targethost} \"{gphome} {cmdstr}\"".format(targethost=self.targetHost,
                                                              gphome=". %s/greenplum_path.sh;" % self.gphome,
                                                              cmdstr=cmd.cmdStr)
When I ran the command above again, GP began its initialization.
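Hard-coding 33822 works, but the port then lives inside the script itself. As a sketch of a slightly more flexible variant, the standalone Python function below composes the same kind of ssh command line with the port passed in or read from an environment variable. Everything here is an assumption for illustration: build_remote_cmd is not a gppylib function, and GP_SSH_PORT is a variable name I made up, not a Greenplum setting.

```python
import os


def build_remote_cmd(target_host, gphome, cmd_str, env_map=None, port=None):
    """Compose an ssh command line in the style of execute() in base.py.

    `port` falls back to the hypothetical GP_SSH_PORT environment
    variable, and then to 22.
    """
    if port is None:
        port = int(os.environ.get("GP_SSH_PORT", "22"))
    # Prepend environment variables as KEY=value && ..., keys in reverse order.
    for key in sorted(env_map or {}, reverse=True):
        cmd_str = "%s=%s && %s" % (key, env_map[key], cmd_str)
    # Escape double quotes so they survive the outer quoting.
    cmd_str = cmd_str.replace('"', '\\"')
    return (
        "ssh -p {port} -o StrictHostKeyChecking=no -o ServerAliveInterval=60 "
        '{host} ". {gphome}/greenplum_path.sh; {cmd}"'
    ).format(port=port, host=target_host, gphome=gphome, cmd=cmd_str)
```

For example, build_remote_cmd("sdw1", "/usr/local/greenplum-db-6.6.0", "true", port=33822) yields a command beginning with ssh -p 33822, where sdw1 is just a placeholder host name. Carrying the same idea into base.py would still mean editing a vendor file, so the change would need to be reapplied after an upgrade.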
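Summary: I would call this a bug on Greenplum's side, since the gpssh-exkeys and gpinitsystem commands do not offer a -p option for specifying the port. Had there been one, this would have been much easier; as it was, it took me quite a while of fiddling to get it working.
References: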
https://blog.csdn.net/u011095039/article/details/108467487
https://www.cndba.cn/Marvinn/article/3105