[openGauss] openGauss Basic Operations: Expanding a Single Node and Installing the CM Component

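I. Prerequisites

First, set up two single-node environments for the expansion. This walkthrough expands a single node into a one-primary, one-standby cluster and installs the CM component so that the cluster can be reached through a virtual IP. [1]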

10.125.9.45 gauss1
10.125.10.75 gauss2

Database status on gauss1:

[root@gauss1 ~]# su - omm -c "gs_om -t status --detail"
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

    node  node_ip         port      instance                         state
------------------------------------------------------------------------------------------
1  gauss1 10.125.9.45     15400      6001 /usr/bin/gaussdb/data/dn   P Primary Normal

Database status on gauss2 [2]:

[root@gauss2 ~]# su - omm -c "gs_om -t status --detail"
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

    node  node_ip         port      instance                         state
------------------------------------------------------------------------------------------
1  gauss2 10.125.10.75    15400      6001 /usr/bin/gaussdb/data/dn   P Primary Normal
[root@gauss2 ~]# su - omm -c "gaussdb -V"
gaussdb (openGauss 5.0.0 build 331ae190) compiled at 2024-08-23 14:39:45 commit 0 last mr
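
Because both machines must run the same database version before the expansion, it can help to compare the version strings in a script rather than by eye. The following is a minimal sketch, not part of the original procedure: `gaussdb_version` and `same_version` are hypothetical helper names, and the parsing assumes the `gaussdb -V` output has the shape shown above.

```shell
# Hypothetical helper: extract "<version> build <id>" from `gaussdb -V` output on stdin.
# Assumes the format shown in this article: "gaussdb (openGauss X.Y.Z build ...) ..."
gaussdb_version() {
    sed -n 's/^gaussdb (openGauss \([^)]*\)).*/\1/p'
}

# Hypothetical helper: succeed only if both version strings are non-empty and equal.
same_version() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}
```

One way to use it is to run `su - omm -c "gaussdb -V" | gaussdb_version` on each host and pass the two results to `same_version`.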

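II. Expansion Procedure

1. Set up passwordless SSH between the two machines

Use the gs_sshexkey tool shipped with openGauss to establish the trust. [3] On gauss1, create /home/omm/ip.txt containing the IPs of both virtual machines for gs_sshexkey to read.

  • Set up passwordless SSH for the root user. Cloud@1234 is the VM password; it can be entered interactively or, as here, passed non-interactively: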
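
The host list file is just one IP per line. As a small sketch of how it could be generated, `write_host_list` below is a hypothetical helper name, not something openGauss provides.

```shell
# Hypothetical helper: write each given IP on its own line to the target file.
# $1 is the output file, the remaining arguments are the IPs.
write_host_list() {
    local out_file="$1"
    shift
    printf '%s\n' "$@" > "$out_file"
}
```

For this environment the call would be `write_host_list /home/omm/ip.txt 10.125.9.45 10.125.10.75`.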
[root@gauss1 ~]# /opt/openGauss/script/gs_sshexkey -f /home/omm/ip.txt  <<<Cloud@1234
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Distributing trust keys file to all node successfully.
Successfully distributed SSH trust file to all node.
Verifying SSH trust on all hosts.
Successfully verified SSH trust on all hosts.
Successfully created SSH trust.
  • Set up passwordless SSH for the omm user in the same way:
[root@gauss1 .ssh]# su - omm -c "/opt/openGauss/script/gs_sshexkey -f /home/omm/ip.txt  <<<Cloud@1234"
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Distributing trust keys file to all node successfully.
Successfully distributed SSH trust file to all node.
Verifying SSH trust on all hosts.
Successfully verified SSH trust on all hosts.
Successfully created SSH trust.
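
gs_sshexkey already verifies the trust itself, but the same check is useful later, for example after a reboot. A minimal sketch, assuming a standard OpenSSH client: `check_ssh_trust` and the `SSH_BIN` override are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: report whether each host accepts key-based login.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# SSH_BIN is an assumed override hook so the client binary can be swapped out.
check_ssh_trust() {
    local ssh_bin="${SSH_BIN:-ssh}"
    local host
    local failed=0
    for host in "$@"; do
        if "$ssh_bin" -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Run it once as root and once as omm (for example `check_ssh_trust 10.125.9.45 10.125.10.75`), since the trust is set up separately for each user.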

2. Edit the cluster configuration XML file [4]

[root@gauss1 ~]# cat /home/omm/cluster_config.xml
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
    <!-- Overall openGauss information -->
    <CLUSTER>
        <PARAM name="clusterName" value="opengauss_cluster" />
        <PARAM name="nodeNames" value="gauss1,gauss2" />
        <PARAM name="gaussdbAppPath" value="/usr/bin/gaussdb/app" />
        <PARAM name="gaussdbLogPath" value="/var/log/vdi/gaussdb" />
        <PARAM name="tmpMppdbPath" value="/usr/bin/gaussdb/tmp"/>
        <PARAM name="gaussdbToolPath" value="/usr/bin/gaussdb/om" />
        <PARAM name="corePath" value="/usr/bin/gaussdb/corefile"/>
        <PARAM name="backIp1s" value="10.125.9.45,10.125.10.75"/>
        <PARAM name="clusterType" value="single-inst"/>
    </CLUSTER>
    <DEVICELIST>
        <!-- Deployment information for node 1 -->
        <DEVICE sn="gauss1">
            <PARAM name="name" value="gauss1"/>
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <!-- IP of node 1. If the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
            <PARAM name="backIp1" value="10.125.9.45"/>
            <PARAM name="sshIp1" value="10.125.9.45"/>
            <!--dn-->
            <PARAM name="dataNum" value="1"/>
            <PARAM name="dataPortBase" value="15400"/>
            <PARAM name="dataNode1" value="/usr/bin/gaussdb/data/dn,gauss2,/usr/bin/gaussdb/data/dn"/>
            <PARAM name="dataNode1_syncNum" value="1"/>

        </DEVICE>

        <!-- Deployment information for node 2; the value of "name" is the host name -->
        <DEVICE sn="gauss2">
            <!-- Host name of node 2 -->
            <PARAM name="name" value="gauss2"/>
            <!-- AZ of node 2 and its AZ priority -->
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <!-- IP of node 2. If the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
            <PARAM name="backIp1" value="10.125.10.75"/>
            <PARAM name="sshIp1" value="10.125.10.75"/>
        </DEVICE>
    </DEVICELIST>
</ROOT>
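
A common slip when hand-editing this file is listing a different number of entries in nodeNames and backIp1s. The sketch below is one way to catch that before running the expansion. The helper names are hypothetical, and the sed pattern assumes each PARAM sits on a single line with `name` before `value`, as in the file above; it is not a real XML parser.

```shell
# Hypothetical helper: print the value attribute of the first PARAM with the given name.
# $1 = XML file, $2 = PARAM name.
xml_param_value() {
    sed -n "s/.*<PARAM name=\"$2\" value=\"\([^\"]*\)\".*/\1/p" "$1" | head -n 1
}

# Hypothetical helper: count comma-separated entries in a string.
csv_count() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Succeeds only if nodeNames and backIp1s are both present and have the same number of entries.
check_node_ip_counts() {
    local names ips
    names=$(xml_param_value "$1" nodeNames)
    ips=$(xml_param_value "$1" backIp1s)
    [ -n "$names" ] && [ -n "$ips" ] &&
        [ "$(csv_count "$names")" -eq "$(csv_count "$ips")" ]
}
```

For example, `check_node_ip_counts /home/omm/cluster_config.xml && echo "counts match"`.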

3. Run the expansion command

[root@gauss1 ~]# /opt/openGauss/script/gs_expansion -U omm -G dbgrp -X /home/omm/cluster_config.xml -h 10.125.10.75 -L --time-out=300
Start expansion without cluster manager component.
Database on standby nodes installed finished.

Checking gaussdb and gs_om version.
End to check gaussdb and gs_om version.

Start to establish the relationship.
Start to build standby 10.125.10.75.
Build standby 10.125.10.75 success.
Start to generate and send cluster static file.
End to generate and send cluster static file.

Expansion results:
10.125.10.75:   Success
Expansion Finish.

Expansion is complete. Check the cluster status:

[root@gauss1 ~]# su - omm -c "gs_om -t status --detail"
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

    node  node_ip         port      instance                         state
------------------------------------------------------------------------------------------
1  gauss1 10.125.9.45     15400      6001 /usr/bin/gaussdb/data/dn   P Primary Normal
2  gauss2 10.125.10.75    15400      6002 /usr/bin/gaussdb/data/dn   S Standby Normal
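
When the expansion is scripted, the status text can be reduced to the two facts that matter here: the overall state and the number of healthy standbys. This is a minimal sketch that assumes the `gs_om -t status --detail` layout shown above; `cluster_state` and `normal_standby_count` are hypothetical helper names.

```shell
# Hypothetical helper: print the cluster_state value from status output on stdin.
cluster_state() {
    awk -F':' '/^cluster_state/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Hypothetical helper: count datanode rows reported as "S Standby Normal".
normal_standby_count() {
    awk '/ S Standby Normal/ { n++ } END { print n + 0 }'
}
```

For example, `su - omm -c "gs_om -t status --detail" | cluster_state` should print `Normal` for the output above.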

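At this point the cluster supports primary/standby switchover with gs_ctl, but the command must be run on the corresponding machine.
It does not support virtual IPs, automatic failover, or similar features: as shown below, once the primary node is stopped, the cluster becomes unavailable.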

[root@gauss1 ~]# su - omm -c "gs_ctl stop -D /usr/bin/gaussdb/data/dn"
[2024-09-02 19:21:03.773][79008][][gs_ctl]: gs_ctl stopped ,datadir is /usr/bin/gaussdb/data/dn
waiting for server to shut down..... done
server stopped
[root@gauss1 ~]# su - omm -c "gs_om -t status --detail"
[   Cluster State   ]

cluster_state   : Unavailable
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

    node  node_ip         port      instance                         state
------------------------------------------------------------------------------------------
1  gauss1 10.125.9.45     15400      6001 /usr/bin/gaussdb/data/dn   P Down    Manually stopped
2  gauss2 10.125.10.75    15400      6002 /usr/bin/gaussdb/data/dn   S Standby Need repair(Disconnected)

These features require the CM component, so the next step is to install it.

4. Edit the cluster configuration file that includes the CM settings

[root@gauss1 ~]# cat /home/omm/cm_config.xml
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
    <!-- Overall openGauss information -->
    <CLUSTER>
        <PARAM name="clusterName" value="opengauss_cluster" />
        <PARAM name="nodeNames" value="gauss1,gauss2" />
        <PARAM name="gaussdbAppPath" value="/usr/bin/gaussdb/app" />
        <PARAM name="gaussdbLogPath" value="/var/log/vdi/gaussdb" />
        <PARAM name="tmpMppdbPath" value="/usr/bin/gaussdb/tmp"/>
        <PARAM name="gaussdbToolPath" value="/usr/bin/gaussdb/om" />
        <PARAM name="corePath" value="/usr/bin/gaussdb/corefile"/>
        <PARAM name="backIp1s" value="10.125.9.45,10.125.10.75"/>
        <PARAM name="clusterType" value="single-inst"/>
    </CLUSTER>
    <DEVICELIST>
        <!-- Deployment information for node 1 -->
        <DEVICE sn="gauss1">
            <PARAM name="name" value="gauss1"/>
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <!-- IP of node 1. If the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
            <PARAM name="backIp1" value="10.125.9.45"/>
            <PARAM name="sshIp1" value="10.125.9.45"/>
            <!-- CM node deployment information -->
            <PARAM name="cmsNum" value="1"/>
            <PARAM name="cmServerPortBase" value="15000"/>
            <PARAM name="cmServerListenIp1" value="10.125.9.45,10.125.10.75"/>
            <PARAM name="cmServerHaIp1" value="10.125.9.45,10.125.10.75"/>
            <!-- cmServerlevel currently supports only 1 -->
            <PARAM name="cmServerlevel" value="1"/>
            <!-- Host names of the CMS primary and all standbys -->
            <PARAM name="cmServerRelation" value="gauss1,gauss2"/>
            <PARAM name="cmDir" value="/usr/bin/gaussdb/cmserver"/>
            <!--dn-->
            <PARAM name="dataNum" value="1"/>
            <PARAM name="dataPortBase" value="15400"/>
            <PARAM name="dataNode1" value="/usr/bin/gaussdb/data/dn,gauss2,/usr/bin/gaussdb/data/dn"/>
            <PARAM name="dataNode1_syncNum" value="1"/>

        </DEVICE>

        <!-- Deployment information for node 2; the value of "name" is the host name -->
        <DEVICE sn="gauss2">
            <!-- Host name of node 2 -->
            <PARAM name="name" value="gauss2"/>
            <!-- AZ of node 2 and its AZ priority -->
            <PARAM name="azName" value="AZ1"/>
            <PARAM name="azPriority" value="1"/>
            <!-- IP of node 2. If the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
            <PARAM name="backIp1" value="10.125.10.75"/>
            <PARAM name="sshIp1" value="10.125.10.75"/>
            <!-- cm -->
            <PARAM name="cmServerPortStandby" value="15000"/>
            <PARAM name="cmDir" value="/usr/bin/gaussdb/cmserver"/>
        </DEVICE>
    </DEVICELIST>
</ROOT>
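
This file should differ from the expansion configuration only in its CM settings. Listing the CM-related PARAM names is a quick way to confirm which ones are present. A minimal sketch: `list_cm_params` is a hypothetical helper, and it assumes every CM parameter name starts with `cm`, as in the file above.

```shell
# Hypothetical helper: print the distinct PARAM names that start with "cm".
# $1 = XML file. LC_ALL=C keeps the sort order independent of the locale.
list_cm_params() {
    grep -o 'PARAM name="cm[^"]*"' "$1" |
        sed 's/.*name="\([^"]*\)"/\1/' |
        LC_ALL=C sort -u
}
```

Running `list_cm_params /home/omm/cluster_config.xml` should print nothing, while the same call on /home/omm/cm_config.xml should list names such as cmDir and cmsNum.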

5. Install the CM component

Once the installation finishes, the cluster status includes the CM component:

[root@gauss1 ~]# su - omm -c "/usr/bin/gaussdb/app/tool/cm_tool/cm_install -X /home/omm/cm_config.xml --cmpkg=/opt/openGauss/openGauss-5.0.0-openEuler-64bit-cm.tar.gz"
Start to install cm tool.
Preparing CM path.
Decompressing CM pacakage.
Creating cluster_manual_start file.
Initializing cm_server.
Initializing cm_agent.
Creating CM ca files.
Refreshing static and dynamic file using xml file with cm.
Setting om_monitor crontab.
Starting cluster.
[  CMServer State   ]

node      instance state
--------------------------
1  gauss1 1        Primary
2  gauss2 2        Standby

[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : Yes
current_az      : AZ_ALL

[  Datanode State   ]

node      instance state            | node      instance state
----------------------------------------------------------------------------
1  gauss1 6001     P Primary Normal | 2  gauss2 6002     S Standby Normal
Install CM tool success.
[root@gauss1 ~]# su - omm -c "gs_om -t status --detail"
[  CMServer State   ]

node      node_ip         instance                                 state
--------------------------------------------------------------------------
1  gauss1 10.125.9.45     1    /usr/bin/gaussdb/cmserver/cm_server Primary
2  gauss2 10.125.10.75    2    /usr/bin/gaussdb/cmserver/cm_server Standby

[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : Yes
current_az      : AZ_ALL

[  Datanode State   ]

node      node_ip         instance                      state
-------------------------------------------------------------------------
1  gauss1 10.125.9.45     6001 /usr/bin/gaussdb/data/dn P Primary Normal
2  gauss2 10.125.10.75    6002 /usr/bin/gaussdb/data/dn S Standby Normal

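6. Configure automatic failover

A two-node deployment is the minimum that CM supports, and it is a new feature in 5.0.0. CM's cluster behavior relies on network-based arbitration, so an arbitration IP must be configured so that each of the two nodes can judge its own state. Two other required parameters must also be set.

# Whether to allow automatic failover of the CM cluster itself
su - omm -c "cm_ctl set --param --server -k \"cms_enable_failover_on2nodes=true\""
# Whether to allow automatic recovery of the database cluster after a split-brain failure
su - omm -c "cm_ctl set --param --server -k \"cms_enable_db_crash_recovery=true\""
# IP of a third-party gateway in the current AZ, or any other reachable IP that is independent of the cluster; it must be reachable from every node in the cluster.
su - omm -c "cm_ctl set --param --server -k \"third_party_gateway_ip=10.125.10.95\""
Use reload to make the parameters take effect: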
[root@gauss1 ~]# su - omm -c "cm_ctl reload"
.
cm_ctl: cm_ctl reload success.
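
The three `cm_ctl set` calls follow the same pattern, so they can be driven from a list. The sketch below is one way to do that: `apply_cm_server_params` and the `DRY_RUN` switch are hypothetical names introduced here, and the function is meant to be run as the omm user. By default it only prints the commands.

```shell
# Hypothetical helper: apply key=value pairs as CM server parameters.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to call cm_ctl for real.
apply_cm_server_params() {
    local kv
    for kv in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "cm_ctl set --param --server -k \"$kv\""
        else
            cm_ctl set --param --server -k "$kv" || return 1
        fi
    done
}
```

With the values used in this article, the call would be `apply_cm_server_params cms_enable_failover_on2nodes=true cms_enable_db_crash_recovery=true third_party_gateway_ip=10.125.10.95`, followed by `cm_ctl reload`.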

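7. Configure the virtual IP

Assign an IP in the current subnet that is not in use as the virtual IP. The commands below generate a JSON resource file in the CM Agent directory on the current node; finish by running res --check to verify that the generated virtual IP resource file is well-formed.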

[root@gauss1 ~]# su - omm -c "cm_ctl res --add --res_name=\"VIP_AZ1\" --res_attr=\"resources_type=VIP,float_ip=10.125.9.46\""
cm_ctl: add res(VIP_AZ1) success.
[root@gauss1 ~]# su - omm -c "cm_ctl res --edit --res_name=\"VIP_AZ1\" --add_inst=\"node_id=1,res_instance_id=6001\" --inst_attr=\"base_ip=10.125.9.45\""
cm_ctl: edit res(VIP_AZ1) success.
[root@gauss1 ~]# su - omm -c "cm_ctl res --edit --res_name=\"VIP_AZ1\" --add_inst=\"node_id=2,res_instance_id=6002\" --inst_attr=\"base_ip=10.125.10.75\""
cm_ctl: edit res(VIP_AZ1) success.
[root@gauss1 ~]# su - omm -c "cm_ctl res --check"
cm_ctl: resource config is valid.

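These commands only write the virtual IP information to a file on the current node. Once the check reports no errors, the file must be distributed manually to the other nodes, and the cluster must be restarted for the change to take effect.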

scp /usr/bin/gaussdb/cmserver/cm_agent/cm_resource.json root@10.125.10.75:/usr/bin/gaussdb/cmserver/cm_agent/cm_resource.json
ssh root@10.125.10.75 "chown omm:dbgrp /usr/bin/gaussdb/cmserver/cm_agent/cm_resource.json"

After the restart, check the database status and the virtual IP:

[root@gauss1 ~]# su - omm -c "cm_ctl stop&&cm_ctl start -t 60"
cm_ctl: stop cluster.
cm_ctl: stop nodeid: 1
cm_ctl: stop nodeid: 2
...........
cm_ctl: stop cluster successfully.
cm_ctl: checking cluster status.
cm_ctl: checking cluster status.
cm_ctl: checking finished in 830 ms.
cm_ctl: start cluster.
cm_ctl: start nodeid: 1
cm_ctl: start nodeid: 2
...........
cm_ctl: start cluster successfully.
[root@gauss1 ~]# su - omm -c "gs_om -t status --detail"
[  CMServer State   ]

node      node_ip         instance                                 state
--------------------------------------------------------------------------
1  gauss1 10.125.9.45     1    /usr/bin/gaussdb/cmserver/cm_server Primary
2  gauss2 10.125.10.75    2    /usr/bin/gaussdb/cmserver/cm_server Standby

[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
balanced        : Yes
current_az      : AZ_ALL

[  Datanode State   ]

node      node_ip         instance                      state
-------------------------------------------------------------------------
1  gauss1 10.125.9.45     6001 /usr/bin/gaussdb/data/dn P Primary Normal
2  gauss2 10.125.10.75    6002 /usr/bin/gaussdb/data/dn S Standby Normal
[root@gauss1 ~]# su - omm -c "cm_ctl show"

[  Network Connect State  ]

Network timeout:       6s
Current CMServer time: 2024-09-02 22:10:51
Network stat('Y' means connected, otherwise 'N'):
|  \  |  Y  |
|  Y  |  \  |


[  Node Disk HB State  ]

Node disk hb timeout:    200s
Current CMServer time: 2024-09-02 22:10:52
Node disk hb stat('Y' means connected, otherwise 'N'):
|  N  |  N  |

[  FloatIp Network State  ]

node      instance base_ip     float_ip_name float_ip
---------------------------------------------------------
1  gauss1 6001     10.125.9.45 VIP_AZ1       10.125.9.46
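
Besides `cm_ctl show`, the binding can be confirmed at the operating-system level by looking for the virtual IP in the interface list. A minimal sketch, assuming the one-line output format of `ip -o -4 addr show` from iproute2; `holds_ip` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given IPv4 address appears in
# `ip -o -4 addr show` output supplied on stdin. $1 = IP to look for.
holds_ip() {
    awk -v ip="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), parts, "/")   # strip the prefix length
                    if (parts[1] == ip) found = 1
                }
            }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Running `ip -o -4 addr show | holds_ip 10.125.9.46` on each node should succeed only on the node that currently holds the virtual IP, which is gauss1 in the output above.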

Finally, add the virtual IP to pg_hba.conf with sha256 authentication so that clients can connect through it:

[root@gauss1 ~]# su - omm -c "gs_guc reload -N all -I all -h \"host all all 10.125.9.46/32  sha256\""
The gs_guc run with the following arguments: [gs_guc -N all -I all -h host all all 10.125.9.46/32  sha256 reload ].
Begin to perform the total nodes: 2.
Popen count is 2, Popen success count is 2, Popen failure count is 0.
Begin to perform gs_guc for datanodes.
Command count is 2, Command success count is 2, Command failure count is 0.

Total instances: 2. Failed instances: 0.
ALL: Success to perform gs_guc!
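
If more than one address has to be allowed, building the rule text in one place avoids typos in the `-h` argument. This is a small sketch under that assumption; `hba_entry` is a hypothetical helper, and its input check only rejects characters that cannot appear in an IPv4 address.

```shell
# Hypothetical helper: print a pg_hba.conf host rule for a single IPv4 address.
# $1 = IP, $2 = authentication method (defaults to sha256, as used above).
hba_entry() {
    local ip="$1"
    local method="${2:-sha256}"
    case "$ip" in
        "" | *[!0-9.]*)
            echo "invalid IPv4 address: $ip" >&2
            return 1
            ;;
    esac
    printf 'host all all %s/32 %s\n' "$ip" "$method"
}
```

The result can then be passed on, for example `gs_guc reload -N all -I all -h "$(hba_entry 10.125.9.46)"` as the omm user.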

8. Create a cron job that keeps passwordless SSH working for the omm user [5]

crontab -u omm -l > /usr/bin/gaussdb/tmp/gauss_cron_omm
sed -i "/CheckSshAgent.py/d" /usr/bin/gaussdb/tmp/gauss_cron_omm
echo "*/1 * * * * source ~/.bashrc;python3 /usr/bin/gaussdb/om/script/local/CheckSshAgent.py >>/dev/null 2>&1 &" >> /usr/bin/gaussdb/tmp/gauss_cron_omm
crontab -u omm /usr/bin/gaussdb/tmp/gauss_cron_omm
rm -f /usr/bin/gaussdb/tmp/gauss_cron_omm
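
The commands above remove any old CheckSshAgent line and then append a fresh one, which keeps the update idempotent. The same idea can be written as a filter without a temporary file. This is a sketch only: `ensure_cron_entry` is a hypothetical helper, and the usage line assumes a crontab implementation that accepts `-` to read the new table from stdin.

```shell
# Hypothetical helper: read a crontab on stdin, drop lines containing the
# marker ($1), and append the desired cron line ($2) exactly once.
ensure_cron_entry() {
    # grep exits non-zero when nothing is left to print; that is not an error here.
    grep -v -F -- "$1" || true
    printf '%s\n' "$2"
}
```

With the cron line from above stored in a variable, the update would be `crontab -u omm -l | ensure_cron_entry CheckSshAgent.py "$line" | crontab -u omm -`, run on both machines.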

After running the commands, check the omm user's crontab. Once it contains the CheckSshAgent entry, the problem is resolved:

[root@gauss2 ~]# crontab -u omm -l
*/1 * * * * source /etc/profile;(if [ -f ~/.profile ];then source ~/.profile;fi);source ~/.bashrc;source /home/omm/.bashrc; nohup om_monitor -L /var/log/vdi/gaussdb/omm/cm/om_monitor >>/dev/null 2>&1 &
*/1 * * * * source ~/.bashrc;python3 /usr/bin/gaussdb/om/script/local/CheckSshAgent.py >>/dev/null 2>&1 &

This completes the expansion.


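  1. All operations in this walkthrough are performed as the root user.

  2. The two single-node environments must use the same installation directories and the same database version. The version used here is openGauss 5.0.0 Enterprise Edition, installed with OM on both machines.

  3. This tool requires the expect command on every cluster node. If it is missing, install expect yourself; expect version 5.45 is recommended. In this environment it was already installed during the single-node installation, so no action is needed.

  4. gs_expansion only supports adding new nodes and cannot install CM directly, so this configuration file must not contain any CM settings.

  5. gs_sshexkey in 5.0.0 has a bug: after the standby machine restarts, the passwordless SSH trust between the omm users is lost. The corresponding cron job must be created manually on both the primary and the standby so that the trust persists.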
