os: centos 6.8
db: postgresql 9.6
ecox: 5.12.4
ecox 高可用方案由三个部分组成:分布式应用一致性服务软件仲裁集群、ecox 高可用集群管理软件、postgresql 数据库/集群:
仲裁集群(zookeeper):主要作用是观察者,判断PG集群是否有故障;采用集群是为了避免观察者本身的单点故障。
postgresql 数据库集群:PG本身的概念,不多做介绍。
ecox 集群:主要保护PG集群的高可用和节点角色切换,异步模式尽可能保证无数据丢失(同步复制不存在此情况)。
测试环境为节省资源,三个节点,每个节点上三种软件都安装.
ip 规划如下
192.168.56.100 nodevip
192.168.56.101 node1
192.168.56.102 node2
192.168.56.103 node3
安装 zookeeper
node1,node2,node3 均需安装 zookeeper 集群模式
以 node1 为例,记录下配置过程.node2,node3 参考 node1.
配置java,参考<<centos 7 默认 Java 为 OpenJDK ,如何替换为 Oracle JDK>>
# lsb_release -a
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: CentOS
Description: CentOS release 6.8 (Final)
Release: 6.8
Codename: Final
# java -version
java version "1.7.0_99"
OpenJDK Runtime Environment (rhel-2.6.5.1.el6-x86_64 u99-b00)
OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
# cd /opt
# rz
# rpm -Uvh ./jdk-8u211-linux-x64.rpm
# /usr/sbin/update-alternatives --install /usr/bin/java java /usr/java/jdk1.8.0_211-amd64/bin/java 180211;
# /usr/sbin/update-alternatives --config java;
# java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
配置 zookeeper
# cd /usr
# wget http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
# tar -zxvf zookeeper-3.4.14.tar.gz
# cd zookeeper-3.4.14
# cp ./conf/zoo_sample.cfg ./conf/zoo.cfg
# egrep ^[a-z] ./conf/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
server.1=192.168.56.101:2888:3888
server.2=192.168.56.102:2888:3888
server.3=192.168.56.103:2888:3888
# mkdir -p /tmp/zookeeper;
echo "1" >> /tmp/zookeeper/myid;
# ./bin/zkServer.sh start ./conf/zoo.cfg
# ./bin/zkServer.sh status ./conf/zoo.cfg
ZooKeeper JMX enabled by default
Using config: ./conf/zoo.cfg
Mode: leader
# ./bin/zkCli.sh
安装 postgresql
node1,node2,node3 均只安装数据库软件,不使用 initdb 初始化数据库集群.
参考 https://www.postgresql.org/download/linux/redhat/
# yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-6-x86_64/pgdg-redhat-repo-latest.noarch.rpm
# yum -y install postgresql96*
Installed:
postgresql96.x86_64 0:9.6.14-1PGDG.rhel6 postgresql96-contrib.x86_64 0:9.6.14-1PGDG.rhel6
postgresql96-debuginfo.x86_64 0:9.6.14-1PGDG.rhel6 postgresql96-devel.x86_64 0:9.6.14-1PGDG.rhel6
postgresql96-docs.x86_64 0:9.6.14-1PGDG.rhel6 postgresql96-libs.x86_64 0:9.6.14-1PGDG.rhel6
postgresql96-odbc.x86_64 0:11.00.0000-1PGDG.rhel6 postgresql96-plperl.x86_64 0:9.6.14-1PGDG.rhel6
postgresql96-plpython.x86_64 0:9.6.14-1PGDG.rhel6 postgresql96-pltcl.x86_64 0:9.6.14-1PGDG.rhel6
postgresql96-server.x86_64 0:9.6.14-1PGDG.rhel6 postgresql96-tcl.x86_64 0:2.4.0-1.rhel6
postgresql96-tcl-debuginfo.x86_64 0:2.3.1-1.rhel6 postgresql96-test.x86_64 0:9.6.14-1PGDG.rhel6
Complete!
安装 ecox
node1,node2,node3 均需安装 ecox 软件
# cd /opt
# rz
# tar -zxvf ./EcoX-5.12.4-1.el6.x86_64-CentOS.tar.gz
# rpm -Uvh ./EcoX-5.12.4-1.el6.x86_64.rpm
# rpm -ql EcoX-4.10.1-1.el6.x86_64
/usr/EcoX/bin/EcoX
/usr/EcoX/bin/ecox
/usr/EcoX/bin/pg_rewind94
/usr/EcoX/etc/ecox.conf
/usr/EcoX/etc/log4crc
/usr/EcoX/lib/liblog4c.so
/usr/EcoX/lib/liblog4c.so.3
/usr/EcoX/lib/liblog4c.so.3.3.1
/usr/EcoX/lib/libzookeeper_mt.so
/usr/EcoX/lib/libzookeeper_mt.so.2
/usr/EcoX/lib/libzookeeper_mt.so.2.0.0
/usr/EcoX/share/EcoX.pid
/usr/EcoX/share/node.rd
# vi /etc/profile
export PATH=/usr/EcoX/bin:$PATH
export LD_LIBRARY_PATH=/usr/EcoX/lib:$LD_LIBRARY_PATH
# export PATH=/usr/EcoX/bin:$PATH;
export LD_LIBRARY_PATH=/usr/EcoX/lib:$LD_LIBRARY_PATH;
# which ecox
/usr/EcoX/bin/ecox
# sudo echo "postgres ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
修改ecox高可用软件配置文件
# cp /usr/EcoX/etc/ecox.conf /usr/EcoX/etc/ecox.conf.bak
# vi /usr/EcoX/etc/ecox.conf
#-------------------------------------------------------------------------------
root=/EcoX
cluster=pg9.6
zkhost=192.168.56.101:2181,192.168.56.102:2181,192.168.56.103:2181
zktimeout=20000
pgcomaster_num=2
auto_promote_slave=true
#-------------------------------------------------------------------------------
role=auto
async_replication=true
#-------------------------------------------------------------------------------
pghost=192.168.56.101
pgbin=/usr/pgsql-9.6/bin
pgport=5432
pgdata=/var/lib/pgsql/9.6/data
pgclient_net=192.168.56.0/24
pgcluster_net=192.168.56.0/24
#-------------------------------------------------------------------------------
do_not_use_vip=false
is_vip_use_real_interface=false
eth_port=eth1
virtual_ip=192.168.56.100
netmask=255.255.255.0
do_not_use_comaster_vip=true
comaster_eth_port=eth1
comaster_virtual_ip=192.168.56.99
comaster_netmask=255.255.255.0
启动 ecox
# su - postgres
$ which ecox
/usr/EcoX/bin/ecox
$ ecox --help
EcoX is PostgreSQL cluster management software
Usage:
EcoX [OPTION]...
Options:
--help/-h show help message
--version show current EcoX version
--start start current EcoX postgreSQL node
--stop stop current node
--show node show current node's status
--show cluster show current node's status
--clean/-c clear all the zookeeper data of the cluster which this node in
--free free the current lock on cluster
--rewind execute pg_rewind, resync old master from the current master
--restart database restart postgreSQL without change node role
--status check the master node is ready, if so the cluster can be use
--news get current new feature
Copyright reserved, Chengdu WenWu Information Technology Co.,Ltd.
support: db_support@ww-it.cn | web: w3.ww-it.cn
初始化数据库集群 master 节点,第一次运行会调用 initdb
node1 节点运行
$ ecox start
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:d4:35:fc brd ff:ff:ff:ff:ff:ff
inet 10.0.2.19/24 brd 10.0.2.255 scope global eth0
inet6 fe80::a00:27ff:fed4:35fc/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:46:e3:1b brd ff:ff:ff:ff:ff:ff
inet 192.168.56.101/24 brd 192.168.56.255 scope global eth1
inet 192.168.56.100/24 brd 192.168.56.255 scope global secondary eth1:101
inet6 fe80::a00:27ff:fe46:e31b/64 scope link
valid_lft forever preferred_lft forever
$ psql
psql (9.6.14)
Type "help" for help.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(3 rows)
postgres=# \du
List of roles
Role name | Attributes | Member of
-----------------+------------------------------------------------------------+-----------
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
user_for_rewind | Superuser, Create role, Create DB +| {}
| 8 connections |
node2 节点运行
# su - postgres
$ ecox start
mkdir -p /var/lib/pgsql/EcoX/log
Initial log system success, priority is 600
PostgreSQL 9.3 is compatible
Begin wait for master online ...
Begin wait for node0000000000 online ...
NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
Waiting for database starting ....< 2019-07-10 17:11:41.909 CST > LOG: 00000: redirecting log output to logging collector process
< 2019-07-10 17:11:41.909 CST > HINT: Future log output will appear in directory "pg_log".
< 2019-07-10 17:11:41.909 CST > LOCATION: SysLogger_Start, syslogger.c:654
..done
database start success
node3 节点运行
# su - postgres
$ ecox start
mkdir -p /var/lib/pgsql/EcoX/log
Initial log system success, priority is 600
PostgreSQL 9.3 is compatible
Begin wait for master online ...
Begin wait for node0000000000 online ...
NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
Waiting for database starting ....< 2019-07-10 17:12:19.708 CST > LOG: 00000: redirecting log output to logging collector process
< 2019-07-10 17:12:19.708 CST > HINT: Future log output will appear in directory "pg_log".
< 2019-07-10 17:12:19.708 CST > LOCATION: SysLogger_Start, syslogger.c:654
..done
database start success
node1 上查询 pg_stat_replication
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state
-------+----------+----------+-------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
10081 | 10 | postgres | ecox_standby_node | 192.168.56.102 | node2 | 48656 | 2019-07-10 17:11:42.026399+08 | | streaming | 0/5000060 | 0/5000060 | 0/5000060 | 0/5000060 | 0 | async
10290 | 10 | postgres | ecox_standby_node | 192.168.56.103 | node3 | 56676 | 2019-07-10 17:12:20.795262+08 | | streaming | 0/5000060 | 0/5000060 | 0/5000060 | 0/5000060 | 0 | async
(2 rows)
符合预期
参考:
http://w3.ww-it.cn/Webpage/product.html
http://w3.ww-it.cn/Webpage/scheme
http://postgres.cn/v2/news/viewone/1/470
http://postgres.cn/v2/news/viewone/1/476