基于流复制的方式,两节点自动切换:
1、单pgpool
a.环境:
pgpool:192.168.238.129
data1:192.168.238.130
data2:192.168.238.131
b.图例
c.配置互信
ssh-copy-id ha@node1
ssh-copy-id ha@node2
d.数据库节点配置,请参照《
使用pg_basebackup搭建PostgreSQL流复制环境
》。
e.pgpool配置:
listen_addresses = '*'
backend_hostname0 = 'node1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/home/ha/pgdb/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'node2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/home/ha/pgdb/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
pid_file_name = '/home/ha/pgpool/pgpool.pid'
logdir = '/home/ha/pgpool/log'
health_check_period = 1
health_check_user = 'ha'
health_check_password = 'ha'
failover_command = '/home/ha/pgdb/fail.sh %H'
recovery_user = 'ha'
recovery_password = 'ha'
f.fail.sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, do nothing. If primary goes down, create a
# trigger file so that standby takes over primary node.
#
# Arguments: $1: failed node id. $2: new master hostname. $3: path to
# trigger file.
new_master=$1
trigger_command="/home/ha/pgdb/bin/pg_ctl -D /home/ha/pgdb/data promote -m fast"
# Do nothing if standby goes down.
if [ $failed_node = 1 ]; then
exit 0;
fi
# Create the trigger file.
/usr/bin/ssh -T $new_master $trigger_command
exit 0;
g.建立pool_passwd
pg_md5 -m -p -u postgres pool_passwd
PS:在9.1之前一直用的是trigger_file,这里建议用promote -m fast的方式,因为
pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we can achieve very fast failover when the apply delay is low. Write new WAL record XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log readers. If we skip synchronous end of recovery checkpoint we request a normal spread checkpoint so that the window of re-recovery is low. Simon Riggs and Kyotaro Horiguchi, with input from Fujii Masao. Review by Heikki Linnakangash.测试
pgpool节点
[ha@node0 pgdb]$ pgpool -n -d > /tmp/pgpool.log 2>&1 &
[1] 22928
[ha@node0 pgdb]$ psql -h 192.168.238.129 -p 9999 -d postgres -U ha
Password for user ha:
psql (9.4.5)
Type "help" for help.
postgres=# insert into test values (8);
INSERT 0 1
postgres=# select * from test ;
id
----
1
2
3
4
6
8
(6 rows)
node1节点:
[ha@localhost pgdb]$ ps -ef | grep post
root 2124 1 0 Dec26 ? 00:00:00 /usr/libexec/postfix/master
postfix 2147 2124 0 Dec26 ? 00:00:00 qmgr -l -t fifo -u
postfix 13295 2124 0 06:01 ? 00:00:00 pickup -l -t fifo -u
ha 13395 1 0 06:06 pts/3 00:00:00 /home/ha/pgdb/bin/postgres
ha 13397 13395 0 06:06 ? 00:00:00 postgres: checkpointer process
ha 13398 13395 0 06:06 ? 00:00:00 postgres: writer process
ha 13399 13395 0 06:06 ? 00:00:00 postgres: wal writer process
ha 13400 13395 0 06:06 ? 00:00:00 postgres: autovacuum launcher process
ha 13401 13395 0 06:06 ? 00:00:00 postgres: stats collector process
ha 13404 13395 0 06:07 ? 00:00:00 postgres: wal sender process rep 192.168.238.131(59415) streaming 0/21000060
ha 13418 4087 0 06:07 pts/3 00:00:00 grep post
[ha@localhost pgdb]$ kill -9 13395
pgpool节点:
postgres=# insert into test values (8);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=# insert into test values (8);
INSERT 0 1
postgres=# insert into test values (8);
INSERT 0 1
postgres=# select * from test ;
id
----
1
2
3
4
6
8
8
8
(8 rows)
2.两个pgpool节点
a.环境
pgpool:192.168.238.129 pgpool:192.168.238.131
node1:192.168.238.130
node2:192.168.238.131
b.图例
c.配置互信,同上。
d.数据库节点配置,同上。
e.pgpool配置
node1
listen_addresses = '*'
backend_hostname0 = 'node1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/home/ha/pgdb/data/'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'node2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/home/ha/pgdb/data/'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
authentication
pool_passwd = 'pool_passwd'
pid_file_name = '/home/ha/pgpool/pgpool.pid'
logdir = '/tmp/log'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period =2
sr_check_user = 'ha'
sr_check_password = 'ha'
health_check_period = 1
health_check_timeout = 20
health_check_user = 'ha'
health_check_password = 'ha'
failover_command = '/home/ha/pgpool/fail.sh %H'
recovery_user = 'ha'
recovery_password = 'ha'
use_watchdog = on
wd_hostname = 'node1' #本端
delegate_IP = '192.168.238.151'
#利用ifconfig,查看网卡
if_up_cmd = 'ifconfig eth1:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth1:0 down'
heartbeat_destination0 = 'node2' #对端
heartbeat_device0 = 'eth0'
other_pgpool_hostname0 = 'node2' #对端
other_pgpool_port0 =9999
other_wd_port0 = 9000
g.配置pgpool(从)
listen_addresses = '*'
backend_hostname0 = 'node1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/home/ha/pgdb/data/'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'node2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/home/ha/pgdb/data/'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
authentication
pool_passwd = 'pool_passwd'
pid_file_name = '/home/ha/pgpool/pgpool.pid'
logdir = '/tmp/log'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period =2
sr_check_user = 'ha'
sr_check_password = 'ha'
health_check_period = 1
health_check_timeout = 20
health_check_user = 'ha'
health_check_password = 'ha'
failover_command = '/home/ha/pgpool/fail.sh %H'
recovery_user = 'ha'
recovery_password = 'ha'
use_watchdog = on
wd_hostname = 'node2' #本端
delegate_IP = '192.168.238.151'
#利用ifconfig,查看网卡
if_up_cmd = 'ifconfig eth1:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth1:0 down'
heartbeat_destination0 = 'node1' #对端
heartbeat_device0 = 'eth1'
other_pgpool_hostname0 = 'node1' #对端
other_pgpool_port0 =9999
other_wd_port0 = 9000
h.fail.sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, do nothing. If primary goes down, create a
# trigger file so that standby takes over primary node.
#
# Arguments: $1: failed node id. $2: new master hostname. $3: path to
# trigger file.
new_master=$1
trigger_command="/home/ha/pgdb/bin/pg_ctl -D /home/ha/data start"
# Do nothing if standby goes down.
if [ $failed_node = 1 ]; then
exit 0;
fi
# Create the trigger file.
/usr/bin/ssh -T $new_master $trigger_command
exit 0;
i.建立pool_passwd
pg_md5 -m -p -u postgres pool_passwd
j.测试
#数据库、pgpool启动
[ha@node0 pgdb]$ psql -h 192.168.238.151 -p 9999 -d postgres -U ha
Password for user ha:
psql (9.4.5)
Type "help" for help.
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=#
--杀掉node1的数据库进程
postgres=# insert into test values (9);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
--杀掉node1的pgpool进程
postgres=# insert into test values (9);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=# insert into test values (9);
INSERT 0 1
postgres=# insert into test values (9);
INSERT 0 1
postgres=#