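Only the important information is recorded below. Parts 1, 2 and 3 describe how the error was produced; part 4 is the fix.
1. Rename the data directory of gpseg1's primary so that a primary/mirror failover happens.
-- Renaming a mirror's directory does not make it lose contact right away: gp_segment_connect_timeout defaults to 10 minutes, and the mirror reconnects once the directory is renamed back.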
[gpadmin@sdw2 primary]$ mv gpseg1 gpseg11
[gpadmin@sdw2 primary]$ mv gpseg11 gpseg1
-- gpseg1 has failed over; the mirror, now acting as primary, is recording changes.
[gpadmin@mdw ~]$ gpstate -s
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Segment Info
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Hostname = sdw1
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Address = sdw1
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Datadir = /data1/mirror/gpseg1
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Port = 50000
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Mirroring Info
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Current role = Primary
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Preferred role = Mirror
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Mirror status = Change Tracking
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Change Tracking Info
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Change tracking data size = 100 MB
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Status
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- PID = 10792
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Configuration reports status as = Up
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Database status = Up
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Segment Info
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Hostname = sdw2
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Address = sdw2
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg1
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Port = 40000
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Mirroring Info
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Current role = Mirror
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Preferred role = Primary
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:- Mirror status = Out of Sync <<<<<<<<
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Status
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:- PID = Not found <<<<<<<<
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:- Configuration reports status as = Down <<<<<<<<
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:- Segment status = Down in configuration <<<<<<<<
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:-*****************************************************
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:-Warnings have been generated during status processing
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:-Check log file or review screen output
20150602:21:59:05:053347gpstate:mdw:gpadmin-[WARNING]:-*****************************************************
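When a cluster has many segments, reading this report by eye gets tedious. Below is a minimal Python sketch, not part of any Greenplum tooling, that picks out the segments whose current role differs from their preferred role. The function names (`parse_segments`, `switched_roles`) are hypothetical, and the parsing assumes the report keeps the "key = value" shape shown above.

```python
import re

# Trimmed sample in the shape of the `gpstate -s` report above.
SAMPLE = """\
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Segment Info
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Datadir = /data1/mirror/gpseg1
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Current role = Primary
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Preferred role = Mirror
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Segment Info
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Datadir = /data1/primary/gpseg1
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Current role = Mirror
20150602:21:59:05:053347gpstate:mdw:gpadmin-[INFO]:- Preferred role = Primary
"""


def _norm(key):
    """Lower-case a field name and drop whitespace, so 'Current role' -> 'currentrole'."""
    return re.sub(r"\s+", "", key).lower()


def parse_segments(report):
    """Return one dict per 'Segment Info' block found in the report text."""
    segments = []
    for raw in report.splitlines():
        # Drop the timestamp/host/log-level prefix, which ends with ']:-'.
        body = raw.split("]:-", 1)[-1].strip()
        if _norm(body) == "segmentinfo":
            segments.append({})
        elif "=" in body and segments:
            key, value = body.split("=", 1)
            # Warning lines carry a trailing '<<<<<<<<' marker; strip it.
            segments[-1][_norm(key)] = value.replace("<<<<<<<<", "").strip()
    return segments


def switched_roles(segments):
    """Data directories of segments not running in their preferred role."""
    return [s["datadir"] for s in segments
            if s.get("currentrole") != s.get("preferredrole")]


print(switched_roles(parse_segments(SAMPLE)))
```

In this transcript both instances of content 1 are listed, since the failover swapped them.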
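2. After running a few operations against the database, though, I decided I did not want the data held by gpseg1's mirror; I still wanted the mirror to synchronize from the original primary. (Yes, I am just that stubborn!)
-- Start in master-only mode and edit the system catalog so that the mirror and the primary swap roles and states.
[gpadmin@mdw ~]$ gpstop -a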
[gpadmin@mdw ~]$ gpstart -m
[gpadmin@mdw ~]$ PGOPTIONS="-c gp_session_role=utility" psql
psql (8.2.15)
Type "help" for help.
testDB=# set allow_system_table_mods='dml';
SET
testDB=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    2 |       0 | p    | p              | s    | u      | 40000 | sdw1     | sdw1    |            41000 |
    4 |       0 | m    | m              | s    | u      | 50000 | sdw2     | sdw2    |            51000 |
    3 |       1 | m    | p              | s    | d      | 40000 | sdw2     | sdw2    |            41000 |
    5 |       1 | p    | m              | c    | u      | 50000 | sdw1     | sdw1    |            51000 |
    1 |      -1 | p    | p              | s    | u      |  5432 | mdw      | mdw     |                  |
(5 rows)
--
testDB=# update gp_segment_configuration set role='p',mode='c',status='u' where dbid=3;
UPDATE 1
testDB=# update gp_segment_configuration set role='m',mode='s',status='d' where dbid=5;
UPDATE 1
testDB=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    2 |       0 | p    | p              | s    | u      | 40000 | sdw1     | sdw1    |            41000 |
    4 |       0 | m    | m              | s    | u      | 50000 | sdw2     | sdw2    |            51000 |
    3 |       1 | p    | p              | c    | u      | 40000 | sdw2     | sdw2    |            41000 |
    5 |       1 | m    | m              | s    | d      | 50000 | sdw1     | sdw1    |            51000 |
    1 |      -1 | p    | p              | s    | u      |  5432 | mdw      | mdw     |                  |
(5 rows)
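The two UPDATE statements above amount to exchanging the role, mode and status values between dbid 3 and dbid 5. The Python sketch below only builds those statements as strings from the two rows so the swap can be reviewed before anything is typed into psql; it connects to nothing and runs nothing. The helper name `swap_statements` is hypothetical, and hand-editing gp_segment_configuration remains as risky as it looks.

```python
def swap_statements(row_a, row_b):
    """Build UPDATEs that exchange role, mode and status between two catalog rows.

    Each row is a dict with the keys dbid, role, mode and status.
    Nothing is executed; the caller decides what to do with the strings.
    """
    def update(target, source):
        # The target row receives the values currently held by the source row.
        return (
            "update gp_segment_configuration "
            "set role='{role}',mode='{mode}',status='{status}' "
            "where dbid={dbid};"
        ).format(
            role=source["role"],
            mode=source["mode"],
            status=source["status"],
            dbid=target["dbid"],
        )

    return [update(row_a, row_b), update(row_b, row_a)]


# Values taken from the first catalog listing in this section (content = 1).
old_primary = {"dbid": 3, "role": "m", "mode": "s", "status": "d"}
old_mirror = {"dbid": 5, "role": "p", "mode": "c", "status": "u"}

for statement in swap_statements(old_primary, old_mirror):
    print(statement)
```

With the rows from the first listing, the generated statements are the same two updates that were run by hand.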
-- Shut down master-only mode.
[gpadmin@mdw ~]$ gpstop -M fast
-- Start the database; one segment does not come up.
[gpadmin@mdw ~]$ gpstart -a
20150602:22:08:00:055767gpstart:mdw:gpadmin-[INFO]:-Process results...
20150602:22:08:00:055767gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150602:22:08:00:055767gpstart:mdw:gpadmin-[INFO]:- Successful segment starts = 3
20150602:22:08:00:055767gpstart:mdw:gpadmin-[INFO]:- Failed segment starts = 0
20150602:22:08:00:055767gpstart:mdw:gpadmin-[WARNING]:-Skipped segment starts (segments are marked down in configuration) = 1 <<<<<<<<
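3. Try a recovery. It is bound to fail, because before this startup the primary had not recorded any changes at all, while the mirror had by then accumulated a fair amount of change records.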
[gpadmin@mdw ~]$ gprecoverseg
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-Starting gprecoverseg with args:
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.5.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 14 2015 14:07:14'
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-Checking if segments are ready
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-Recovery type = Standard
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:-Recovery 1 of 1
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Synchronization mode = Incremental
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance host = sdw1
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance address = sdw1
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance directory = /data1/mirror/gpseg1
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance port = 50000
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance replication port = 51000
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance host = sdw2
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance address = sdw2
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance directory = /data1/primary/gpseg1
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance port = 40000
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance replication port = 41000
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target = in-place
20150602:22:16:40:062823gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
Continue with segment recovery procedure Yy|Nn (default=N):
> y
20150602:22:16:46:062823gprecoverseg:mdw:gpadmin-[INFO]:-1 segment(s) to recover
20150602:22:16:46:062823gprecoverseg:mdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20150602:22:16:47:062823gprecoverseg:mdw:gpadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
updating flat files
20150602:22:16:52:062823gprecoverseg:mdw:gpadmin-[INFO]:-Updating configuration with new mirrors
20150602:22:16:52:062823gprecoverseg:mdw:gpadmin-[INFO]:-Updating mirrors
.
20150602:22:16:53:062823gprecoverseg:mdw:gpadmin-[INFO]:-Starting mirrors
20150602:22:16:53:062823gprecoverseg:mdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
..
20150602:22:16:55:062823gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
20150602:22:16:55:062823gprecoverseg:mdw:gpadmin-[INFO]:-Updating configuration to mark mirrors up
20150602:22:16:55:062823gprecoverseg:mdw:gpadmin-[INFO]:-Updating primaries
20150602:22:16:55:062823gprecoverseg:mdw:gpadmin-[INFO]:-Commencing parallel primary conversion of 1 segments, please wait...
..
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[WARNING]:-Failed to inform primary segment of updated mirroring state. Segment:sdw2:/data1/primary/gpseg1:content=1:dbid=3:mode=r:status=u: REASON: Conversion failed. stdout:"" stderr:"failure: Error: MirroringFailurefailure: Error: MirroringFailure "
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-Done updating primaries
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-Updating segments for resynchronization is completed.
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-Use gpstate -s to check the resynchronization progress.
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************
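Note that gprecoverseg still prints its closing banner here, so the failure is easy to miss among the INFO lines. As a rough aid, the Python sketch below scans captured output for the warning seen in this run. It is a heuristic based solely on this transcript: the function name `incremental_recovery_failed` is hypothetical, and matching on the string "MirroringFailure" is an assumption rather than a documented interface.

```python
def incremental_recovery_failed(log_text):
    """Return True if any WARNING line in the captured output mentions MirroringFailure.

    Heuristic only: the pattern is taken from the single failure shown in this
    post and may not cover other ways an incremental recovery can go wrong.
    """
    for line in log_text.splitlines():
        if "[WARNING]" in line and "MirroringFailure" in line:
            return True
    return False


# Two lines copied from the run above, shortened for readability.
FAILED_RUN = """\
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
20150602:22:16:57:062823gprecoverseg:mdw:gpadmin-[WARNING]:-Failed to inform primary segment of updated mirroring state. REASON: Conversion failed. stderr:"failure: Error: MirroringFailure"
"""

if incremental_recovery_failed(FAILED_RUN):
    print("incremental recovery did not take effect; check gpstate before going further")
```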
4. Run a full recovery, which simply copies everything across no matter what has changed.
[gpadmin@mdw ~]$ gprecoverseg -F
20150602:22:19:42:065360gprecoverseg:mdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -F
20150602:22:19:42:065360gprecoverseg:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20150602:22:19:42:065360gprecoverseg:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.5.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 14 2015 14:07:14'
20150602:22:19:42:065360gprecoverseg:mdw:gpadmin-[INFO]:-Checking if segments are ready
20150602:22:19:42:065360gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:-Recovery type = Standard
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:-Recovery 1 of 1
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Synchronization mode = Full
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance host = sdw1
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance address = sdw1
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance directory = /data1/mirror/gpseg1
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance port = 50000
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Failed instance replication port = 51000
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance host = sdw2
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance address = sdw2
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance directory = /data1/primary/gpseg1
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance port = 40000
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Source instance replication port = 41000
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:- Recovery Target = in-place
20150602:22:19:43:065360gprecoverseg:mdw:gpadmin-[INFO]:----------------------------------------------------------
Continue with segment recovery procedure Yy|Nn (default=N):
> y
20150602:22:19:45:065360gprecoverseg:mdw:gpadmin-[INFO]:-1 segment(s) to recover
20150602:22:19:45:065360gprecoverseg:mdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20150602:22:19:45:065360gprecoverseg:mdw:gpadmin-[INFO]:-14885: /data1/mirror/gpseg1
20150602:22:19:47:065360gprecoverseg:mdw:gpadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
20150602:22:19:53:065360gprecoverseg:mdw:gpadmin-[INFO]:-Cleaning files from 1 segment(s)
.
20150602:22:19:54:065360gprecoverseg:mdw:gpadmin-[INFO]:-Building template directory
20150602:22:19:54:065360gprecoverseg:mdw:gpadmin-[INFO]:-Validating remote directories
.
20150602:22:19:55:065360gprecoverseg:mdw:gpadmin-[INFO]:-Copying template directory file
.
20150602:22:19:56:065360gprecoverseg:mdw:gpadmin-[INFO]:-Configuring new segments
.
20150602:22:19:57:065360gprecoverseg:mdw:gpadmin-[INFO]:-Cleaning files
.
20150602:22:19:58:065360gprecoverseg:mdw:gpadmin-[INFO]:-Starting file move procedure for sdw1:/data1/mirror/gpseg1:content=1:dbid=5:mode=r:status=d
updating flat files
20150602:22:19:58:065360gprecoverseg:mdw:gpadmin-[INFO]:-Updating configuration with new mirrors
20150602:22:19:58:065360gprecoverseg:mdw:gpadmin-[INFO]:-Updating mirrors
.
20150602:22:19:59:065360gprecoverseg:mdw:gpadmin-[INFO]:-Starting mirrors
20150602:22:19:59:065360gprecoverseg:mdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait.....
20150602:22:20:01:065360gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
20150602:22:20:01:065360gprecoverseg:mdw:gpadmin-[INFO]:-Updating configuration to mark mirrors up
20150602:22:20:01:065360gprecoverseg:mdw:gpadmin-[INFO]:-Updating primaries
20150602:22:20:01:065360gprecoverseg:mdw:gpadmin-[INFO]:-Commencing parallel primary conversion of 1 segments, please wait........
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-Process results...
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-Done updating primaries
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-Updating segments for resynchronization is completed.
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-Use gpstate -s to check the resynchronization progress.
20150602:22:20:06:065360gprecoverseg:mdw:gpadmin-[INFO]:-******************************************************************
[gpadmin@mdw ~]$ gpstate -e
20150602:22:22:00:067423gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20150602:22:22:00:067423gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20150602:22:22:00:067423gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.5.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 14 2015 14:07:14'
20150602:22:22:00:067423gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20150602:22:22:00:067423gpstate:mdw:gpadmin-[INFO]:-Gathering data from segments...
.
20150602:22:22:01:067423gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150602:22:22:01:067423gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20150602:22:22:01:067423gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150602:22:22:01:067423gpstate:mdw:gpadmin-[INFO]:-Segment Pairs in Resynchronization
20150602:22:22:01:067423gpstate:mdw:gpadmin-[INFO]:- Current Primary   Port    Resync mode   Est. resync progress   Total resync objects   Objects to resync   Data synced   Est. total to sync   Est. resync end time   Change tracking size   Mirror   Port
20150602:22:22:01:067423gpstate:mdw:gpadmin-[INFO]:- sdw2              40000   Full          99.50%                 2321                   0                   933 MB        938 MB               2015-06-02 22:22:00    150 MB                 sdw1     50000
-- State of the system catalog while the recovery is running
testDB=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    2 |       0 | p    | p              | s    | u      | 40000 | sdw1     | sdw1    |            41000 |
    4 |       0 | m    | m              | s    | u      | 50000 | sdw2     | sdw2    |            51000 |
    3 |       1 | p    | p              | r    | u      | 40000 | sdw2     | sdw2    |            41000 |
    5 |       1 | m    | m              | r    | u      | 50000 | sdw1     | sdw1    |            51000 |
    1 |      -1 | p    | p              | s    | u      |  5432 | mdw      | mdw     |                  |
(5 rows)
-- Recovery succeeded
[gpadmin@mdw ~]$ gpstate -e
20150602:22:23:51:068918gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150602:22:23:51:068918gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20150602:22:23:51:068918gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20150602:22:23:51:068918gpstate:mdw:gpadmin-[INFO]:-All segments are running normally
-- State of the system catalog after the recovery has succeeded
testDB=# select * from gp_segment_configuration;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    2 |       0 | p    | p              | s    | u      | 40000 | sdw1     | sdw1    |            41000 |
    4 |       0 | m    | m              | s    | u      | 50000 | sdw2     | sdw2    |            51000 |
    3 |       1 | p    | p              | s    | u      | 40000 | sdw2     | sdw2    |            41000 |
    5 |       1 | m    | m              | s    | u      | 50000 | sdw1     | sdw1    |            51000 |
    1 |      -1 | p    | p              | s    | u      |  5432 | mdw      | mdw     |                  |
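The catalog listings in this post can be compared mechanically. Here is a small Python sketch that parses psql's table output and lists the dbids that are not yet back to normal. It treats "normal" as role equal to preferred_role, mode 's' and status 'u', which is my reading of the final listing rather than an official definition. The function names are hypothetical, and the sample is trimmed to the first six columns.

```python
def parse_psql_table(text):
    """Turn psql's aligned table output into a list of dicts keyed by column name."""
    # Keep only header and data rows; the '----+----' rule and '(5 rows)' have no '|'.
    lines = [line for line in text.strip().splitlines() if "|" in line]
    header = [cell.strip() for cell in lines[0].split("|")]
    return [
        dict(zip(header, (cell.strip() for cell in line.split("|"))))
        for line in lines[1:]
    ]


def out_of_sync(rows):
    """dbids whose role, mode or status differs from the assumed healthy state."""
    return [
        int(row["dbid"])
        for row in rows
        if not (
            row["role"] == row["preferred_role"]
            and row["mode"] == "s"
            and row["status"] == "u"
        )
    ]


# Listing captured while the full recovery was still running (first six columns).
DURING_RECOVERY = """
 dbid | content | role | preferred_role | mode | status
------+---------+------+----------------+------+--------
    2 |       0 | p    | p              | s    | u
    4 |       0 | m    | m              | s    | u
    3 |       1 | p    | p              | r    | u
    5 |       1 | m    | m              | r    | u
    1 |      -1 | p    | p              | s    | u
(5 rows)
"""

print(out_of_sync(parse_psql_table(DURING_RECOVERY)))
```

Run against the listing taken during recovery, this reports dbid 3 and dbid 5; run against the final listing, it reports nothing.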
-- When reposting, please credit the source: blog.csdn.net/aabc012