当需要移除greenplum集群中的某些主机,并将其上的数据均匀地恢复到其他主机上时, 对这些主机单独备份,并进行串行恢复就是很好的选择。
本文探讨性地介绍了对单个节点备份(指定多个节点同时备份也有介绍),并进行串行恢复的具体操作。
1,首先看一下gp_dump都有哪些选项
[gpadmin@mdw~]$ gp_dump --help
gp_dumpdumps a database as a text file or to other formats.
--gp_dump 以text文件或其他格式备份数据库
Usage:
gp_dump [OPTION]... [DBNAME]
--使用格式
Generaloptions:
--总体选项
-i, --ignore-version proceed even when server versionmismatches gp_dump version
--忽略主机版本和gp_dump版本的差异进行
-v, --verbose verbose mode. adds verbose informationto the per segment status files
--冗长模式,将详细的信息加入到每个segment的状态文件
--help show this help, then exit
--显示帮助,然后退出
--version output version information, then exit
--显示版本信息,然后退出
Optionscontrolling the output content:
--输出内容控制选项
-a, --data-only dump only the data, not the schema
--只导出数据,不导出模式
-c, --clean clean (drop) schema prior tocreate
--导出的数据中包括删除模式语句
-d, --inserts dump data as INSERT, rather thanCOPY, commands
--导出的数据格式为insert语句,原本为copy命令导出的纯文本数据
-D, --column-inserts dump data as INSERT commands with columnnames
--在以insert导出的语句中包括字段名
-E, --encoding=ENCODING dump the data in encoding ENCODING
--导出的数据编码
-n, --schema=SCHEMA dump the named schema only
--导出指定的schema
-N, --exclude-schema=SCHEMA do NOT dump the named schema(s)
--导出时排除指定的schema
-o, --oids include OIDs in dump
--将OID也导出
-O, --no-owner do not output commands to set object ownershipin plain text format
--不导出owner语句
-s, --schema-only dump only the schema, no data
--只导出schema(数据定义),不导出数据
-S, --superuser=NAME specify the superuser user name to use in plaintext format
--指定superuser
-t, --table=TABLE dump only matching table(s) (or viewsor sequences)
--只导出指定的表
-T, --exclude-table=TABLE do NOT dump matching table(s) (or views orsequences)
--不导出指定的表
-x, --no-privileges do not dump privileges (grant/revoke)
--不导出权限信息
--disable-triggers disable triggers during data-onlyrestore
--当只还原数据时触发器不可用,gp中触发器本来就用不了
--incremental dump only modified patitions
--patitions这个单词可能错了,只导出修改过的分区
--use-set-session-authorization use SESSION AUTHORIZATION commands insteadof
ALTER OWNERcommands to set ownership
--使用SESSIONAUTHORIZATION命令代替ALTER OWNER命令来设定所有权
Connectionoptions:
--联接选项
-h, --host=HOSTNAME database server host or socket directory
--master主机名
-p, --port=PORT database server port number
--master 端口
-U, --username=NAME connect as specified database user
--用户名
-W, --password force password prompt (should happenautomatically)
--强制使用密码
GreenplumDatabase specific options:
-- GreenplumDatabase特殊选项
--gp-c use gzip for in-linecompression
--使用gzip压缩
--gp-d=BACKUPFILEDIR directory where backup files are placed
--指定导出的数据目录
--gp-k=TIMESTAMP timestamp key to be used for the dumpfiles
--导出的文件使用时间戳
--gp-r=REPORTFILEDIR directory where report file is placed
--指定report目录
--gp-s=BACKUPSET backup set indicator - (p)rimaries only(default)
or (i)ndividual segdb(must be followed with a list of dbids
of primary segmentsto dump. For example: --gp-s=i[10,12,14]
--备份单个节点上的数据在于该命令,备份指示设定,(p)是默认导出所有primary上的数据,
(i)可以在后面加入一系列的dbid来指定导出哪几个primary上的数据,如:--gp-s=i[10,12,14]
--rsyncable pass --rsyncable option to gzip
--远程同步
If nodatabase name is supplied, then the PGDATABASE environment variable value isused.
--如果没有指定数据库名,将使用PGDATABASE环境变量值,就是默认数据库。
2,--查看节点配置信息,准备只备份sdw2这个主机上的数据,其上的primary的dbid为3
gpdumptest =#select * from gp_segment_configuration;
dbid | content | role | preferred_role | mode| status | port | hostname | address |replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
1 | -1 | p | p | s | u | 5432 | mdw | mdw | |
2 | 0 | p | p | s |u | 40000 | sdw1 | sdw1 | 41000 |
4 | 0 | m | m | s | u | 50000 | sdw2 | sdw2 | 51000 |
3 | 1 | p | p | s | u | 40000 | sdw2 | sdw2 | 41000 |
5 | 1 | m | m | s | u | 50000 | sdw1 | sdw1 | 51000 |
(5 rows)
gpdumptest=#\d
List of relations
Schema | Name | Type | Owner | Storage
--------+------+-------+---------+---------
public | test | table | gpadmin | heap
(1 row)
-- gpdumptest该数据库上只有一个关系。
gpdumptest=#select * from test;
id | id2
----+-----
2 | 3
4 | 5
6 | 7
8 | 9
10 | 11
1 | 2
3 | 4
5 | 6
7 | 8
9 | 10
(10 rows)
--10行数据
3,
--备份dbid=3上的数据到sdw2主机的/home/gpadmin/gp_dump/下,报告是位于master主机/home/gpadmin/gp_dump/下。
--存在多个数据库时,要多次备份并进行恢复。
--由记录可看出只导出了dbid=3上相关数据库上的数据。
[gpadmin@mdw~]$ gp_dump --gp-s=i[3] --gp-d=/home/gpadmin/gp_dump/ --gp-r=/home/gpadmin/gp_dump/ gpdumptest
20150612:21:59:08|gp_dump-[INFO]:-Readparams: <empty>
20150612:21:59:08|gp_dump-[INFO]:-Commandline options analyzed.
20150612:21:59:08|gp_dump-[INFO]:-Connectingto master database on host localhost port 5432 database gpdumptest.
20150612:21:59:08|gp_dump-[INFO]:-ReadingGreenplum Database configuration info from master database.
20150612:21:59:08|gp_dump-[INFO]:-Preparingto dump the following segments:
20150612:21:59:08|gp_dump-[INFO]:-Segment1 (dbid 3)
20150612:21:59:08|gp_dump-[INFO]:-Startinga transaction on master database gpdumptest.
20150612:21:59:08|gp_dump-[INFO]:-Gettinga lock on pg_class in database gpdumptest.
20150612:21:59:08|GetTimestampKey-[INFO]:-Timestampkey is generated as it is not provided by the user.
20150612:21:59:08|gp_dump-[INFO]:-Aboutto spin off 1 threads with timestamp key 20150612215908
20150612:21:59:08|gp_dump-[INFO]:-Creatingthread to backup dbid 3: host sdw2 port 40000 database gpdumptest
20150612:21:59:08|gp_dump-[INFO]:-Waitingfor remote gp_dump_agent processes to start transactions in serializableisolation level
20150612:21:59:08|gp_dump-[INFO]:-Listeningfor messages from server on dbid 3 connection
20150612:21:59:08|gp_dump-[INFO]:-Successfullylaunched Greenplum Database backup on dbid 3 server
20150612:21:59:10|gp_dump-[INFO]:-Allremote gp_dump_agent processes have began transactions in serializableisolation level
20150612:21:59:10|gp_dump-[INFO]:-Waitingfor remote gp_dump_agent processes to obtain local locks on dumpable objects
20150612:21:59:10|gp_dump-[INFO]:-Allremote gp_dump_agent processes have obtains the necessary locks
20150612:21:59:10|gp_dump-[INFO]:-Committingtransaction on the master database, thereby releasing locks.
20150612:21:59:10|gp_dump-[INFO]:-Waitingfor all remote gp_dump_agent programs to finish.
20150612:21:59:10|gp_dump-[INFO]:-backupsucceeded for dbid 3 on host sdw2
20150612:21:59:10|gp_dump-[INFO]:-Allremote gp_dump_agent programs are finished.
20150612:21:59:10|gp_dump-[INFO]:-Reportresults also written to /home/gpadmin/gp_dump/gp_dump_20150612215908.rpt.
GreenplumDatabase Backup Report
TimestampKey: 20150612215908
gp_dumpCommand Line: --gp-s=i[3] --gp-d=/home/gpadmin/gp_dump/ --gp-r=/home/gpadmin/gp_dump/gpdumptest
Pass throughCommand Line Options: None
CompressionProgram: None
Backup Type:Full
IndividualResults
segment 1 (dbid 3) Host sdw2 Port 40000Database gpdumptest BackupFile/home/gpadmin/gp_dump/gp_dump_0_3_20150612215908: Succeeded
gp_dumputility finished successfully.
--备份文件中的记录,备份了5行数据。
[gpadmin@sdw2 gp_dump]$ cat gp_dump_0_3_20150612215908
--
-- Greenplum Database database dump
--
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;
SET search_path = public, pg_catalog;
SET default_with_oids = false;
--
-- Data for Name: test; Type: TABLE DATA; Schema: public; Owner:gpadmin
--
COPY test (id, id2) FROM stdin;
2 3
4 5
6 7
8 9
10 11
\.
--
-- Greenplum Database database dump complete
--
4,--均匀恢复该数据到所有主机上。
--如果要删除相应主机,此时删除再进行恢复,本文中的示例并没有删除主机,相应的操作可见另一篇关于删除节点的文章。
--将该数据恢复需要将其传到master主机上,再由master主机串行恢复。
--注意此处恢复是无法使用gp_restore的,而要用以下格式。
[gpadmin@mdwgp_dump]$ psql gpdumptest -f gp_dump_0_3_20150612215908
SET
SET
SET
SET
SET
SET
SET
SET
--多了5行刚才备份后恢复的数据。
gpdumptest=#select * from test;
id | id2
----+-----
2 | 3
4 | 5
6 | 7
8 | 9
10 | 11
2 | 3
4 | 5
6 | 7
8 | 9
10 | 11
1 | 2
3 | 4
5 | 6
7 | 8
9 | 10
(15 rows)
转载请注明出处blog.csdn.net/aabc012