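We may well have sized the cluster when we designed the system architecture, but reality rarely goes as predicted. As business load keeps growing and the system starts to buckle under it, it is probably time to expand the cluster.
In Greenplum (GP), the gpexpand utility helps us expand an existing cluster.
The whole process breaks down roughly into the following stages: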
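1. Preparing
Get the machines ready and set up the software environment, then check them with the tools that ship with GP, such as gpcheckos. Create the directories for the new segments, taking care to keep them consistent with the existing cluster.
2. Initializing New Segments
This stage needs an input file. You can create it by hand or have gpexpand build it interactively; the Admin Guide says GP recommends the latter. See the Admin Guide for the detailed steps; here I will just record what I did.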
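For anyone who writes the input file by hand instead, here is a minimal Python sketch of how one line could be assembled. It assumes the colon-separated layout hostname:address:port:fselocation:dbid:content:preferred_role:replication_port, which is how I remember the 4.x documentation describing it; check the Admin Guide for your version before relying on it. The NewSegment class, the host hadoop4, and every dbid, content, and port value are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class NewSegment:
    """One segment to add. The field order is an ASSUMED input-file layout."""
    hostname: str
    address: str
    port: int
    fselocation: str       # data directory of the new segment
    dbid: int
    content: int
    preferred_role: str    # "p" for primary, "m" for mirror
    replication_port: int

    def to_line(self) -> str:
        """Join the fields with ':' in the assumed order."""
        if self.preferred_role not in ("p", "m"):
            raise ValueError("preferred_role must be 'p' or 'm'")
        fields = (
            self.hostname, self.address, self.port, self.fselocation,
            self.dbid, self.content, self.preferred_role,
            self.replication_port,
        )
        return ":".join(str(f) for f in fields)


# Hypothetical new primary on a hypothetical host "hadoop4".
example = NewSegment(
    hostname="hadoop4",
    address="hadoop4",
    port=30000,
    fselocation="/home/gpadmin1/gpdatap1/aligp6",
    dbid=14,
    content=6,
    preferred_role="p",
    replication_port=31000,
)

if __name__ == "__main__":
    print(example.to_line())
```

Keeping the fields in a small class makes it harder to swap two numeric columns by accident, which is an easy mistake when typing such lines manually.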
//An overview of the existing cluster
[gpadmin1@hadoop1 conf]$ gpstate -c
20101029:14:19:59:gpstate:hadoop1:gpadmin1-[INFO]:-Starting gpstate with args: -c
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.0.1.0 build 1'
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:-Obtaining Segment details from master...
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:--------------------------------------------------------------
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:--Current GPDB mirror list and status
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:--Type = Group
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:--------------------------------------------------------------
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Status Data State Primary Datadir Port Mirror Datadir Port
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop1 /home/gpadmin1/gpdatap1/aligp0 30000 hadoop2 /home/gpadmin1/gpdatam1/aligp0 40000
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop1 /home/gpadmin1/gpdatap2/aligp1 30001 hadoop2 /home/gpadmin1/gpdatam2/aligp1 40001
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop2 /home/gpadmin1/gpdatap1/aligp2 30000 hadoop3 /home/gpadmin1/gpdatam1/aligp2 40000
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop2 /home/gpadmin1/gpdatap2/aligp3 30001 hadoop3 /home/gpadmin1/gpdatam2/aligp3 40001
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop3 /home/gpadmin1/gpdatap1/aligp4 30000 hadoop1 /home/gpadmin1/gpdatam1/aligp4 40000
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop3 /home/gpadmin1/gpdatap2/aligp5 30001 hadoop1 /home/gpadmin1/gpdatam2/aligp5 40001
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:--------------------------------------------------------------
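Before adding segments it can be useful to tally how many primaries and mirrors each host already carries, so that the new layout stays comparable. The following is a rough Python sketch that parses the gpstate -c rows shown above. The function names parse_gpstate_c and segments_per_host and the regular expression are my own and not part of Greenplum, and the pattern only targets the row layout seen in this particular output.

```python
import re
from collections import Counter

# Header and segment rows copied from the `gpstate -c` output above.
SAMPLE = """\
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Status Data State Primary Datadir Port Mirror Datadir Port
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop1 /home/gpadmin1/gpdatap1/aligp0 30000 hadoop2 /home/gpadmin1/gpdatam1/aligp0 40000
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop1 /home/gpadmin1/gpdatap2/aligp1 30001 hadoop2 /home/gpadmin1/gpdatam2/aligp1 40001
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop2 /home/gpadmin1/gpdatap1/aligp2 30000 hadoop3 /home/gpadmin1/gpdatam1/aligp2 40000
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop2 /home/gpadmin1/gpdatap2/aligp3 30001 hadoop3 /home/gpadmin1/gpdatam2/aligp3 40001
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop3 /home/gpadmin1/gpdatap1/aligp4 30000 hadoop1 /home/gpadmin1/gpdatam1/aligp4 40000
20101029:14:20:00:gpstate:hadoop1:gpadmin1-[INFO]:- Primary Active, Mirror Available Synchronized hadoop3 /home/gpadmin1/gpdatap2/aligp5 30001 hadoop1 /home/gpadmin1/gpdatam2/aligp5 40001
"""

# A segment row ends with: host, absolute datadir, port (primary), then the
# same three columns for the mirror. The header row has no absolute path and
# no numeric port, so it does not match.
ROW = re.compile(
    r"\[INFO\]:-\s+(?P<status>.+?)\s+(?P<state>\S+)\s+"
    r"(?P<p_host>\S+)\s+(?P<p_dir>/\S+)\s+(?P<p_port>\d+)\s+"
    r"(?P<m_host>\S+)\s+(?P<m_dir>/\S+)\s+(?P<m_port>\d+)\s*$"
)


def parse_gpstate_c(text):
    """Return one dict per primary/mirror pair found in the log text."""
    rows = []
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            row = match.groupdict()
            row["p_port"] = int(row["p_port"])
            row["m_port"] = int(row["m_port"])
            rows.append(row)
    return rows


def segments_per_host(rows):
    """Count primaries and mirrors hosted on each machine."""
    primaries = Counter(row["p_host"] for row in rows)
    mirrors = Counter(row["m_host"] for row in rows)
    return dict(primaries), dict(mirrors)


if __name__ == "__main__":
    primaries, mirrors = segments_per_host(parse_gpstate_c(SAMPLE))
    print("primaries per host:", primaries)
    print("mirrors per host:", mirrors)
```

On this cluster the tally comes out as two primaries and two mirrors on each of hadoop1, hadoop2, and hadoop3, with every mirror placed on a different host from its primary.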
//Create the input file interactively with the gpexpand command
[gpadmin1@hadoop1 conf]$ gpexpand
20101029:14:21:33:gpexpand:hadoop1:gpadmin1-[INFO]:-Querying gpexpand schema for current expansion state
System Expansion is used to add segments to an existing GPDB array.
gpexpand did not detect a System Expansion that is in progress.
Before initiating a System Expansion, you need to provision and burn-in
the new hardware. Please be sure to run gpcheckperf/gpcheckos to make
sure the new hardware is working properly.
Please refer to the Admin Guide for more information.
Would you like to initiate a new System Expansion Yy|Nn (default=N):
> y
This utility can handle some expansion scenarios by asking a few questions.
More complex expansions can be done by providing an input file with
the --input <file>. Please see the docs for the format of this file.
The current system appears to be non-standard.
The address value for hadoop2 does not correspond to a standard address.
gpexpand may not be able to symmetrically distribute the new segments appropriately.
It is recommended that you specify your own input file with appropriate values.
Are you sure you want