Greenplum安装
评估存储能力
计算可用的空间
步骤1:初始存储能力=硬盘大小硬盘数
步骤2:配置RAID10,格式化磁盘空间=(初始存储能力0.9)/2
步骤3:可用磁盘空间=格式化磁盘空间*0.7
步骤4:用户数据使用空间
使用镜像:(2*用户数据)+用户数据/3=可用磁盘空间
不使用镜像:用户数据+用户数据/3=可用磁盘空间
计算用户数据大小
平均来说,实际占用磁盘空间大小=用户数据1.4
页面开销:32KB页面需要20 bytes
行开销:每行24 bytes,'append-only’表需要4bytes
索引开销:
B-tree:唯一值(数据类型大小+24bytes) Bitmap:(唯一值行数1bit压vi缩比率/8)+(唯一值32)
为元数据和日志计算空间需求
系统元数据:20M
预写日志(WAL):WAL被拆分成多个64M的文件,WAL文件数最多为
2*checkpoint_segments+1,checkpoint_segments默认值为8。也就意味着每个实例需要1088MB的WAL空间
GP数据库日志文件:日志轮转
性能监控数据
实验环境
硬件环境
VMware虚拟机软件10.0
三台Linux虚拟机:Red Hat Enterprise Linux Serverrelease 5.42
数据库:greenplum-db-4.2.8.0-build-1-RHEL5-x86_64.zip
网卡设置
虚拟机配置
基本信息配置
节点
IP
内存
硬盘
master
192.168.80.200
1024M
20G
slave1
192.168.80.201
1024M
20G
slave2
192.168.80.202
1024M
20G
设置IP地址的命令行方式
vi /etc/sysconfig/network-scripts/ifcfg-eth0
填写ip地址、子网掩码、网关、DNS等
DEVICE=eth0
HWADDR=00:0c:29:b7:7d:69
ONBOOT=yes
BOOTPROTO=static
DNS1=192.168.80.200
IPV6INIT=no
USERCTL=no
IPADDR=192.168.80.200
NETMASK=255.255.255.0
GATEWAY=192.168.80.1
关闭防火墙
[root@localhost ~]# service iptables status
表格:filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 RH-Firewall-1-INPUT all – 0.0.0.0/0 0.0.0.0/0
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
1 RH-Firewall-1-INPUT all – 0.0.0.0/0 0.0.0.0/0
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
Chain RH-Firewall-1-INPUT (2 references)
num target prot opt source destination
1 ACCEPT all – 0.0.0.0/0 0.0.0.0/0
2 ACCEPT icmp – 0.0.0.0/0 0.0.0.0/0 icmp type 255
3 ACCEPT esp – 0.0.0.0/0 0.0.0.0/0
4 ACCEPT ah – 0.0.0.0/0 0.0.0.0/0
5 ACCEPT udp – 0.0.0.0/0 224.0.0.251 udp dpt:5353
6 ACCEPT udp – 0.0.0.0/0 0.0.0.0/0 udp dpt:631
7 ACCEPT tcp – 0.0.0.0/0 0.0.0.0/0 tcp dpt:631
8 ACCEPT all – 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
9 ACCEPT tcp – 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:21
10 ACCEPT tcp – 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
11 ACCEPT tcp – 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:23
12 ACCEPT tcp – 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:80
13 ACCEPT tcp – 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:443
14 REJECT all – 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
[root@localhost ~]# service iptables stop
清除防火墙规则: [确定]
把 chains 设置为 ACCEPT 策略:filter [确定]
正在卸载 Iiptables 模块: [确定]
修改主机名和IP映射:
[root@master ~]# vi /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=master
[root@localhost ~]# vi /etc/hosts
Do not remove the following line, or various programs
that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.80.200 master
系统参数配置
包括共享内存、网络、用户限制等,修改或添加/etc/sysctl.conf
Kernel sysctl configuration file for Red Hat Linux
For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
sysctl.conf(5) for more details.
Controls source route verification
net.ipv4.conf.default.rp_filter = 1
#for greenplum begin
xfs_mount_options = rw,noatime,inode64,allocsize=16m
Controls the maximum shared segment size, in bytes
kernel.shmmax = 500000000
Controls the minimum shared segment size, in bytes
kernel.shmmni = 4096
Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
Controls the System Request debugging functionality of the kernel
kernel.sysrq = 1
Controls whether core dumps will append the PID to the core filename
Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1
Controls the maximum size of a message, in bytes
kernel.msgmnb = 65536
Controls the default maxmimum size of a mesage queue
kernel.msgmax = 65536
Controls the default minmimum size of a mesage queue
kernel.msgmni = 2048
Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
Controls IP packet forwarding
net.ipv4.ip_forward = 0
Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2
#for greenplum end
系统设置(所有节点)
配置/etc/security/limits.conf文件
for greenplum begin
- soft nofile 65536
- hard nofile 65536
- soft nproc 131072
- hard nproc 131072
for greenplum end
设置磁盘访问I/O调度策略
说明:介绍文章:http://blog.csdn.net/theorytree/article/details/6259104
deadline的调度策略可以平衡IO调度和进程调度,不会造成等待进程频繁的等待
[root@master ~] # cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]
[root@master ~] # cat /sys/block/fd0/queue/scheduler
noop anticipatory deadline [cfq]
[root@master ~]# dmesg | grep -i scheduler
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
[root@master ~]# echo deadline > /sys/block/sda/queue/scheduler
[root@master ~]# echo deadline > /sys/block/fd0/queue/scheduler
[root@master ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq
[root@master ~]# cat /sys/block/fd0/queue/scheduler
noop anticipatory [deadline] cfq
设置预读块的值为16384;
16384:数据仓库的最大特点是用于保存历史数据,存在大量的数据操作,包括增删改查,当设置的块越大时读取性能越高;16384是greenplum数据库要求的最低要求。
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda1
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda2
[root@master ~]# /sbin/blockdev --setra 16384 /dev/sda3
[root@master ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 653 5245191 83 Linux
/dev/sda2 654 914 2096482+ 82 Linux swap / Solaris
/dev/sda3 915 1045 1052257+ 83 Linux
修改/etc/hosts,添加如下内容
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.80.200 master
192.168.80.201 slave1
192.168.80.202 slave2
在Master节点上安装Greenplum软件
准备介质
下载地址:http://gopivotal.com/products/pivotal-greenplum-database
解压
#unzip greenplum-db-4.2.8.0-build-1-RHEL5-x86_64.zip
安装软件
#/bin/bash greenplum-db-4.2.8.0-build-1-RHEL5-x86_64.bin
I HAVE READ AND AGREE TO THE TERMS OF THE ABOVE PIVOTAL SOFTWARE
LICENSE AGREEMENT.
Do you accept the Pivotal Database license agreement? [yes|no]
yes
Provide the installation path for Greenplum Database or press ENTER to
accept the default installation path: /usr/local/greenplum-db-4.2.8.0
Install Greenplum Database into </usr/local/greenplum-db-4.2.8.0>? [yes|no]
yes
/usr/local/greenplum-db-4.2.8.0 does not exist.
Create /usr/local/greenplum-db-4.2.8.0 ? [yes|no]
(Selecting no will exit the installer)
yes
[Optional] Provide the path to a previous installation of Greenplum Database,
or press ENTER to skip this step. e.g. /usr/local/greenplum-db-4.1.1.3
This installation step will migrate any Greenplum Database extensions from the
provided path to the version currently being installed. This step is optional
and can be run later with:
gppkg --migrate <path_to_old_gphome> /usr/local/greenplum-db-4.2.8.0
Extracting product to /usr/local/greenplum-db-4.2.8.0
Skipping migration of Greenplum Database extensions…
Installation complete.
Greenplum Database is installed in /usr/local/greenplum-db-4.2.8.0
Pivotal Greenplum documentation is available
for download at http://docs.gopivotal.com/gpdb
安装结果
[root@master greenplum-db-4.2.8.0]# pwd
/usr/local/greenplum-db-4.2.8.0
[root@master greenplum-db-4.2.8.0]# ls -ltr
total 328
drwxr-xr-x 3 510 510 4096 Jun 18 2014 share
drwxr-xr-x 2 510 510 4096 Jun 18 2014 demo
drwxr-xr-x 5 510 510 4096 Jun 18 2014 docs
drwxr-xr-x 7 510 510 4096 Jun 18 2014 lib
drwxr-xr-x 6 510 510 4096 Jun 18 2014 include
drwxr-xr-x 3 510 510 4096 Jun 18 2014 ext
drwxr-xr-x 2 510 510 4096 Jun 18 2014 etc
drwxr-xr-x 2 510 510 4096 Jun 18 2014 sbin
-rw-rw-r-- 1 510 510 193083 Jun 18 2014 LICENSE.thirdparty
-rw-rw-r-- 1 510 510 43025 Jun 18 2014 GPDB-LICENSE.txt
drwxr-xr-x 3 510 510 4096 Jun 18 2014 bin
-rw-r–r-- 1 root root 676 Mar 20 19:41 greenplum_path.sh
文件说明:
greenplum_path.sh:Greenplum数据库环境变量文件
GPDB-LICENSE.txt:Greenplum许可协议
bin:管理工具、客户端程序及服务程序
demo:示例程序
docs:帮助文档
etc:OpenSSL的配置示例
ext:一些GP工具使用的捆绑程序
inlcude:C头文件
lib:库文件
sbin:支持的或者内部的脚本和程序
share:共享文件
在所有主机上初始化配置Greenplum
获取环境参数:
source/usr/local/greenplum-db/greenplum_path.sh
GPHOME=/usr/local/greenplum-db-4.2.8.0
Replace with symlink path if it is present and correct
if [ -h ${GPHOME}/…/greenplum-db ]; then
GPHOME_BY_SYMLINK=(cd ${GPHOME}/../greenplum-db/ && pwd -P)
if [ x"
G
P
H
O
M
E
B
Y
S
Y
M
L
I
N
K
"
=
x
"
{GPHOME_BY_SYMLINK}" = x"
GPHOMEBYSYMLINK"=x"{GPHOME}" ]; then
GPHOME=(cd ${GPHOME}/../greenplum-db/ && pwd -L)
/.
fi
unset GPHOME_BY_SYMLINK
fi
PATH=
G
P
H
O
M
E
/
b
i
n
:
GPHOME/bin:
GPHOME/bin:GPHOME/ext/python/bin:
P
A
T
H
L
D
L
I
B
R
A
R
Y
P
A
T
H
=
PATH LD_LIBRARY_PATH=
PATHLDLIBRARYPATH=GPHOME/lib:
G
P
H
O
M
E
/
e
x
t
/
p
y
t
h
o
n
/
l
i
b
:
GPHOME/ext/python/lib:
GPHOME/ext/python/lib:LD_LIBRARY_PATH
PYTHONPATH=
G
P
H
O
M
E
/
l
i
b
/
p
y
t
h
o
n
P
Y
T
H
O
N
H
O
M
E
=
GPHOME/lib/python PYTHONHOME=
GPHOME/lib/pythonPYTHONHOME=GPHOME/ext/python
OPENSSL_CONF=$GPHOME/etc/openssl.cnf
export GPHOME
export PATH
export LD_LIBRARY_PATH
export PYTHONPATH
export PYTHONHOME
export OPENSSL_CONF
创建主机文件all_hosts
文件内容:
master
slave1
slave2
env
[root@master greenplum-db-4.2.8.0]# env
HOSTNAME=master
TERM=vt100
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.80.120 49852 22
GPHOME=/usr/local/greenplum-db/.
OLDPWD=/root/installer
SSH_TTY=/dev/pts/2
USER=root
LD_LIBRARY_PATH=/usr/local/greenplum-db/./lib:/usr/local/greenplum-db/./ext/python/lib:
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:.cmd=01;32:.exe=01;32:.com=01;32:.btm=01;32:.bat=01;32:.sh=01;32:.csh=01;32:.tar=01;31:.tgz=01;31:.arj=01;31:.taz=01;31:.lzh=01;31:.zip=01;31:.z=01;31:.Z=01;31:.gz=01;31:.bz2=01;31:.bz=01;31:.tz=01;31:.rpm=01;31:.cpio=01;31:.jpg=01;35:.gif=01;35:.bmp=01;35:.xbm=01;35:.xpm=01;35:.png=01;35:.tif=01;35:
MAIL=/var/spool/mail/root
PATH=/usr/local/greenplum-db/./bin:/usr/local/greenplum-db/./ext/python/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
INPUTRC=/etc/inputrc
PWD=/usr/local/greenplum-db-4.2.8.0
LANG=en_US.UTF-8
PYTHONHOME=/usr/local/greenplum-db/./ext/python
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SHLVL=1
HOME=/root
OPENSSL_CONF=/usr/local/greenplum-db/./etc/openssl.cnf
PYTHONPATH=/usr/local/greenplum-db/./lib/python
LOGNAME=root
CVS_RSH=ssh
SSH_CONNECTION=192.168.80.120 49852 192.168.80.200 22
LESSOPEN=|/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
_=/bin/env
[root@master greenplum-db-4.2.8.0]# ssh slave1
The authenticity of host ‘slave1 (192.168.80.201)’ can’t be established.
RSA key fingerprint is 04:6d:9c:fc:68:e3:9a:24:dc:11:ff:25:14:9e:d1:5b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added ‘slave1,192.168.80.201’ (RSA) to the list of known hosts.
root@slave1’s password:
Last login: Fri Mar 20 17:49:51 2015 from 192.168.80.120
运行gpseginstall工具
gpseginstall -f all_hosts-ugpadmin -p gpadmin
[root@master greenplum-db]# gpseginstall -f all_hosts -u gpadmin -p gpadmin
20150320:19:58:32:003371 gpseginstall:master:root-[INFO]:-Installation Info:
link_name greenplum-db
binary_path /usr/local/greenplum-db-4.2.8.0
binary_dir_location /usr/local
binary_dir_name greenplum-db-4.2.8.0
20150320:19:58:32:003371 gpseginstall:master:root-[INFO]:-check cluster password access
*** Enter password for slave1:
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-de-duplicate hostnames
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-master hostname: master
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-check for user gpadmin on cluster
20150320:19:58:38:003371 gpseginstall:master:root-[INFO]:-add user gpadmin on master
20150320:19:58:41:003371 gpseginstall:master:root-[INFO]:-add user gpadmin on cluster
20150320:19:58:42:003371 gpseginstall:master:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db
20150320:19:58:43:003371 gpseginstall:master:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db-4.2.8.0
20150320:19:58:43:003371 gpseginstall:master:root-[INFO]:-rm -f /usr/local/greenplum-db-4.2.8.0.tar; rm -f /usr/local/greenplum-db-4.2.8.0.tar.gz
20150320:19:58:43:003371 gpseginstall:master:root-[INFO]:-cd /usr/local; tar cf greenplum-db-4.2.8.0.tar greenplum-db-4.2.8.0
20150320:19:58:50:003371 gpseginstall:master:root-[INFO]:-gzip /usr/local/greenplum-db-4.2.8.0.tar
20150320:19:59:04:003371 gpseginstall:master:root-[INFO]:-remote command: mkdir -p /usr/local
20150320:19:59:04:003371 gpseginstall:master:root-[INFO]:-remote command: rm -rf /usr/local/greenplum-db-4.2.8.0
20150320:19:59:04:003371 gpseginstall:master:root-[INFO]:-scp software to remote location
20150320:19:59:34:003371 gpseginstall:master:root-[INFO]:-remote command: gzip -f -d /usr/local/greenplum-db-4.2.8.0.tar.gz
20150320:19:59:39:003371 gpseginstall:master:root-[INFO]:-md5 check on remote location
20150320:19:59:41:003371 gpseginstall:master:root-[INFO]:-remote command: cd /usr/local; tar xf greenplum-db-4.2.8.0.tar
20150320:19:59:59:003371 gpseginstall:master:root-[INFO]:-remote command: rm -f /usr/local/greenplum-db-4.2.8.0.tar
20150320:20:00:01:003371 gpseginstall:master:root-[INFO]:-remote command: cd /usr/local; rm -f greenplum-db; ln -fs greenplum-db-4.2.8.0 greenplum-db
20150320:20:00:01:003371 gpseginstall:master:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /usr/local/greenplum-db
20150320:20:00:02:003371 gpseginstall:master:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /usr/local/greenplum-db-4.2.8.0
20150320:20:00:02:003371 gpseginstall:master:root-[INFO]:-rm -f /usr/local/greenplum-db-4.2.8.0.tar.gz
20150320:20:00:02:003371 gpseginstall:master:root-[INFO]:-Changing system passwords …
20150320:20:00:05:003371 gpseginstall:master:root-[INFO]:-exchange ssh keys for user root
20150320:20:00:16:003371 gpseginstall:master:root-[INFO]:-exchange ssh keys for user gpadmin
20150320:20:00:18:003371 gpseginstall:master:root-[INFO]:-/usr/local/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
20150320:20:00:20:003371 gpseginstall:master:root-[INFO]:-remote command: . /usr/local/greenplum-db/./greenplum_path.sh; /usr/local/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
20150320:20:00:21:003371 gpseginstall:master:root-[INFO]:-version string on master: gpssh version 4.2.8.0 build 1
20150320:20:00:21:003371 gpseginstall:master:root-[INFO]:-remote command: . /usr/local/greenplum-db/./greenplum_path.sh; /usr/local/greenplum-db/./bin/gpssh --version
20150320:20:00:21:003371 gpseginstall:master:root-[INFO]:-remote command: . /usr/local/greenplum-db-4.2.8.0/greenplum_path.sh; /usr/local/greenplum-db-4.2.8.0/bin/gpssh --version
20150320:20:00:27:003371 gpseginstall:master:root-[INFO]:-SUCCESS – Requested commands completed
在所有主机上初始化配置Greenplum
切换到gpamdin用户并获取环境变量
$ su - gpadmin
# source/usr/local/greenplum-db/greenplum_path.sh
使用gpssh工具来测试无密码登录所有主机
$ gpssh -f host_list -e ls -l$GPHOME
[gpadmin@master greenplum-db]$ gpssh -f all_hosts -e ls -l $GPHOME
[master] ls -l /usr/local/greenplum-db/.
[master] total 336
…
[slave1] ls -l /usr/local/greenplum-db/.
[slave1] total 336
…
[slave2] ls -l /usr/local/greenplum-db/.
[slave2] total 336
…
将"./usr/local/greenplum-db-4.2.2.4/greenplum_path.sh"添加到.bashrc文件最后并传送到子服务器
[gpadmin@master ~]$ cat .bashrc
.bashrc
Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
. /usr/local/greenplum-db/greenplum_path.sh
User specific aliases and functions
[gpadmin@master ~]$ env
HOSTNAME=master
SHELL=/bin/bash
TERM=vt100
HISTSIZE=1000
GPHOME=/usr/local/greenplum-db/.
USER=gpadmin
LD_LIBRARY_PATH=/usr/local/greenplum-db/./lib:/usr/local/greenplum-db/./ext/python/lib:
LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:.cmd=01;32:.exe=01;32:.com=01;32:.btm=01;32:.bat=01;32:.sh=01;32:.csh=01;32:.tar=01;31:.tgz=01;31:.arj=01;31:.taz=01;31:.lzh=01;31:.zip=01;31:.z=01;31:.Z=01;31:.gz=01;31:.bz2=01;31:.bz=01;31:.tz=01;31:.rpm=01;31:.cpio=01;31:.jpg=01;35:.gif=01;35:.bmp=01;35:.xbm=01;35:.xpm=01;35:.png=01;35:.tif=01;35:
MAIL=/var/spool/mail/gpadmin
PATH=/usr/local/greenplum-db/./bin:/usr/local/greenplum-db/./ext/python/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/gpadmin/bin
INPUTRC=/etc/inputrc
PWD=/home/gpadmin
LANG=en_US.UTF-8
PYTHONHOME=/usr/local/greenplum-db/./ext/python
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
SHLVL=1
HOME=/home/gpadmin
OPENSSL_CONF=/usr/local/greenplum-db/./etc/openssl.cnf
PYTHONPATH=/usr/local/greenplum-db/./lib/python
LOGNAME=gpadmin
CVS_RSH=ssh
LESSOPEN=|/usr/bin/lesspipe.sh %s
G_BROKEN_FILENAMES=1
_=/bin/env
[gpadmin@master ~]$ scp .bashrc slave1:pwd
.bashrc 100% 167 0.2KB/s 00:00
[gpadmin@master ~]$ scp .bashrc slave2:pwd
.bashrc 100% 167 0.2KB/s 00:00
验证操作系统设置
创建存储区域
a) 创建Master数据存储区域
# mkdir -p /data/master
b) 改变目录的所有权
# chown gpadmin /data/master
c) 创建一个包含所有segment主机的文件seg_hosts
[root@master ~]# mkdir /tmp/greenplum
[root@master ~]# cd /tmp/greenplum/
[root@master greenplum]# vi seg_hosts
slave1
slave2
“seg_hosts” [New] 2L, 14C written
[root@master greenplum]# cat seg_hosts
slave1
slave2
d) 使用gpssh工具在所有segment主机上创建主数据和镜像数据目录
[root@master greenplum]# . /usr/local/greenplum-db/greenplum_path.sh
[root@master greenplum]# gpssh -f seg_hosts -e ‘mkdir -p /data/primary’
[slave1] mkdir -p /data/primary
[slave2] mkdir -p /data/primary
[root@master greenplum]# gpssh -f seg_hosts -e ‘mkdir -p /data/mirror’
[slave1] mkdir -p /data/mirror
[slave2] mkdir -p /data/mirror
[root@master greenplum]# gpssh -f seg_hosts -e ‘chown gpadmin /data/primary’
[slave1] chown gpadmin /data/primary
[slave2] chown gpadmin /data/primary
[root@master greenplum]# gpssh -f seg_hosts -e ‘chown gpadmin /data/mirror’
[slave1] chown gpadmin /data/mirror
[slave2] chown gpadmin /data/mirror
同步系统时间
a) 在Master主机上编辑/etc/ntp.conf来设置如下内容:
[root@master greenplum]# cat /etc/ntp.conf
Permit time synchronization with our time source, but do not
permit the source to query or modify the service on this system.
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
Permit all access over the loopback interface. This could
be tightened as well, but to do so would effect some of
the administrative functions.
restrict 127.0.0.1
restrict -6 ::1
Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
Use public servers from the pool.ntp.org project.
Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 0.rhel.pool.ntp.org
server 1.rhel.pool.ntp.org
server 2.rhel.pool.ntp.org
#broadcast 192.168.1.255 key 42 # broadcast server
#broadcastclient # broadcast client
#broadcast 224.0.1.1 key 42 # multicast server
#multicastclient 224.0.1.1 # multicast client
#manycastserver 239.255.254.254 # manycast server
#manycastclient 239.255.254.254 key 42 # manycast client
Undisciplined Local Clock. This is a fake driver intended for backup
and when no outside source of synchronized time is available.
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
Drift file. Put this in a directory which the daemon can write to.
No symbolic links allowed, either, since the daemon updates the file
by creating a temporary in the same directory and then rename()'ing
it to the file.
driftfile /var/lib/ntp/drift
Key file containing the keys and key identifiers used when operating
with symmetric key cryptography.
keys /etc/ntp/keys
Specify the key identifiers which are trusted.
#trustedkey 4 8 42
Specify the key identifier to use with the ntpdc utility.
#requestkey 8
Specify the key identifier to use with the ntpq utility.
#controlkey 8
b) 在Segment主机上编辑/etc/ntp.conf
Undisciplined Local Clock. This is a fake driver intended for backup
and when no outside source of synchronized time is available.
#server 127.127.1.0 # local clock
server master # local clock
fudge 127.127.1.0 stratum 10
c) 在Master主机上,通过NTP守护进程同步系统时钟
[root@master greenplum]# gpssh -f all_hosts -v -e ‘ntpd’
[Reset …]
[INFO] login master
[INFO] login slave1
[INFO] login slave2
[master] ntpd
[slave1] ntpd
[slave2] ntpd
[INFO] completed successfully
[Cleanup…]
验证操作系统设置
[root@master greenplum]# gpcheck -f all_hosts -m master
…
20150320:21:20:17:004588 gpcheck:master:root-[INFO]:-gpcheck completing…
[root@master greenplum]# gpcheck -f all_hosts -m master
…
20150320:21:23:52:004727 gpcheck:master:root-[INFO]:-gpcheck completing…
[root@master greenplum]# echo deadline > /sys/block/sr0/queue/scheduler
[root@master greenplum]# gpcheck -f all_hosts -m master
…
20150320:21:27:50:004866 gpcheck:master:root-[INFO]:-gpcheck completing…
创建Greenplum数据库配置文件
初始化Greenplum数据库系统
a) 以gpadmin用户登录
# su - gpadmin
b) 从模板中拷贝一份gpinitsystem_config文件
[gpadmin@master ~]$ cp
G
P
H
O
M
E
/
d
o
c
s
/
c
l
i
h
e
l
p
/
g
p
c
o
n
f
i
g
s
/
g
p
i
n
i
t
s
y
s
t
e
m
c
o
n
f
i
g
/
h
o
m
e
/
g
p
a
d
m
i
n
/
[
g
p
a
d
m
i
n
@
m
a
s
t
e
r
]
GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/ [gpadmin@master ~]
GPHOME/docs/clihelp/gpconfigs/gpinitsystemconfig/home/gpadmin/[gpadmin@master ] chmod 775 /home/gpadmin/gpinitsystem_config
c) 设置所有必须的参数以及可选参数
[gpadmin@master ~]$ vi gpinitsystem_config
FILE NAME: gpinitsystem_config
Configuration file needed by the gpinitsystem
################################################
REQUIRED PARAMETERS
################################################
Name of this Greenplum system enclosed in quotes.
ARRAY_NAME=“EMC Greenplum DW”
Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg
Base number by which primary segment port numbers
are calculated.
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/data/primary )
OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=master
File system location where the master data directory
will be created.
MASTER_DIRECTORY=/data/master
Port number for the master instance.
MASTER_PORT=5432
Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh
Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8
Default server-side character set encoding.
ENCODING=UNICODE
################################################
OPTIONAL MIRROR PARAMETERS
################################################
Base number by which mirror segment port numbers
are calculated.
MIRROR_PORT_BASE=50000
Base number by which primary file replication port
numbers are calculated.
REPLICATION_PORT_BASE=41000
Base number by which mirror file replication port
numbers are calculated.
MIRROR_REPLICATION_PORT_BASE=51000
File system location(s) where mirror segment data directories
will be created. The number of mirror locations must equal the
number of primary locations as specified in the
DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror)
################################################
OTHER OPTIONAL PARAMETERS
################################################
Create a database of this name after initialization.
#DATABASE_NAME=name_of_database
Specify the location of the host address file here instead of
with the the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
初始化数据库
a) 运行初始化工具
[gpadmin@master ~]$ cp /tmp/greenplum/seg_hosts .
[gpadmin@master ~]$ ls
gpinitsystem_config seg_hosts
[gpadmin@master ~]$ gpinitsystem -c gpinitsystem_config -h seg_hosts
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[FATAL]:-Errors generated from parallel processes
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-Dumped contents of status file to the log file
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-Building composite backout file
20150320:21:53:15:gpinitsystem:master:gpadmin-[FATAL]:-Failures detected, see log file /home/gpadmin/gpAdminLogs/gpinitsystem_20150320.log for more detail Script Exiting!
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[WARN]:-Script has left Greenplum Database in an incomplete state
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[WARN]:-Run command /bin/bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_20150320_215158 to remove these changes
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20150320:21:53:15:005103 gpinitsystem:master:gpadmin-[INFO]:-End Function BACKOUT_COMMAND
经过重新同步各服务器之间的时间后初始化正确:
[gpadmin@master ~]$ gpinitsystem -c gpinitsystem_config -h seg_hosts
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-----------------------------------------------------
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-----------------------------------------------------
20150321:14:23:03:032514 gpstart:master:gpadmin-[INFO]:-Starting Master instance master directory /data/master/gpseg-1
20150321:14:23:04:032514 gpstart:master:gpadmin-[INFO]:-Command pg_ctl reports Master master instance active
当出现以下信息时初始化成功
[gpadmin@master ~]$ ps -ef|grep post
gpadmin 1444 15784 0 14:27 pts/1 00:00:00 grep post
gpadmin 32570 1 0 14:23 ? 00:00:00 /usr/local/greenplum-db-4.2.8.0/bin/postgres -D /data/master/gpseg-1 -p 5432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E
gpadmin 32571 32570 0 14:23 ? 00:00:00 postgres: port 5432, master logger process
gpadmin 32574 32570 0 14:23 ? 00:00:00 postgres: port 5432, stats collector process
gpadmin 32575 32570 0 14:23 ? 00:00:00 postgres: port 5432, writer process
gpadmin 32576 32570 0 14:23 ? 00:00:00 postgres: port 5432, checkpoint process
gpadmin 32577 32570 0 14:23 ? 00:00:00 postgres: port 5432, seqserver process
gpadmin 32578 32570 0 14:23 ? 00:00:00 postgres: port 5432, WAL Send Server process
gpadmin 32579 32570 0 14:23 ? 00:00:00 postgres: port 5432, ftsprobe process
gpadmin 32580 32570 0 14:23 ? 00:00:00 postgres: port 5432, sweeper process
b) 设置环境变量
添加“export MASTER_DATA_DIRECTORY=/data/master/gpseg-1”到~/.bashrc文件尾,并同步到其他节点。
[gpadmin@master ~]$ cat .bashrc
.bashrc
Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
. /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
User specific aliases and functions
[gpadmin@master ~]$ scp .bashrc slave1:pwd
.bashrc 100% 217 0.2KB/s 00:00
[gpadmin@master ~]$ scp .bashrc slave2:pwd
.bashrc 100% 217 0.2KB/s 00:00
启动和停止数据库
a) 启动数据库 $ gpstart
[gpadmin@master ~]$ gpstart
20150321:14:49:18:007507 gpstart:master:gpadmin-[INFO]:-Starting gpstart with args:
…
20150321:14:49:26:007507 gpstart:master:gpadmin-[INFO]:-Database successfully started
[gpadmin@master ~]$ psql -d template1
psql (8.2.15)
Type “help” for help.
template1=#
注意:需要检查防火墙是否关闭否则,会有错误。
b) 关闭数据库
$ gpstop
[gpadmin@master ~]$ gpstop
20150321:14:38:08:004265 gpstop:master:gpadmin-[INFO]:-Starting gpstop with args:
Continue with Greenplum instance shutdown Yy|Nn (default=N):
y
20150321:14:38:14:004265 gpstop:master:gpadmin-[INFO]:-Successfully shutdown 2 of 2 segment instances
20150321:14:38:14:004265 gpstop:master:gpadmin-[INFO]:-Database successfully shutdown with no errors reported