1. Download the installation package
Download it here: https://download.csdn.net/download/fm0517/10577703
2. Node types and descriptions
【GTM】
The global transaction manager, which guarantees data consistency across the cluster. It communicates constantly with the Coordinator and Datanode nodes and is the core of the whole cluster. Only one GTM exists, optionally accompanied by a GTM Standby node that backs it up in real time. If the GTM fails, the entire cluster immediately becomes inaccessible; you can then fail over to the GTM Standby. If you deploy a GTM Standby, you should also deploy GTM Proxy, typically on the same servers as the Coordinators and Datanodes. GTM Proxy proxies Coordinator and Datanode access to the GTM, which reduces the GTM's load. Its other important role is assisting GTM failover: when the GTM fails and the GTM Standby becomes the new GTM, the Coordinators and Datanodes do not need to be re-pointed at a new GTM address; only the GTM Proxy has to reconnect to the new GTM.
【Coordinator】
The node that receives data-access requests; it is essentially a set of PostgreSQL backend processes. On receiving a query, the Coordinator builds the execution plan and dispatches the query to the data nodes that hold the relevant data. On writes, it likewise routes the data to the appropriate nodes according to the table's distribution strategy. In that sense, the Coordinator holds the cluster's global data-location knowledge. Coordinator nodes can be scaled out arbitrarily; apart from their addresses they are fully equivalent, and data updated through one node is immediately visible on any other. Each Coordinator can be paired with a standby node to avoid a single point of failure.
【Datanode】
The nodes that actually store and retrieve data; they receive requests from the Coordinators, execute the SQL, and also communicate with one another. In general, a single data node does not hold the complete data set, and data nodes do not serve clients directly. A table's data is laid out across the data nodes in one of two modes: replicated or distributed. In replicated mode, full copies of the table exist on each of the specified nodes; in distributed mode, the rows are spread over several data nodes according to a rule, and together those nodes hold one complete copy. The mode is chosen when the table is created, in the CREATE TABLE statement, with the following syntax:
CREATE TABLE table_name(...)
DISTRIBUTE BY
HASH(col)|MODULO(col)|ROUNDROBIN|REPLICATION
TO NODE(nodename1,nodename2...)
As the syntax shows, REPLICATION after DISTRIBUTE BY selects replicated mode; the other options select distributed mode. HASH distributes rows by the hash value of the specified column, MODULO by the modulo of the specified column, and ROUNDROBIN cycles through the nodes in turn. TO NODE restricts distribution to the listed data nodes; if it is omitted, all data nodes participate. If no distribution clause is given at all (a plain CREATE TABLE), Postgres-XL defaults to distributed mode across all data nodes.
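As an illustration, the hypothetical tables below (the table and column names are examples, not from the source) show one replicated and one hash-distributed table, using the data node names defined later in this tutorial's configuration:

```sql
-- Replicated: every listed node holds a full copy (suits small lookup tables).
CREATE TABLE country (code CHAR(2) PRIMARY KEY, name TEXT)
DISTRIBUTE BY REPLICATION
TO NODE (datanode1, datanode2);

-- Distributed: rows are spread across the nodes by the hash of id.
CREATE TABLE orders (id BIGINT, amount NUMERIC)
DISTRIBUTE BY HASH (id)
TO NODE (datanode1, datanode2);
```

These statements assume a running Postgres-XL cluster whose data nodes are named datanode1 and datanode2, as in step 14 below.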
3. Host planning
GTM: server0: 192.168.51.140
Coordinator: server1: 192.168.51.141
Datanode1 Master: server2: 192.168.51.142
Datanode2 Master: server3: 192.168.51.143
Datanode1 Slave: server4: 192.168.51.144
Datanode2 Slave: server5: 192.168.51.145
----- Datanode3 is used later to demonstrate adding and removing nodes dynamically -----
Datanode3 Master: server6: 192.168.51.146
【Steps 4 through 13 below must be executed on every node】
4. Edit the hosts file
# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.51.140 server0
192.168.51.141 server1
192.168.51.142 server2
192.168.51.143 server3
192.168.51.144 server4
192.168.51.145 server5
192.168.51.146 server6
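Since the seven entries follow one pattern (192.168.51.140 through .146 map to server0 through server6), they can be generated instead of typed; a minimal sketch that prints the block (append it to /etc/hosts as root):

```shell
# Build the server0..server6 host entries in one loop.
hosts_block=$(for i in 0 1 2 3 4 5 6; do
  printf '192.168.51.14%s server%s\n' "$i" "$i"
done)
echo "$hosts_block"   # as root: echo "$hosts_block" >> /etc/hosts
```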
5. Install dependency packages
openssh-clients, flex, bison, readline-devel, zlib-devel, openjade, docbook-style-dsssl, gcc
Check whether each package is already installed with rpm:
# rpm -qa |grep xxx
Install any missing packages with yum.
6. Remove the operating system's bundled PostgreSQL
# rpm -qa |grep postgresql
# rpm -qa |grep postgresql | xargs rpm -e --nodeps
# rpm -qa |grep postgresql
7. Create the postgres user
Create the group:
# groupadd postgres
Create the user:
# useradd -m -d /home/postgres postgres -g postgres
Set the password:
# passwd postgres
Enter a password, e.g. 12345678
Note: to delete the postgres user later, run as root:
# userdel -r postgres
8. Configure passwordless SSH login
As the root user:
# ssh-keygen
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server0
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server1
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server2
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server3
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server4
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server5
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server6
As the postgres user:
$ ssh-keygen
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server0
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server1
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server2
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server3
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server4
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server5
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server6
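The per-host ssh-copy-id calls can be scripted; the sketch below only prints the commands for the postgres user (remove the echo to execute them, and repeat analogously as root):

```shell
# Print one ssh-copy-id command per node; drop `echo` to actually run them.
cmds=$(for i in 0 1 2 3 4 5 6; do
  echo "ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server$i"
done)
echo "$cmds"
```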
If SSH still asks for a password afterwards, fix the permissions as follows:
chmod 755 /home/postgres
chmod 700 /home/postgres/.ssh
chmod 644 /home/postgres/.ssh/authorized_keys
9. Adjust kernel parameters
Add the kernel.sem parameter to /etc/sysctl.conf:
# vi /etc/sysctl.conf
kernel.sem = 50100 128256000 50100 2560
Apply the setting:
# sysctl -p
Verify that it took effect:
# ipcs -ls
------ Semaphore Limits --------
max number of arrays = 2560
max semaphores per array = 50100
max semaphores system wide = 128256000
max ops per semop call = 50100
semaphore max value = 32767
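The four numbers in kernel.sem correspond, in order, to the standard System V limits SEMMSL, SEMMNS, SEMOPM, and SEMMNI, which is why the same values reappear reordered in the ipcs output above; labeled explicitly:

```shell
# kernel.sem = SEMMSL SEMMNS SEMOPM SEMMNI, in that fixed order.
set -- 50100 128256000 50100 2560
echo "max semaphores per array (SEMMSL): $1"
echo "max semaphores system wide (SEMMNS): $2"
echo "max ops per semop call (SEMOPM): $3"
echo "max number of arrays (SEMMNI): $4"
```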
10. Configure the firewall
Either disable the firewall:
# systemctl stop firewalld
# systemctl disable firewalld
or open the required ports in the firewall.
Use the second approach when the firewall cannot be disabled.
The ports to open follow from the host-planning section; taking TCP/5432 as an example, open it permanently with:
# firewall-cmd --zone=public --add-port=5432/tcp --permanent
# firewall-cmd --reload
Note: check the firewall status: systemctl status firewalld
Note: start the firewall: systemctl start firewalld
Note: stop the firewall: systemctl stop firewalld
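If you keep the firewall on, 5432 is not the only port in play: based on the configuration in step 14, the GTM (6666), the coordinator (5432 and pooler 20004), the datanode masters (5433 and pooler 20005), and the datanode slaves (15433 and pooler 20015) all listen on their own ports. A sketch that prints the required commands (drop the echo to execute as root):

```shell
# One --add-port call per port used by this tutorial's cluster layout.
fw_cmds=$(for p in 6666 5432 20004 5433 20005 15433 20015; do
  echo "firewall-cmd --zone=public --add-port=$p/tcp --permanent"
done)
echo "$fw_cmds"
echo "firewall-cmd --reload"
```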
11. Disable SELinux
Check the SELinux status:
# /usr/sbin/sestatus -v
If the SELinux status field reads enabled, SELinux is on.
To disable SELinux permanently:
# vi /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled.
The change takes effect after a reboot.
12. Install the Postgres-XL software
Unpack:
# tar -jxvf postgres-xl-9.5r1.6.tar.bz2
# chown -R postgres:postgres postgres-xl-9.5r1.6
Switch user:
# su - postgres
Build and install:
$ cd postgres-xl-9.5r1.6
$ ./configure --prefix=/home/postgres/pgxl9.5
$ make
$ make install
Build and install the contrib extensions:
$ cd contrib
$ make
$ make install
13. Configure environment variables
Add the following to .bash_profile:
# su - postgres
$ vi .bash_profile
export PGHOME=/home/postgres/pgxl9.5
export PGUSER=postgres
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PATH=$PGHOME/bin:$PATH
Apply the environment variables (source the profile you just edited):
$ source ~/.bash_profile
Verify that they took effect:
$ pg_ctl --version
Output: pg_ctl (PostgreSQL) 9.5.8 (Postgres-XL 9.5r1.6)
14. Configure the Postgres-XL cluster
On the GTM node, as the postgres user:
$ pgxc_ctl --- the first run prints an error about a missing configuration file; ignore it
PGXC prepare --- generates a configuration file template
PGXC exit --- exits the pgxc_ctl interactive shell
Afterwards a pgxc_ctl directory appears under the postgres user's home directory; edit the pgxc_ctl.conf file inside it:
$ vi pgxc_ctl.conf
Change it to:
#!/usr/bin/env bash
# pgxcInstallDir variable is needed if you invoke "deploy" command from pgxc_ctl utility.
# If you don't, you don't need this variable.
pgxcInstallDir=$PGHOME
pgxlDATA=$PGHOME/data
#---- OVERALL -----------------------------------------------------------------------------
#
pgxcOwner=postgres # owner of the Postgres-XC database cluster. Here, we use this
# both as linux user and database user. This must be
# the super user of each coordinator and datanode.
pgxcUser=$pgxcOwner # OS user of Postgres-XC owner
tmpDir=/tmp # temporary dir used in XC servers
localTmpDir=$tmpDir # temporary dir used here locally
configBackup=n # If you want config file backup, specify y to this value.
configBackupHost=pgxc-linker # host to backup config file
configBackupDir=$HOME/pgxc # Backup directory
configBackupFile=pgxc_ctl.bak # Backup file name --> Need to synchronize when original changed.
#---- GTM ------------------------------------------------------------------------------------
#---- GTM Master -----------------------------------------------
#---- Overall ----
gtmName=gtm
gtmMasterServer=server0
gtmMasterPort=6666
gtmMasterDir=$pgxlDATA/nodes/gtm
#---- Configuration ---
gtmExtraConfig=none # Will be added to gtm.conf for both Master and Slave (done at initialization only)
gtmMasterSpecificExtraConfig=none # Will be added to Master's gtm.conf (done at initialization only)
#---- Coordinators ----------------------------------------------------------------------------------------------------
#---- shortcuts ----------
coordMasterDir=$pgxlDATA/nodes/coord
coordSlaveDir=$pgxlDATA/nodes/coord_slave
coordArchLogDir=$pgxlDATA/nodes/coord_archlog
#---- Overall ------------
coordNames=(coord1) # Master and slave use the same name
coordPorts=(5432) # Master ports
poolerPorts=(20004) # Master pooler ports
coordPgHbaEntries=(0.0.0.0/0) # Assumes that all the coordinator (master/slave) accepts
# the same connection
# This entry allows only $pgxcOwner to connect.
# If you'd like to setup another connection, you should
# supply these entries through files specified below.
#---- Master -------------
coordMasterServers=server1 # none means this master is not available
coordMasterDirs=$coordMasterDir
coordMaxWALsernder=10 # max_wal_senders: needed to configure slave. If zero value is specified,
# it is expected to supply this parameter explicitly by external files
# specified in the following. If you don't configure slaves, leave this value to zero.
coordMaxWALSenders=$coordMaxWALsernder
# max_wal_senders configuration for each coordinator.
#---- Configuration files---
coordExtraConfig=coordExtraConfig # Extra configuration file for coordinators.
# This file will be added to all the coordinators'
# postgresql.conf
# Please note that the following sets up minimum parameters which you may want to change.
# You can put your postgresql.conf lines here.
cat > $coordExtraConfig <<EOF
#================================================
# Added to all the coordinator postgresql.conf
# Original: $coordExtraConfig
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
listen_addresses = '*'
max_connections = 512
EOF
# Additional Configuration file for specific coordinator master.
# You can define each setting by similar means as above.
coordSpecificExtraConfig=(none none)
coordExtraPgHba=none # Extra entry for pg_hba.conf. This file will be added to all the coordinators' pg_hba.conf
coordSpecificExtraPgHba=(none none)
#---- Datanodes -------------------------------------------------------------------------------------------------------
#---- Shortcuts --------------
datanodeMasterDir=$pgxlDATA/nodes/dn_master
datanodeSlaveDir=$pgxlDATA/nodes/dn_slave
datanodeArchLogDir=$pgxlDATA/nodes/datanode_archlog
#---- Overall ---------------
#primaryDatanode=datanode1 # Primary Node.
# At present, xc has a problem issuing ALTER NODE against the primary node. Until it is fixed, the test will be done
# without this feature.
primaryDatanode=datanode1 # Primary Node.
datanodeNames=(datanode1 datanode2)
datanodePorts=(5433 5433) # Master ports
datanodePoolerPorts=(20005 20005) # Master pooler ports
datanodePgHbaEntries=(0.0.0.0/0) # Assumes that all the datanodes (master/slave) accept
# the same connection
# This list sets up pg_hba.conf for $pgxcOwner user.
# If you'd like to setup other entries, supply them
# through extra configuration files specified below.
# Note: The above parameter is extracted as "host all all 0.0.0.0/0 trust". If you don't want
# such setups, specify the value () to this variable and supply what you want using datanodeExtraPgHba
# and/or datanodeSpecificExtraPgHba variables.
#datanodePgHbaEntries=(::1/128) # Same as above but for IPv6 addresses
#---- Master ----------------
datanodeMasterServers=(server2 server3) # none means this master is not available.
# This means that there should be the master but is down.
# The cluster is not operational until the master is
# recovered and ready to run.
datanodeMasterDirs=($datanodeMasterDir/dn1 $datanodeMasterDir/dn2)
datanodeMaxWalSender=10 # max_wal_senders: needed to configure slave. If zero value is
# specified, it is expected this parameter is explicitly supplied
# by external configuration files.
# If you don't configure slaves, leave this value zero.
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)
# max_wal_senders configuration for each datanode
#---- Slave -----------------
datanodeSlave=y # Specify y if you configure at least one datanode slave. Otherwise, the following
# configuration parameters will be set to empty values.
# If no effective server names are found (that is, every servers are specified as none),
# then datanodeSlave value will be set to n and all the following values will be set to
# empty values.
datanodeSlaveServers=(server4 server5) # value none means this slave is not available
datanodeSlavePorts=(15433 15433) # value none means this slave is not available
datanodeSlavePoolerPorts=(20015 20015) # value none means this slave is not available
datanodeSlaveSync=y # If datanode slave is connected in synchronized mode
datanodeSlaveDirs=($datanodeSlaveDir $datanodeSlaveDir)
datanodeArchLogDirs=( $datanodeArchLogDir $datanodeArchLogDir)
# ---- Configuration files ---
# You may supply your bash script to setup extra config lines and extra pg_hba.conf entries here.
# These files will go to corresponding files for the master.
# Or you may supply these files manually.
datanodeExtraConfig=none # Extra configuration file for datanodes. This file will be added to all the
# datanodes' postgresql.conf
datanodeSpecificExtraConfig=(none none)
datanodeExtraPgHba=none # Extra entry for pg_hba.conf. This file will be added to all the datanodes' postgresql.conf
datanodeSpecificExtraPgHba=(none none)
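Because pgxc_ctl.conf starts with #!/usr/bin/env bash and is read as a shell script, a plain bash syntax check catches unbalanced quotes or array parentheses before pgxc_ctl does. A minimal sketch (the temporary file here stands in for ~/pgxc_ctl/pgxc_ctl.conf):

```shell
# Syntax-check a pgxc_ctl.conf-style bash file without executing it.
conf=$(mktemp)
printf 'coordNames=(coord1)\ndatanodeNames=(datanode1 datanode2)\n' > "$conf"
if bash -n "$conf"; then check=ok; else check=failed; fi
echo "syntax check: $check"
rm -f "$conf"
```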
15. Initialize the cluster
On the GTM node, as the postgres user:
Initialize the cluster:
$ pgxc_ctl -c ~/pgxc_ctl/pgxc_ctl.conf init all
Start the cluster:
$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf start all
Stop the cluster:
$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf stop all
16. Remove the cluster
On every node, as the postgres user (note that nodes is a directory, so -r is required):
$ rm -rf /home/postgres/pgxl9.5/data/nodes