Greenplum 6 Installation Guide (CentOS 7.X)

This article provides a detailed guide to installing Greenplum 6 on CentOS 7, covering the key stages of pre-installation system configuration, cluster installation and deployment, and cluster initialization, so that you can bring up the distributed database smoothly.

I. Basic Concepts

Greenplum is a relational database designed for data warehouse workloads. Thanks to its well-designed architecture, it has clear advantages in data storage, high concurrency, high availability, linear scalability, response speed, ease of use, and cost-effectiveness. Greenplum is a distributed database based on PostgreSQL that uses a shared-nothing architecture: each node controls its own host, operating system, memory, and storage, and nothing is shared between nodes.
Essentially, Greenplum is a relational database cluster: a logical database composed of several independent database instances. Unlike Oracle RAC, this kind of cluster uses an MPP (Massively Parallel Processing) architecture. In contrast to single-node relational databases such as MySQL and Oracle, Greenplum is best understood as a distributed relational database.
For more information about Greenplum, visit https://greenplum.org/

II. Installation Preparation

1. Download the offline installation package

https://github.com/greenplum-db/gpdb/releases/tag/6.1.0


2. Upload the package to the server and place it under /home/softs (or any directory you choose), as in the sketch below.
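The package can be copied to each host over SSH. A minimal sketch (the rpm filename is illustrative; substitute the asset name you actually downloaded from the release page):

```bash
# Create the target directory on the master and copy the installer there
ssh root@gp-mdw "mkdir -p /home/softs"
scp greenplum-db-6.1.0-rhel7-x86_64.rpm root@gp-mdw:/home/softs/
```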

3. Disable the firewall

- Disable the firewall (all machines)
  iptables (CentOS 6.x)
  Stop it: service iptables stop
  Disable it permanently: chkconfig iptables off
- Check firewalld (CentOS 7.x); see the verification sketch after this list
  Stop it: systemctl stop firewalld
  Disable it permanently: systemctl disable firewalld
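A quick check on CentOS 7 that firewalld is stopped now and will stay off after a reboot (run as root on every machine):

```bash
# firewalld should be inactive now and disabled at boot
systemctl status firewalld --no-pager   # expect "inactive (dead)"
systemctl is-enabled firewalld          # expect "disabled"
```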

4. Disable SELinux (all machines)

[root@mdw ~]# vi /etc/selinux/config
Make sure the file sets SELINUX=disabled
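Editing the config file only takes effect after a reboot. To avoid rebooting right away, you can also switch SELinux off for the current session; a minimal sketch (run as root on every machine):

```bash
# Persist the setting across reboots
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
# Apply for the current session (permissive mode)
setenforce 0
# Verify: prints "Permissive" now, "Disabled" after the next reboot
getenforce
```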

5. Configure /etc/hosts (all machines)

This prepares for the GP nodes to communicate with each other later.
Change the hostname of each host. The generally recommended naming convention is: projectname_gp_role
      Master: dis_gp_mdw
      Standby Master: dis_gp_smdw
      Segment hosts: dis_gp_sdw1, dis_gp_sdw2, and so on
      If the Standby is also hosted on a Segment host, name it dis_gp_sdw3_smdw

[root@mdw ~]# vi /etc/hosts

Add the IP and hostname of every machine, and make sure /etc/hosts on all machines contains the following entries:
192.168.xxx.xxx gp-mdw
192.168.xxx.xxx gp-sdw1
192.168.xxx.xxx gp-sdw2
192.168.xxx.xxx gp-sdw3-mdw2
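Once the entries are in place, confirm from each machine that every hostname resolves and is reachable; a minimal sketch using the example hostnames above:

```bash
# Ping each node once by hostname to confirm /etc/hosts resolution
for host in gp-mdw gp-sdw1 gp-sdw2 gp-sdw3-mdw2; do
    ping -c 1 "$host" > /dev/null && echo "$host OK" || echo "$host FAILED"
done
```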

6. Change the hostname

CentOS 7.x:    vi /etc/hostname
CentOS 6.x:    vi /etc/sysconfig/network
Reboot the machine after making the change.
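On CentOS 7, hostnamectl writes /etc/hostname and renames the running system in one step, so editing the file by hand is optional; a minimal sketch (gp-mdw is the example master name from the /etc/hosts step):

```bash
# Set the hostname persistently (writes /etc/hostname)
hostnamectl set-hostname gp-mdw
# Verify the static hostname
hostnamectl status
```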

7. Configure sysctl.conf (all machines)

vi /etc/sysctl.conf
kernel.shmall = 197951838    # echo $(expr $(getconf _PHYS_PAGES) / 2)
kernel.shmmax = 810810728448    # echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 75
# vm.overcommit_ratio = (RAM - 0.026 * gp_vmem_rq) / RAM
# gp_vmem_rq = ((SWAP + RAM) - (7.5GB + 0.05 * RAM)) / 1.7
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
# vm.min_free_kbytes = 487119    # awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo

# For hosts with more than 64GB of RAM, add the following 4 lines:
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736    # 1.5GB
vm.dirty_bytes = 4294967296    # 4GB
# For hosts with less than 64GB of RAM, remove dirty_background_bytes and dirty_bytes, and add the following 2 lines instead:
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10

# vm.min_free_kbytes may be set on systems with more than 64GB of RAM; in general this parameter is rarely set.
# vm.min_free_kbytes ensures that the network and storage drivers can obtain PF_MEMALLOC allocations. This is especially important on systems with large amounts of memory, and the default is usually too low on ordinary systems. You can compute the value with awk; the recommendation is 3% of physical memory:
# awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo >> /etc/sysctl.conf
# Do not set vm.min_free_kbytes to more than 5% of system memory, as this may cause out-of-memory conditions.
A reference for these and related kernel parameters:
file-max: the system-wide maximum number of open file handles; it directly limits the maximum number of concurrent connections.
tcp_tw_reuse: when set to 1, sockets in the TIME-WAIT state may be reused for new TCP connections. This matters for servers, which always accumulate large numbers of TIME-WAIT connections.
tcp_keepalive_time: how often TCP sends keepalive messages when keepalive is enabled. The default is 7200 seconds, meaning the kernel only probes a TCP connection after it has been idle for 2 hours. A smaller value cleans up dead connections faster.
tcp_fin_timeout: the maximum time a socket stays in the FIN-WAIT-2 state after the server actively closes the connection.
tcp_max_tw_buckets: the maximum number of TIME_WAIT sockets the system allows. Beyond this number, TIME_WAIT sockets are destroyed immediately and a warning is printed. The default is 180000; too many TIME_WAIT sockets slow a web server down.
tcp_max_syn_backlog: the maximum queue length for SYN requests during the TCP three-way handshake. The default is 1024; a larger value keeps Linux from dropping client connection requests when a busy server (e.g. Nginx) cannot accept new connections fast enough.
ip_local_port_range: the range of local port numbers used for UDP and TCP connections.
net.ipv4.tcp_rmem: the minimum, default, and maximum size of the TCP receive buffer (used for the TCP receive sliding window).
net.ipv4.tcp_wmem: the minimum, default, and maximum size of the TCP send buffer (used for the TCP send sliding window).
netdev_max_backlog: when the NIC receives packets faster than the kernel can process them, a queue holds the excess packets; this parameter sets the maximum length of that queue.
rmem_default: the default size of the kernel socket receive buffer.
wmem_default: the default size of the kernel socket send buffer.
rmem_max: the maximum size of the kernel socket receive buffer.
wmem_max: the maximum size of the kernel socket send buffer.
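kernel.shmall and kernel.shmmax are machine-specific; the commands in the comments above derive them from the host's page count and page size. A minimal sketch of computing the values and then loading the edited file without a reboot (run as root):

```bash
# Print the Greenplum-recommended shared memory settings for this host
echo "kernel.shmall = $(expr $(getconf _PHYS_PAGES) / 2)"
echo "kernel.shmmax = $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))"
# After editing /etc/sysctl.conf, apply the new values immediately
sysctl -p
```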

8. Configure resource limit parameters (all machines)

Add the following parameters to /etc/security/limits.conf:

vi /etc/security/limits.conf
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
The asterisk "*" means all users.
nproc is the maximum number of processes.
nofile is the maximum number of open files.

RHEL / CentOS 6.X:
Modify the nproc value in /etc/security/limits.d/90-nproc.conf:
[root@mdw ~]# vi /etc/security/limits.d/90-nproc.conf
Make sure it contains:    * soft nproc 131072

RHEL / CentOS 7.X:
Modify the nproc value in /etc/security/limits.d/20-nproc.conf:
[root@mdw ~]# vi /etc/security/limits.d/20-nproc.conf
Make sure it contains:    * soft nproc 131072
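Values in limits.conf apply only to new login sessions. After logging in again (or rebooting), verify that the limits took effect; a minimal sketch:

```bash
# Check the limits for the current session
ulimit -n   # max open files, expect 524288
ulimit -u   # max user processes, expect 131072
```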