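1. Environment: CentOS 7
2. Big data stack: CDH 5.14.2, with HDFS and YARN installed (YARN is optional)
3. JDK 1.8
4. HAWQ 2.3.0, installed from RPM packages
5. HAWQ download address
(For how to install CDH, see my blog or Jianshu: https://www.jianshu.com/u/63848eb4cd0a)
6. System settings required by HAWQ 2.3.0 (on all server nodes). Edit /etc/sysctl.conf: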
vi /etc/sysctl.conf
Add the following settings:
kernel.shmmax= 1000000000
kernel.shmmni= 4096
kernel.shmall= 4000000000
kernel.sem= 250 512000 100 2048
kernel.sysrq= 1
kernel.core_uses_pid= 1
kernel.msgmnb= 65536
kernel.msgmax= 65536
kernel.msgmni= 2048
net.ipv4.tcp_syncookies= 0
net.ipv4.ip_forward= 0
net.ipv4.conf.default.accept_source_route= 0
net.ipv4.tcp_tw_recycle= 1
net.ipv4.tcp_max_syn_backlog= 200000
net.ipv4.conf.all.arp_filter= 1
net.ipv4.ip_local_port_range= 1281 65535
net.core.netdev_max_backlog= 200000
vm.overcommit_memory= 2
fs.nr_open= 3000000
kernel.threads-max= 798720
kernel.pid_max= 798720
#increase network
net.core.rmem_max=2097152
net.core.wmem_max=2097152
Save and exit, then run "sysctl -p" to apply the settings:
sysctl -p
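To confirm that the kernel accepted the values after sysctl -p, the file can be compared with the running settings. This is a minimal sketch and not part of the original procedure; parse_sysctl_conf and check_sysctl are hypothetical helper names introduced here.

```shell
# Hypothetical helpers (not shipped with HAWQ): compare a sysctl file with live values.

# Print every setting in FILE as normalized "key=value"; skip comments and blank lines.
parse_sysctl_conf() {
  awk '
    /^[ \t]*$/    { next }   # blank line
    /^[ \t]*[#;]/ { next }   # comment line
    index($0, "=") > 0 {
      key = substr($0, 1, index($0, "=") - 1)
      val = substr($0, index($0, "=") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      gsub(/[ \t]+/, " ", val)   # "250  512000" and tab-separated values compare equal
      print key "=" val
    }' "$1"
}

# Report every key whose running value differs from FILE; return 1 on any mismatch.
check_sysctl() {
  cs_status=0
  while IFS='=' read -r cs_key cs_want; do
    [ -n "$cs_key" ] || continue
    cs_have=$(sysctl -n "$cs_key" 2>/dev/null | tr -s ' \t\n' ' ' | sed 's/ $//')
    if [ "$cs_have" != "$cs_want" ]; then
      echo "MISMATCH $cs_key: want '$cs_want', have '$cs_have'"
      cs_status=1
    fi
  done <<EOF
$(parse_sysctl_conf "$1")
EOF
  return "$cs_status"
}

# Usage: check_sysctl /etc/sysctl.conf
```

Running the check on every node makes it easier to spot a host where a setting was rejected or the file was not updated.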
7. Edit /etc/security/limits.conf:
vi /etc/security/limits.conf
Add or modify the following settings:
* soft nofile 2900000
* hard nofile 2900000
* soft nproc 131072
* hard nproc 131072
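Changes to limits.conf only apply to new login sessions, so it is worth checking the limits from a fresh shell. The sketch below is an illustration, not part of the original steps; limit_ok and report_limits are hypothetical names, and report_limits assumes bash (ulimit -u is a bash option).

```shell
# Hypothetical helper: succeed when HAVE meets the required minimum WANT.
limit_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Compare the current session's limits with the values configured above (bash).
report_limits() {
  rl_nofile=$(ulimit -n)   # max open files
  rl_nproc=$(ulimit -u)    # max user processes
  if limit_ok "$rl_nofile" 2900000; then
    echo "nofile ok ($rl_nofile)"
  else
    echo "nofile too low ($rl_nofile)"
  fi
  if limit_ok "$rl_nproc" 131072; then
    echo "nproc ok ($rl_nproc)"
  else
    echo "nproc too low ($rl_nproc)"
  fi
}

# Usage (after logging in again as the user that will run HAWQ): report_limits
```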
8. Add the gpadmin user (short for Greenplum admin), using /opt/gpadmin as its home directory
useradd --home=/opt/gpadmin/ --no-create-home --comment "HAWQ admin" gpadmin
echo YOURPASSWORD | passwd --stdin gpadmin
mkdir /opt/gpadmin
chown gpadmin:gpadmin /opt/gpadmin
9. Add gpadmin to /etc/sudoers
Run vi /etc/sudoers and add the following line:
gpadmin ALL=(ALL) NOPASSWD:ALL
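As an alternative to editing /etc/sudoers directly, the same rule can go into a drop-in file that is syntax-checked before use. This is a sketch under assumptions: it relies on the stock "#includedir /etc/sudoers.d" line being present in /etc/sudoers, and write_gpadmin_sudoers is a hypothetical helper name.

```shell
# Hypothetical helper: write the gpadmin rule to a sudoers drop-in file.
# TARGET defaults to /etc/sudoers.d/gpadmin (assumes /etc/sudoers includes that directory).
write_gpadmin_sudoers() {
  ws_target=${1:-/etc/sudoers.d/gpadmin}
  printf '%s\n' 'gpadmin ALL=(ALL) NOPASSWD:ALL' > "$ws_target" || return 1
  chmod 0440 "$ws_target" || return 1
  # Check the syntax when visudo is available; remove the file if it is rejected.
  if command -v visudo >/dev/null 2>&1; then
    visudo -c -f "$ws_target" >/dev/null || { rm -f "$ws_target"; return 1; }
  fi
}

# Usage (as root): write_gpadmin_sudoers
```

Checking the file with visudo -c guards against a syntax error that would otherwise lock sudo out for every user.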
10. Install the dependency packages
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm
yum makecache
yum install -y man passwd sudo tar which git mlocate links make bzip2 net-tools \
autoconf automake libtool m4 gcc gcc-c++ gdb bison flex gperf maven indent \
libuuid-devel krb5-devel libgsasl-devel expat-devel libxml2-devel \
perl-ExtUtils-Embed pam-devel python-devel libcurl-devel snappy-devel \
thrift-devel libyaml-devel libevent-devel bzip2-devel openssl-devel \
openldap-devel protobuf-devel readline-devel net-snmp-devel apr-devel \
libesmtp-devel python-pip json-c-devel \
lcov cmake3 \
openssh-clients openssh-server perl-JSON perl-Env
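yum may skip a package that no enabled repository provides, so it can help to list anything still missing after the install. The following is a small sketch; missing_packages is a hypothetical helper, and the package names in the usage line are just a sample from the list above.

```shell
# Hypothetical helper: print each package name that rpm does not report as installed.
missing_packages() {
  for mp_pkg in "$@"; do
    rpm -q "$mp_pkg" >/dev/null 2>&1 || echo "$mp_pkg"
  done
}

# Usage: missing_packages libgsasl-devel thrift-devel json-c-devel cmake3
```

An empty result means every listed package is installed on that node.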
11. Install the HAWQ cluster (on all servers)
Extract the downloaded HAWQ archive:
tar -zxvf apache-hawq-rpm-2.3.0.0-incubating.tar.gz
Change into hawq_rpm_packages under the extracted directory and install HAWQ with rpm:
cd hawq_rpm_packages/
rpm -ivh apache-hawq-2.3.0.0-el7.x86_64.rpm
After the installation finishes, change into the installation directory /usr/local/apache-hawq:
cd /usr/local/apache-hawq
chown gpadmin:gpadmin /usr/local/apache-hawq
Switch to the hdfs user, create the directory HAWQ needs in HDFS, and change its owner:
su hdfs
hdfs dfs -mkdir /hawq_default
hdfs dfs -chown gpadmin:gpadmin /hawq_default
Switch to the gpadmin user:
su gpadmin
cd /usr/local/apache-hawq/etc/
vi hawq-site.xml
Change the following properties:
<property>
<name>hawq_master_address_host</name>
<value>192.168.32.139</value>
<description>The host name of hawq master.</description>
</property>
<property>
<name>hawq_standby_address_host</name>
<value>192.168.32.138</value>
<description>The host name of hawq standby master.</description>
</property>
<property>
<name>hawq_dfs_url</name>
<value>192.168.32.139:8020/hawq_default</value>
<description>URL for accessing HDFS.</description>
</property>
<property>
<name>hawq_master_directory</name>
<value>/opt/gpadmin/hawq-data-directory/masterdd</value>
<description>The directory of hawq master.</description>
</property>
<property>
<name>hawq_segment_directory</name>
<value>/opt/gpadmin/hawq-data-directory/segmentdd</value>
<description>The directory of hawq segment.</description>
</property>
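Because every node needs the same hawq-site.xml, it can be useful to read a property back and compare it across hosts. This is a minimal sketch, assuming each <name> and <value> element sits on its own line as in the excerpt above; hawq_site_get is a hypothetical helper, not a HAWQ command.

```shell
# Hypothetical helper: print the <value> of property NAME from a hawq-site.xml FILE.
# Assumes <name> and <value> each occupy one line, with <name> before <value>.
hawq_site_get() {
  awk -v want="$2" '
    /<name>/ {
      name = $0
      sub(/.*<name>/, "", name)
      sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      value = $0
      sub(/.*<value>/, "", value)
      sub(/<\/value>.*/, "", value)
      print value
      exit
    }' "$1"
}

# Usage: hawq_site_get /usr/local/apache-hawq/etc/hawq-site.xml hawq_dfs_url
```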
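The master and standby are installed on the Hadoop NameNode and SecondaryNameNode hosts; the segments are installed on the DataNode hosts.
On the hosts named in hawq_master_address_host and hawq_standby_address_host, create the directory
/opt/gpadmin/hawq-data-directory/masterdd
(Note: make sure /opt/gpadmin/hawq-data-directory/masterdd is owned by gpadmin.)
Edit the slaves file and add the IP address of each segment server, one per line,
and create the directory /opt/gpadmin/hawq-data-directory/segmentdd on each segment server.
(Note: make sure /opt/gpadmin/hawq-data-directory/segmentdd is owned by gpadmin.)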
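The directory creation and the ownership check can be sketched as follows. The mkdir and chown commands in the usage comment are one way to satisfy the notes above, not commands taken from the original post; owned_by is a hypothetical helper and relies on GNU stat as shipped with CentOS.

```shell
# Hypothetical helper: succeed when DIR exists and is owned by USER.
owned_by() {
  [ -d "$1" ] && [ "$(stat -c %U "$1")" = "$2" ]
}

# Usage on a master or standby host (use .../segmentdd on segment hosts):
#   sudo mkdir -p /opt/gpadmin/hawq-data-directory/masterdd
#   sudo chown -R gpadmin:gpadmin /opt/gpadmin/hawq-data-directory
#   owned_by /opt/gpadmin/hawq-data-directory/masterdd gpadmin || echo "fix ownership"
```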
12. Configure passwordless SSH login for the gpadmin user
cd /usr/local/apache-hawq/
source greenplum_path.sh
cd ./bin
./hawq ssh-exkeys -h 192.168.32.139 -h 192.168.32.138 -h 192.168.32.136 -h 192.168.32.134
Enter the password when prompted (note: a .ssh directory must already exist in gpadmin's home directory).
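After the key exchange, each host should accept an SSH connection without asking for a password. The sketch below checks this with ssh in batch mode, which fails instead of prompting; check_passwordless_ssh is a hypothetical helper name and the 5-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: test key-based SSH to every HOST given; return 1 if any fails.
check_passwordless_ssh() {
  cp_failed=0
  for cp_host in "$@"; do
    # BatchMode=yes disables password prompts, so a missing key shows up as a failure.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$cp_host" true 2>/dev/null; then
      echo "ok   $cp_host"
    else
      echo "FAIL $cp_host"
      cp_failed=1
    fi
  done
  return "$cp_failed"
}

# Usage (as gpadmin):
#   check_passwordless_ssh 192.168.32.139 192.168.32.138 192.168.32.136 192.168.32.134
```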
13. Initialize the HAWQ cluster
./hawq init cluster
Commands to start and stop the cluster:
./hawq start cluster
./hawq stop cluster
14. Allow client IP addresses with trust authentication
Edit pg_hba.conf under hawq_master_directory and add:
host all gpadmin 192.168.32.1/24 trust
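If this step is scripted, the rule should only be appended when it is not already present, so repeated runs do not duplicate it. This is a minimal sketch; add_hba_rule is a hypothetical helper, and the path in the usage comment assumes the hawq_master_directory value configured earlier. The master has to reread pg_hba.conf before the rule applies, for example through a restart with the stop and start commands from step 13.

```shell
# Hypothetical helper: append RULE to FILE unless an identical line already exists.
add_hba_rule() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as gpadmin on the master):
#   add_hba_rule /opt/gpadmin/hawq-data-directory/masterdd/pg_hba.conf \
#     'host all gpadmin 192.168.32.1/24 trust'
```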
15. From a remote machine, connect to the PostgreSQL database as the gpadmin user to verify that the connection works
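One way to run this check is a trivial query through psql. The sketch below makes several assumptions that the original post does not state: the psql client is installed on the remote machine, the master listens on port 5432, and a database named postgres exists. verify_hawq_connection is a hypothetical helper name.

```shell
# Hypothetical helper: run "SELECT 1" against the HAWQ master as gpadmin.
# HOST is the master address; PORT defaults to 5432 (an assumed default).
verify_hawq_connection() {
  vh_result=$(psql -h "$1" -p "${2:-5432}" -U gpadmin -d postgres -At -c 'SELECT 1') || return 1
  [ "$vh_result" = "1" ]
}

# Usage:
#   verify_hawq_connection 192.168.32.139 && echo "connection ok"
```

If the connection is refused, the pg_hba.conf rule from step 14 and the client's source address are the first things to check.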