Building presto + kudu + hive + hdfs from scratch (6 nodes)

Notes from a recent big-data cluster installation.
Versions installed: Presto 317, Kudu 1.10, Hive 3.1.2, Hadoop 3.1.2.
Download locations for the main components:
presto: https://prestosql.io/docs/current/index.html
kudu (RPM packages): https://github.com/MartinWeindel/kudu-rpm/releases
hive: http://mirror.bit.edu.cn/apache/hive/
hadoop (hdfs): http://archive.apache.org/dist/hadoop/core/
Machine layout:
(The node/role assignment table was an image in the original post: centos101 is the master, centos102-106 are the slaves; see the hosts file below.)
Unless noted otherwise, every step below is performed on all hosts at the same time. How you fan the commands out is a matter of taste; I use SaltStack.
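For reference, fanning a single command out to every host with SaltStack looks like this (assuming the minions have already been accepted by the salt master; the '*' target hits all of them):

salt '*' cmd.run 'uptime'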
1. Preparation
Disable SELinux and the firewall, install the basic tool packages, install the JDK (jdk1.8.0_211), set up passwordless SSH between all hosts, and edit the hosts file (delete the two 127.0.0.1 lines). A sketch of these steps follows after the host list.

192.168.86.101 centos101 master
192.168.86.102 centos102 slave1
192.168.86.103 centos103 slave2
192.168.86.104 centos104 slave3
192.168.86.105 centos105 slave4
192.168.86.106 centos106 slave5
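A minimal sketch of the SELinux, firewall and SSH steps (the JDK install and basic tools are omitted here; adjust to your environment):

# disable SELinux now and across reboots
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# stop and disable the firewall
systemctl stop firewalld && systemctl disable firewalld

# passwordless SSH: generate a key once (e.g. on centos101) and copy it to every host
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for h in centos101 centos102 centos103 centos104 centos105 centos106; do
  ssh-copy-id root@$h
done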

Download and extract the main packages into the /data directory, e.g.:

[root@centos102 ~]# ll /data
total 765476
drwxr-xr-x 12 root root       183 Sep 14 16:29 hadoop-3.1.2
drwxr-xr-x  7   10  143       245 Apr  2 11:51 jdk1.8.0_211
drwxr-xr-x  4 kudu kudu        35 Sep 14 22:24 kudu
drwxr-xr-x  3 root root        42 Sep 15 10:49 presto
drwxr-xr-x  6 root root        85 Sep 14 23:19 presto-server-317

Here /data/presto is Presto's data directory; create it:

mkdir -p /data/presto/

Configure environment variables by appending the following to /etc/profile:

JAVA_HOME=/data/jdk1.8.0_211
JRE_HOME=/data/jdk1.8.0_211/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export JAVA_HOME JRE_HOME CLASSPATH PATH
export HADOOP_HOME=/data/hadoop-3.1.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# add the following two lines if needed (native hadoop libraries)
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
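Reload the profile and do a quick sanity check on each host:

source /etc/profile
java -version
hadoop version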

2. Install hadoop

vi /data/hadoop-3.1.2/etc/hadoop/core-site.xml and add the following:
<configuration>
  <!-- fs.defaultFS points at the master; 8020 is the default filesystem port (hbase.rootdir would use the same) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
  <!-- Size of read/write buffer used in SequenceFiles. -->
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-3.1.2/tmp</value>
  </property>
  <!-- let the root user impersonate other users from any host/group (needed later by hiveserver2) -->
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
    
vi /data/hadoop-3.1.2/etc/hadoop/hadoop-env.sh and add the following:
export JAVA_HOME=/data/jdk1.8.0_211
export HADOOP_HOME=/data/hadoop-3.1.2
export PATH=$PATH:/data/hadoop-3.1.2/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"

vi /data/hadoop-3.1.2/etc/hadoop/hdfs-site.xml and add the following:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop-3.1.2/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hadoop-3.1.2/dfs/name</value>
  </property>
  <!-- dfs.http.address is the legacy property name; Hadoop 3 still honors it, keeping the NameNode UI on 50070 instead of the new default 9870 -->
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50070</value>
  </property>
</configuration>

vi /data/hadoop-3.1.2/etc/hadoop/mapred-site.xml and add the following:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

vi /data/hadoop-3.1.2/etc/hadoop/workers and add the following (one worker hostname per line; the names resolve through /etc/hosts):
    slave1
    slave2
    slave3
    slave4
    slave5

vi /data/hadoop-3.1.2/etc/hadoop/yarn-site.xml and add the following:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <!-- added to fix a mapreduce classpath error -->
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/data/hadoop-3.1.2/share/hadoop/mapreduce/*,/data/hadoop-3.1.2/share/hadoop/mapreduce/lib/*</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>

That covers hadoop's six configuration files. Four start/stop scripts also need to be modified.
Add the following at the top of /data/hadoop-3.1.2/sbin/start-dfs.sh and stop-dfs.sh:

HDFS_DATANODE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs

Add the following at the top of /data/hadoop-3.1.2/sbin/start-yarn.sh and stop-yarn.sh:

YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

Once the configuration is in place, create these users on all servers:

useradd -m hadoop -G root -s /bin/bash
useradd -m hdfs -G root -s /bin/bash
useradd -m yarn -G root -s /bin/bash

Run the initial format on all servers (strictly speaking only the NameNode on centos101 needs it, but it is harmless elsewhere):

hadoop namenode -format

If there are no errors, start everything from the HDFS master node centos101:
/data/hadoop-3.1.2/sbin/start-all.sh
If that also completes cleanly, run jps on every server and check that the running processes match the layout planned in the table above (see the sketch below).
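With the configuration above (centos101 as master, the rest as workers), the jps output would typically look something like the following; treat this as a sketch rather than exact output:

# on centos101 (master)
jps
#   NameNode
#   SecondaryNameNode
#   ResourceManager

# on each slave
jps
#   DataNode
#   NodeManager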

If startup complains about permissions, make the files in bin and sbin executable and try again:

chmod +x /data/hadoop-3.1.2/bin/*
chmod +x /data/hadoop-3.1.2/sbin/*

3. Install hive on the master node centos101

cd /data/hive-3.1.2/conf/
cp hive-default.xml.template hive-site.xml    # create a hive config file from the template
vi /data/hive-3.1.2/conf/hive-site.xml and configure the following:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.exec.mode.local.auto</name>
        <value>true</value>
        <description> Let Hive determine whether to run in local mode automatically </description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>111111</value>
    </property>
    <!-- print column headers in query results -->
    <property>
      <name>hive.cli.print.header</name>
      <value>true</value>
    </property>
    <!-- show the current database name in the CLI prompt -->
    <property>
      <name>hive.cli.print.current.db</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.server2.authentication</name>
      <value>NONE</value>
    </property>
</configuration>

cd /data/hive-3.1.2/conf/
cp hive-env.sh.template hive-env.sh

vi /data/hive-3.1.2/conf/hive-env.sh and add the following settings:
HADOOP_HOME=/data/hadoop-3.1.2
export HIVE_CONF_DIR=/data/hive-3.1.2/conf
export HIVE_AUX_JARS_PATH=/data/hive-3.1.2/lib

Copy the MySQL JDBC driver jar into hive's lib directory:

cp mysql-connector-java-5.1.46.jar /data/hive-3.1.2/lib

Install MySQL (MariaDB) on the master node centos101:

yum install -y mariadb-server
systemctl start mariadb
systemctl enable mariadb
Initialize mysql:
mysql_secure_installation
Create the hive metastore database and user (in the mysql client):
create database hive character set utf8 ;  
CREATE USER 'hive'@'%'IDENTIFIED BY '111111';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
FLUSH PRIVILEGES;

Create the hive warehouse and temp directories in HDFS:

hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -chmod g+w /user/hive/warehouse
hadoop fs -mkdir -p /tmp
hadoop fs -chmod g+w /tmp

Initialize the hive metastore schema:

schematool -dbType mysql -initSchema
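If you want to confirm the schema landed in MySQL, schematool can also report the metastore schema version (run it the same way you ran -initSchema):

schematool -dbType mysql -info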

Start the hive services (run from the hive installation directory /data/hive-3.1.2):

nohup sh bin/hive --service metastore -p 9083 &
nohup sh bin/hive --service hiveserver2 &

You can also verify by connecting directly with the hive CLI from the bin directory, for example:
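A quick check from /data/hive-3.1.2 (port 10000 is HiveServer2's default, since hive-site.xml above does not override it):

bin/hive -e "show databases;"
# or through hiveserver2:
bin/beeline -u jdbc:hive2://centos101:10000 -n root -e "show databases;"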

4. Install kudu

cd /data&&rpm -ivh kudu-1.10.0-1.x86_64.rpm

Create the kudu master data directories on the three master hosts centos101, centos102 and centos103:

mkdir -p /data/kudu/master/logs /data/kudu/master/wals /data/kudu/master/data

Create the tserver data directories on all nodes:

mkdir -p /data/kudu/tserver/logs /data/kudu/tserver/data /data/kudu/tserver/wals

vi /etc/kudu/conf/master.gflagfile
## Comma-separated list of the RPC addresses belonging to all Masters in this cluster.
## NOTE: if not specified, configures a non-replicated Master.
#--master_addresses=
--master_addresses=centos101:7051,centos102:7051,centos103:7051
--log_dir=/data/kudu/master/logs
--fs_wal_dir=/data/kudu/master/wals
--fs_data_dirs=/data/kudu/master/data

vi /etc/kudu/conf/tserver.gflagfile
#Comma separated addresses of the masters which the tablet server should connect to.
--tserver_master_addrs=centos101:7051,centos102:7051,centos103:7051
--log_dir=/data/kudu/tserver/logs
--fs_wal_dir=/data/kudu/tserver/wals
--fs_data_dirs=/data/kudu/tserver/data

Fix ownership of the kudu data directories:
chown -R kudu:kudu /data/kudu

Start the kudu master on centos101, centos102 and centos103:

systemctl start kudu-master

Start the kudu tablet server on all nodes:

systemctl start kudu-tserver
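An optional health check with the kudu command-line tool (assuming the RPM put the kudu binary on your PATH; run from any node):

kudu cluster ksck centos101:7051,centos102:7051,centos103:7051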

5. Install presto

mkdir -p /data/presto-server-317/etc/catalog
cd /data/presto-server-317/etc

Coordinator (master node) config:
vi config.properties
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=18080
discovery-server.enabled=true
discovery.uri=http://192.168.86.101:18080
query.max-memory=8GB
query.max-memory-per-node=1GB
query.max-run-time=600s

Worker (slave node) config, also in config.properties:
coordinator=false
http-server.http.port=18080
discovery.uri=http://192.168.86.101:18080
query.max-memory=8GB
query.max-memory-per-node=1GB

The following files are identical on all nodes.
vi jvm.config
-server
-Xmx20G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-DHADOOP_USER_NAME=root

vi log.properties
io.prestosql=INFO

vi node.properties
node.environment=mycluster
# Note: the 101 suffix in node.id follows the hostname; on host centos102 change it to 102, and so on. IDs must not repeat (see the sketch after this block).
node.id=node_coordinator_101
node.data-dir=/data/presto/
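One way to stamp the per-host suffix without editing each file by hand (a sketch; it assumes hostnames of the form centosNNN and the node.id format shown above, changing only the numeric suffix):

num=$(hostname | grep -o '[0-9]\+$')
sed -i "s/_101$/_${num}/" /data/presto-server-317/etc/node.properties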

cd /data/presto-server-317/etc/catalog    
vi hive.properties 
# Note: the connector name determines which connector plugin is loaded; with Presto 317 it must be hive-hadoop2 (it works against Hadoop 3 as well)
connector.name=hive-hadoop2
## hive metastore thrift endpoint
hive.metastore.uri=thrift://centos101:9083
hive.config.resources=/data/hadoop-3.1.2/etc/hadoop/core-site.xml,/data/hadoop-3.1.2/etc/hadoop/hdfs-site.xml

vi kudu.properties 
connector.name=kudu
## List of Kudu master addresses, at least one is needed (comma separated)
## Supported formats: example.com, example.com:7051, 192.0.2.1, 192.0.2.1:7051,
##                    [2001:db8::1], [2001:db8::1]:7051, 2001:db8::1
kudu.client.master-addresses=centos101:7051,centos102:7051,centos103:7051
## Kudu does not support schemas, but the connector can emulate them optionally.
## By default, this feature is disabled, and all tables belong to the default schema.
## For more details see connector documentation.
kudu.schema-emulation.enabled=true
## Prefix to use for schema emulation (only relevant if `kudu.schema-emulation.enabled=true`)
## The standard prefix is `presto::`. Empty prefix is also supported.
## For more details see connector documentation.
#kudu.schema-emulation.prefix=
#######################
### Advanced Kudu Java client configuration
#######################
## Default timeout used for administrative operations (e.g. createTable, deleteTable, etc.)
#kudu.client.defaultAdminOperationTimeout = 30s
kudu.client.default-admin-operation-timeout = 60s
## Default timeout used for user operations
#kudu.client.defaultOperationTimeout = 30s
kudu.client.default-operation-timeout = 60s
## Default timeout to use when waiting on data from a socket
#kudu.client.defaultSocketReadTimeout = 30s
kudu.client.default-socket-read-timeout = 60s
## Disable Kudu client's collection of statistics.
#kudu.client.disableStatistics = false

Start the presto service on every node:
/data/presto-server-317/bin/launcher start
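A quick smoke test once every node is up. The Presto CLI is a separate download; here it is assumed that presto-cli-317-executable.jar has been downloaded, renamed to presto and made executable:

./presto --server 192.168.86.101:18080 --catalog kudu --schema default
# presto> show catalogs;
# presto> show schemas from hive;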

That completes the installation. If anything errored along the way, check the corresponding service logs. Finally, the web UIs of the main services:
hdfs http://192.168.86.101:50070
yarn http://192.168.86.101:8088
kudu http://192.168.86.101:8051
presto http://192.168.86.101:18080
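If you just want to confirm the UIs are listening without opening a browser, a simple curl loop works:

for url in http://192.168.86.101:50070 http://192.168.86.101:8088 http://192.168.86.101:8051 http://192.168.86.101:18080; do
  curl -s -o /dev/null -w "%{http_code}  $url\n" "$url"
done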
