Apache Big Data Components Deployment Guide (Super Detailed)

Deploying each component of an Apache big data warehouse stack

Chapter 1: Environment Preparation

1. Machine Planning

Prepare three servers for the cluster deployment. CentOS 7 or later is recommended, each with at least 2 CPU cores and 8 GB of RAM.

172.19.195.228 hadoop101
172.19.195.229 hadoop102
172.19.195.230 hadoop103

[root@hadoop101 ~]# cat /etc/redhat-release 
CentOS Linux release 7.5.1804 (Core) 
[root@hadoop101 ~]# hostname
hadoop101
2. Download the Installation Packages

Installation packages for the data warehouse components:
Link: https://pan.baidu.com/s/1Wjx6TNkedMTmmnuWREW-OQ
Extraction code: bpk0

All of the components have been uploaded to the network drive above; you can also collect them yourself from each project's official download site.


3. Configure /etc/hosts on Each Server

Hostname resolution in /etc/hosts on all three machines:

# hadoop101
[root@hadoop101 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1	localhost	localhost.localdomain	localhost6	localhost6.localdomain6
# hadoop cluster
172.19.195.228 hadoop101
172.19.195.229 hadoop102
172.19.195.230 hadoop103

# hadoop102
[root@hadoop102 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1	localhost	localhost.localdomain	localhost6	localhost6.localdomain6
# hadoop cluster
172.19.195.228 hadoop101
172.19.195.229 hadoop102
172.19.195.230 hadoop103

# hadoop103
[root@hadoop103 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1	localhost	localhost.localdomain	localhost6	localhost6.localdomain6
# hadoop cluster
172.19.195.228 hadoop101
172.19.195.229 hadoop102
172.19.195.230 hadoop103
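
Rather than editing each file by hand, the cluster entries can be appended with a heredoc, run once on every node (passwordless SSH does not exist yet at this point); a minimal sketch:

# Append the cluster entries to /etc/hosts (repeat on hadoop102 and hadoop103)
[root@hadoop101 ~]# cat >> /etc/hosts <<'EOF'
# hadoop cluster
172.19.195.228 hadoop101
172.19.195.229 hadoop102
172.19.195.230 hadoop103
EOF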
4. Configure Passwordless SSH Between the Servers
# Generate a key pair
[root@hadoop101 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:pgAtkJ9Tmf8sqBYOkK2gr/d7woIPXDguOiHRxRHDVH4 root@hadoop101
The key's randomart image is:
+---[RSA 2048]----+
|.. +=*.          |
|.. .Bo           |
| =o+... E        |
|= Bo  ..         |
|+= o.. oS        |
|O + ...oo        |
|+O +  ..         |
|=.B o .          |
|o*.oo+           |
+----[SHA256]-----+

# Distribute the public key to every node (including this one)
[root@hadoop101 ~]# ssh-copy-id hadoop101
Are you sure you want to continue connecting (yes/no)? yes
root@hadoop101's password: 
Number of key(s) added: 1
Now try logging into the machine, with:   "ssh 'hadoop101'"
and check to make sure that only the key(s) you wanted were added.

[root@hadoop101 ~]# ssh-copy-id hadoop102
Are you sure you want to continue connecting (yes/no)? yes
root@hadoop102's password: 
Number of key(s) added: 1
Now try logging into the machine, with:   "ssh 'hadoop102'"
and check to make sure that only the key(s) you wanted were added.

[root@hadoop101 ~]# ssh-copy-id hadoop103
Are you sure you want to continue connecting (yes/no)? yes
root@hadoop103's password: 
Number of key(s) added: 1
Now try logging into the machine, with:   "ssh 'hadoop103'"
and check to make sure that only the key(s) you wanted were added.

# The steps on hadoop102 and hadoop103 are identical
[root@hadoop102 ~]# ssh-keygen -t rsa
[root@hadoop102 ~]# ssh-copy-id hadoop101
[root@hadoop102 ~]# ssh-copy-id hadoop102
[root@hadoop102 ~]# ssh-copy-id hadoop103

[root@hadoop103 ~]# ssh-keygen -t rsa
[root@hadoop103 ~]# ssh-copy-id hadoop101
[root@hadoop103 ~]# ssh-copy-id hadoop102
[root@hadoop103 ~]# ssh-copy-id hadoop103
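
If sshpass is available (it is not installed by default; it ships in the EPEL repository) and your ssh-copy-id accepts -o, the whole distribution can be scripted. This is a convenience sketch, not part of the original steps: 'YourRootPassword' is a placeholder, and it assumes all three hosts share the same root password:

[root@hadoop101 ~]# yum install -y epel-release && yum install -y sshpass
[root@hadoop101 ~]# for i in hadoop101 hadoop102 hadoop103;do sshpass -p 'YourRootPassword' ssh-copy-id -o StrictHostKeyChecking=no root@$i;done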

# Verify: each hop should print the hostname without asking for a password
[root@hadoop101 ~]# for i in hadoop101 hadoop102 hadoop103;do ssh $i hostname;done
hadoop101
hadoop102
hadoop103
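
If a key is missing, the loop above silently falls back to a password prompt; adding BatchMode makes it fail immediately instead, which is useful in scripts:

[root@hadoop101 ~]# for i in hadoop101 hadoop102 hadoop103;do ssh -o BatchMode=yes $i hostname;done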
5. Create the Package and Application Directories
[root@hadoop101 ~]# for i in hadoop101 hadoop102 hadoop103;do ssh $i mkdir /opt/{software,module};done
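
A quick check that both directories now exist on every node:

[root@hadoop101 ~]# for i in hadoop101 hadoop102 hadoop103;do ssh $i ls -d /opt/software /opt/module;done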
6. Upload the Installation Packages to the Server
# Upload all installation packages to /opt/software on hadoop101
[root@hadoop101 ~]# cd /opt/software/
[root@hadoop101 software]# ll
total 1489048
-rw-r--r-- 1 root root 278813748 Aug 26 15:13 apache-hive-3.1.2-bin.tar.gz
-rw-r--r-- 1 root root   9136463 Aug 26 15:13 apache-maven-3.6.1-bin.tar.gz
-rw-r--r-- 1 root root   9311744 Aug 26 15:13 apache-zookeeper-3.5.7-bin.tar.gz
-rw-r--r-- 1 root root 338075860 Aug 26 15:14 hadoop-3.1.3.tar.gz
-rw-r--r-- 1 root root 314030393 Aug 26 15:14 hue.tar
-rw-r--r-- 1 root root 194990602 Aug 26 15:15 jdk-8u211-linux-x64.tar.gz
-rw-r--r-- 1 root root  70057083 Aug 26 15:14 kafka_2.11-2.4.0.tgz
-rw-r--r-- 1 root root  77807942 Aug 26 15:15 mysql-libs.zip
-rw-r--r-- 1 root root 232530699 Aug 26 15:15 spark-2.4.5-bin-hadoop2.7.tgz
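
Before extracting anything, it is worth confirming that the gzip-compressed archives survived the upload intact (note that hue.tar and mysql-libs.zip are not covered by this glob); a quick sketch:

[root@hadoop101 software]# for f in *.tar.gz *.tgz;do gzip -t "$f" && echo "$f OK";done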
7. Disable the Firewall on Each Server
# Stop and disable firewalld
[root@hadoop101 software]# systemctl stop firewalld.service
[root@hadoop101 software]# systemctl disable firewalld.service

[root@hadoop102 ~]# systemctl stop firewalld.service
[root@hadoop102 ~]# systemctl disable firewalld.service

[root@hadoop103 ~]# systemctl stop firewalld.service
[root@hadoop103 ~]# systemctl disable firewalld.service
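
A quick cross-node check that firewalld is really off (each node should print inactive):

[root@hadoop101 ~]# for i in hadoop101 hadoop102 hadoop103;do ssh $i systemctl is-active firewalld;done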
8. Configure hosts on the Local Windows Machine (Optional)

[Note] If you skip this, you will need to use each server's IP address in the browser whenever a web UI URL is involved.

C:\Windows\System32\drivers\etc\hosts

# hadoop cluster
139.224.229.107 hadoop101
139.224.66.13 hadoop102
139.224.228.144 hadoop103
9. Install Java (JDK 1.8)
# Extract the archive
[root@hadoop101 software]# tar -xf jdk-8u211-linux-x64.tar.gz -C /opt/module/
[root@hadoop101 software]# ls /opt/module/
jdk1.8.0_211

# Add the environment variable configuration
[root@hadoop101 software]# vim /etc/profile
#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_211
export PATH=$PATH:$JAVA_HOME/bin

# Distribute the JDK directory to hadoop102 and hadoop103
[root@hadoop101 software]# scp -r /opt/module/jdk1.8.0_211 hadoop102:/opt/module/
[root@hadoop101 software]# scp -r /opt/module/jdk1.8.0_211 hadoop103:/opt/module/

# Distribute the environment variable configuration to hadoop102 and hadoop103
[root@hadoop101 software]# scp /etc/profile hadoop102:/etc/
[root@hadoop101 software]# scp /etc/profile hadoop103:/etc/

# Source the environment variables on hadoop101, hadoop102, and hadoop103
[root@hadoop101 software]# source /etc/profile && java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)

[root@hadoop102 ~]# source /etc/profile && java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)

[root@hadoop103 ~]# source /etc/profile && java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
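
The three per-node checks above can also be run in a single loop from hadoop101. Note that ssh starts a non-login shell, so /etc/profile must be sourced explicitly; the same trick is needed for jps later on:

[root@hadoop101 software]# for i in hadoop101 hadoop102 hadoop103;do ssh $i "source /etc/profile && java -version";done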

Chapter 2: ZooKeeper Installation and Deployment

For a complete and detailed introduction to ZooKeeper, see the companion article covering ZooKeeper basics, deployment, and internals.

1. Extract the ZooKeeper Package
[root@hadoop101 software]# tar -xf apache-zookeeper-3.5.7-bin.tar.gz -C /opt/module/
2. Create the zkData Directory
[root@hadoop101 software]# mkdir -p /opt/module/apache-zookeeper-3.5.7-bin/zkData
3. Set This Node's myid
[root@hadoop101 software]# echo 1 > /opt/module/apache-zookeeper-3.5.7-bin/zkData/myid
4. Edit the zoo.cfg Configuration File
[root@hadoop101 software]# cd /opt/module/apache-zookeeper-3.5.7-bin/conf/
[root@hadoop101 conf]# ll
total 12
-rw-r--r-- 1 502 games  535 May  4  2018 configuration.xsl
-rw-r--r-- 1 502 games 2712 Feb  7  2020 log4j.properties
-rw-r--r-- 1 502 games  922 Feb  7  2020 zoo_sample.cfg
[root@hadoop101 conf]# mv zoo_sample.cfg zoo.cfg
[root@hadoop101 conf]# vim zoo.cfg 
# Comment lines are removed below for readability (keeping them is fine).
# In the server.X entries, X must match each node's myid; 2888 is the quorum
# port and 3888 the leader-election port
[root@hadoop101 conf]# cat zoo.cfg 
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/module/apache-zookeeper-3.5.7-bin/zkData
clientPort=2181
server.1=hadoop101:2888:3888
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
[root@hadoop101 conf]# 
5. Distribute the Application Directory
[root@hadoop101 module]# scp -r apache-zookeeper-3.5.7-bin hadoop102:/opt/module/
[root@hadoop101 module]# scp -r apache-zookeeper-3.5.7-bin hadoop103:/opt/module/
6. Set the myid on the Other Nodes
[root@hadoop102 ~]# echo 2 > /opt/module/apache-zookeeper-3.5.7-bin/zkData/myid
[root@hadoop103 ~]# echo 3 > /opt/module/apache-zookeeper-3.5.7-bin/zkData/myid
7. Check the myid on Every Node
[root@hadoop101 module]# for i in hadoop101 hadoop102 hadoop103;do ssh $i cat /opt/module/apache-zookeeper-3.5.7-bin/zkData/myid;done
1
2
3
8. Start the Service on Each Cluster Node
[root@hadoop101 module]# /opt/module/apache-zookeeper-3.5.7-bin/bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/module/apache-zookeeper-3.5.7-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[root@hadoop102 module]# /opt/module/apache-zookeeper-3.5.7-bin/bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/module/apache-zookeeper-3.5.7-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

[root@hadoop103 ~]# /opt/module/apache-zookeeper-3.5.7-bin/bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/module/apache-zookeeper-3.5.7-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
9. Verify the Services Are Running
[root@hadoop101 module]# for i in hadoop101 hadoop102 hadoop103;do ssh $i $JAVA_HOME/bin/jps -l|grep -v jps;done
5856 org.apache.zookeeper.server.quorum.QuorumPeerMain
5747 org.apache.zookeeper.server.quorum.QuorumPeerMain
5754 org.apache.zookeeper.server.quorum.QuorumPeerMain
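
jps only shows that the JVM processes exist; zkServer.sh status additionally reports each node's quorum role. Expect one leader and two followers, with the leader decided by the election:

[root@hadoop101 module]# for i in hadoop101 hadoop102 hadoop103;do ssh $i "source /etc/profile && /opt/module/apache-zookeeper-3.5.7-bin/bin/zkServer.sh status";done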

Chapter 3: Hadoop Cluster Installation and Deployment

For a complete and detailed introduction to Hadoop, see the companion Hadoop introduction and deployment document.

1. Extract the Hadoop Package
[root@hadoop101 software]# tar -xf hadoop-3.1.3.tar.gz -C /opt/module/
2. Configure the Hadoop Environment Variables
# Append the following at the end of the file
[root@hadoop101 software]# vim /etc/profile
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
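
After sourcing the profile, the hadoop command should resolve on the PATH and report version 3.1.3:

[root@hadoop101 software]# source /etc/profile && hadoop version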
3. Configuration File Locations
[root@hadoop101 software]# cd /opt/module/hadoop-3.1.3/etc/hadoop/
[root@hadoop101 hadoop]# ll
total 176
-rw-r--r-- 1 1000 1000  8260 Sep 12  2019 capacity-scheduler.xml
-rw-r--r-- 1 1000 1000  1335 Sep 12  2019 configuration.xsl
-rw-r--r-- 1 1000 1000  1940 Sep 12  2019 container-executor.cfg
-rw-r--r-- 1 1000 1000  1353 Aug 26 16:29 core-site.xml
-rw-r--r-- 1 1000 1000  3999 Sep 12  2019 hadoop-env.cmd
-rw-r--r-- 1 1000 1000 15946 Aug 26 16:42 hadoop-env.sh
-rw-r--r-- 1 1000 1000  3323 Sep 12  2019 hadoop-metrics2.properties
-rw-r--r-- 1 1000 1000 11392 Sep 12  2019 hadoop-policy.xml
-rw-r--r-- 1 1000 1000  3414 Sep 12  2019 hadoop-user-functions.sh.example
-rw-r--r-- 1 1000 1000  2956 Aug 26 16:28 hdfs-site.xml
-rw-r--r-- 1 1000 1000  1484 Sep 12  2019 httpfs-env.sh
-rw-r--r-- 1 1000 1000  1657 Sep 12  2019 httpfs-log4j.properties
-rw-r--r-- 1 1000 1000    21 Sep 12  2019 httpfs-signature.secret
-rw-r--r-- 1 1000 1000   620 Sep 12  2019 httpfs-site.xml
-rw-r--r-- 1 1000 1000  3518 Sep 12  2019 kms-acls.xml
-rw-r--r-- 1 1000 1000  1351 Sep 12  2019 kms-env.sh
-rw-r--r-- 1 1000 1000  1747 Sep 12  2019 kms-log4j.properties
-rw-r--r-- 1 1000 1000   682 Sep 12  2019 kms-site.xml
-rw-r--r-- 1 1000 1000 13326 Sep 12  2019 log4j.properties
-rw-r--r-- 1 1000 1000   951 Sep 12  2019 mapred-env.cmd
-rw-r--r-- 1 1000 1000  1764 Sep 12  2019 mapred-env.sh
-rw-r--r-- 1 1000 1000  4113 Sep 12  2019 mapred-queues.xml.template
-rw-r--r-- 1 1000 1000   758 Sep 12  2019 mapred-site.xml
drwxr-xr-x 2 1000 1000  4096 Sep 12  2019 shellprofile.d
-rw-r--r-- 1 1000 1000  2316 Sep 12  2019 ssl-client.xml.example
-rw-r--r-- 1 1000 1000  2697 Sep 12  2019 ssl-server.xml.example
-rw-r--r-- 1 1000 1000  2642 Sep 12  2019 user_ec_policies.xml.template
-rw-r--r-- 1 1000 1000    30 Aug 26 16:33 workers
-rw-r--r-- 1 1000 1000  2250 Sep 12  2019 yarn-env.cmd
-rw-r--r-- 1 1000 1000  6056 Sep 12  2019 yarn-env.sh
-rw-r--r-- 1 1000 1000  2591 Sep 12  2019 yarnservice-log4j.properties
-rw-r--r-- 1 1000 1000  2029 Aug 26 16:32 yarn-site.xml
4. Edit the hdfs-site.xml Configuration File
[root@hadoop101 hadoop]# vim hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <!-- Replication factor -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Logical nameservice ID -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <!-- Multiple NameNodes for HA -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2,nn3</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hadoop101:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hadoop102:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn3</name>
    <value>hadoop103:8020</value>
  </property>
  <!-- HTTP listen addresses for the NameNodes -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>hadoop101:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>hadoop102:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn3</name>
    <value>hadoop103:9870</value>
  </property>
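  <!-- The remaining HA wiring below is a standard QJM + ZooKeeper sketch rather
       than this deployment's original values: the journal quorum, the journal
       edits directory, and the fencing key path are illustrative assumptions;
       adjust them to your environment. Automatic failover additionally needs
       ha.zookeeper.quorum set in core-site.xml. -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop101:8485;hadoop102:8485;hadoop103:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/module/hadoop-3.1.3/data/journalnode</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>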
    