Hadoop Cluster Setup Walkthrough

1. Host Preparation

1.1 Host Planning

| Host      | IP              | HostName  | CPU    | Memory | User   | Password |
|-----------|-----------------|-----------|--------|--------|--------|----------|
| hadoop181 | 192.168.207.181 | hadoop181 | 4 CORE | 8G     | hadoop | hadoop   |
| hadoop182 | 192.168.207.182 | hadoop182 | 4 CORE | 4G     | hadoop | hadoop   |
| hadoop183 | 192.168.207.183 | hadoop183 | 4 CORE | 4G     | hadoop | hadoop   |

1.2 Host Initialization

(1) Clone three virtual machines

(2) Create the hadoop user and set its password (on all three hosts)

groupadd hadoop
useradd -s /bin/bash -d /home/hadoop -g hadoop hadoop
passwd hadoop    # set the account password to hadoop, per the host plan

(3) Disable the firewall (on all three hosts)

systemctl disable firewalld
systemctl stop firewalld

(4) Disable SELinux (on all three hosts)

# Disable SELinux: set SELINUX=disabled
vim /etc/selinux/config
SELINUX=disabled


(5) Configure a static IP address (on all three hosts)

vim /etc/sysconfig/network-scripts/ifcfg-ens33
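
The post does not show the file's contents; a static configuration for hadoop181 might look like the sketch below, where IPADDR follows the host plan and GATEWAY/DNS1 are assumptions (a typical VMware NAT setup) to adapt to your network:

TYPE=Ethernet
BOOTPROTO=static          # static addressing instead of DHCP
NAME=ens33
DEVICE=ens33
ONBOOT=yes                # bring the interface up at boot
IPADDR=192.168.207.181    # .182 / .183 on the other two hosts
NETMASK=255.255.255.0
GATEWAY=192.168.207.2     # assumption: adjust to your gateway
DNS1=192.168.207.2        # assumption: adjust to your DNS

Then restart networking (systemctl restart network) for the change to take effect.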

(6) Set the hostname (on all three hosts)

hostnamectl set-hostname hadoop181 # first host
hostnamectl set-hostname hadoop182 # second host
hostnamectl set-hostname hadoop183 # third host

(7) Configure sudo privilege escalation (on all three hosts)

Edit /etc/sudoers (ideally with visudo) and add the hadoop line marked below:

## Next comes the main part: which users can run what software on 
## which machines (the sudoers file can be shared between multiple
## systems).
## Syntax:
##
##      user    MACHINE=COMMANDS
##
## The COMMANDS section may have other options added to it.
##
## Allow root to run any commands anywhere 
root    ALL=(ALL)       ALL
hadoop  ALL=(ALL)       NOPASSWD:ALL # add this line
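
As the hadoop user, verify that sudo no longer prompts for a password:

[hadoop@hadoop181 ~]$ sudo whoami
root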

2. Pre-installation Preparation

2.1 Prepare the Cluster Helper Scripts
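
The original post does not reproduce these helpers, but xssh (run one command on every host) and xsync (copy a path to every host) appear throughout the steps below. A minimal sketch of each, assuming the three hosts from the plan plus ssh and rsync, might look like this:

#!/bin/bash
# xssh: run the given command on every cluster host
echo "[DEBUG] 1 command is :$*"
for host in hadoop181 hadoop182 hadoop183; do
    echo "[DEBUG] ssh to ${host} to execute commands [ $* ]"
    ssh "${host}" "source ~/.bashrc; $*"
done

#!/bin/bash
# xsync: copy a file or directory to the same location on every other host
src=$(readlink -f "$1")                        # absolute path of the argument
for host in hadoop181 hadoop182 hadoop183; do
    [ "${host}" = "$(hostname)" ] && continue  # skip the local machine
    rsync -av "${src}" "${host}:$(dirname "${src}")/"
done

Save them somewhere on the PATH (e.g. /usr/local/bin/xssh and /usr/local/bin/xsync) and chmod +x both.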

2.2 Modify the hosts File and Append the Host Entries

[root@hadoop181 ~]# xssh cat /etc/hosts
[DEBUG] 1 command is :cat /etc/hosts
[DEBUG] ssh to hadoop181 to execute commands [ cat /etc/hosts] 
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.207.181 hadoop181
192.168.207.182 hadoop182
192.168.207.183 hadoop183
[DEBUG] ssh to hadoop182 to execute commands [ cat /etc/hosts] 
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.207.181 hadoop181
192.168.207.182 hadoop182
192.168.207.183 hadoop183
[DEBUG] ssh to hadoop183 to execute commands [ cat /etc/hosts] 
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.207.181 hadoop181
192.168.207.182 hadoop182
192.168.207.183 hadoop183
[root@hadoop181 ~]# 

2.3 Configure Passwordless SSH Between the Hosts for the hadoop User

| Source host | Target 1  | Target 2  | Target 3  |
|-------------|-----------|-----------|-----------|
| hadoop181   | hadoop181 | hadoop182 | hadoop183 |
| hadoop182   | hadoop181 | hadoop182 | hadoop183 |
| hadoop183   | hadoop181 | hadoop182 | hadoop183 |

# Generate a key pair on each host (accept the defaults)
[hadoop@hadoop181 ~]$ ssh-keygen -t rsa
[hadoop@hadoop182 ~]$ ssh-keygen -t rsa
[hadoop@hadoop183 ~]$ ssh-keygen -t rsa

# Copy hadoop181's public key to every host (including itself)
[hadoop@hadoop181 ~]$ ssh-copy-id hadoop@hadoop181
[hadoop@hadoop181 ~]$ ssh-copy-id hadoop@hadoop182
[hadoop@hadoop181 ~]$ ssh-copy-id hadoop@hadoop183

# Copy hadoop182's public key to every host
[hadoop@hadoop182 ~]$ ssh-copy-id hadoop@hadoop181
[hadoop@hadoop182 ~]$ ssh-copy-id hadoop@hadoop182
[hadoop@hadoop182 ~]$ ssh-copy-id hadoop@hadoop183

# Copy hadoop183's public key to every host
[hadoop@hadoop183 ~]$ ssh-copy-id hadoop@hadoop181
[hadoop@hadoop183 ~]$ ssh-copy-id hadoop@hadoop182
[hadoop@hadoop183 ~]$ ssh-copy-id hadoop@hadoop183
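
A quick check that passwordless login now works:

[hadoop@hadoop181 ~]$ ssh hadoop182 hostname
hadoop182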

2.4 Cluster Time Synchronization (Omitted)
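
The original skips this step; one common approach on CentOS 7 (an assumption, not from the original post) is chrony:

# on all three hosts
sudo yum install -y chrony
sudo systemctl enable chronyd
sudo systemctl start chronyd
chronyc tracking    # confirm the clock is being synchronized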

3. Hadoop Cluster Setup

3.1 Service Planning

| Service            | hadoop181 | hadoop182 | hadoop183 |
|--------------------|-----------|-----------|-----------|
| NameNode           | ✓         |           |           |
| DataNode           | ✓         | ✓         | ✓         |
| ResourceManager    |           | ✓         |           |
| NodeManager        | ✓         | ✓         | ✓         |
| HistoryServer      | ✓         |           |           |
| Zookeeper          |           |           |           |
| Secondary NameNode |           |           | ✓         |

(Placements follow the configuration in section 3.4; Zookeeper is not installed in this walkthrough.)

3.2 Package Preparation

(1) Download the Hadoop package (from the official Hadoop site)
(2) Upload the downloaded package to /home/hadoop/ on hadoop181

(This installation uses a 3.x release; 3.1.3 below.)


3.3 JDK Installation

(1) Extract the JDK

## Extract the archive
[hadoop@hadoop181 ~]$ tar -zxvf jdk-8u144-linux-x64.tar.gz

## Enter the extracted directory
[hadoop@hadoop181 ~]$ cd jdk1.8.0_144

## Print the absolute path
[hadoop@hadoop181 jdk1.8.0_144]$ pwd
/home/hadoop/jdk1.8.0_144

(2) Configure environment variables

[hadoop@hadoop181 ~]$ vim .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions

# JAVA HOME
export JAVA_HOME=/home/hadoop/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin

(3) Apply the environment variables

[hadoop@hadoop181 ~]$ source .bashrc

# Verify that the Java environment works
[hadoop@hadoop181 ~]$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
[hadoop@hadoop181 ~]$

(4) Distribute the JDK and environment variables

[hadoop@hadoop181 ~]$ xsync jdk1.8.0_144
[hadoop@hadoop181 ~]$ xsync .bashrc

(5) Test

[hadoop@hadoop181 ~]$ xssh java -version
[DEBUG] 1 command is :java -version
[DEBUG] ssh to hadoop181 to execute commands [ java -version] 
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
[DEBUG] ssh to hadoop182 to execute commands [ java -version] 
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
[DEBUG] ssh to hadoop183 to execute commands [ java -version] 
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
[hadoop@hadoop181 ~]$ 

3.4 Cluster Configuration

3.4.0 Unpack the Installation Package

(1) Extract the Hadoop package

[hadoop@hadoop181 ~]$ tar -zxvf hadoop-3.1.3.tar.gz

(2) Configure environment variables

# Print the extracted directory path
[hadoop@hadoop181 hadoop-3.1.3]$ pwd
/home/hadoop/hadoop-3.1.3

# Edit the environment variables
[hadoop@hadoop181 ~]$ vim ~/.bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions


# JAVA HOME
export JAVA_HOME=/home/hadoop/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin

# HADOOP HOME
export HADOOP_HOME=/home/hadoop/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/sbin
export PATH=$PATH:$HADOOP_HOME/bin

(3) Apply the configuration

[hadoop@hadoop181 ~]$ source .bashrc
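
As a quick sanity check, the hadoop command should now resolve on the PATH:

[hadoop@hadoop181 ~]$ hadoop version
Hadoop 3.1.3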

3.4.1 Core Configuration File

(1) Edit the hadoop-env.sh file

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Set the JAVA_HOME path:

export JAVA_HOME=/home/hadoop/jdk1.8.0_144

(2) Edit the core-site.xml file

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/core-site.xml

Add the following:

<configuration>
    <!-- Default file system: the NameNode RPC address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop181:9000</value>
    </property>
    <!-- Base directory for Hadoop's working data -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop-3.1.3/data/tmp</value>
    </property>
</configuration>

3.4.2 HDFS Configuration File

(1) Edit the hdfs-site.xml file

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the following:

    <!-- Number of HDFS block replicas -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Host and HTTP port of the Secondary NameNode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop183:50090</value>
    </property>

3.4.3 YARN Configuration Files

(1) Edit the yarn-env.sh file

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/yarn-env.sh

Set the JAVA_HOME path:

export JAVA_HOME=/home/hadoop/jdk1.8.0_144

(2) Edit the yarn-site.xml file

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the following:

    <!-- How reducers fetch data: the MapReduce shuffle auxiliary service -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- Host of the YARN ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop182</value>
    </property>

3.4.4 MapReduce Configuration Files

(1) Edit the mapred-env.sh file

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/mapred-env.sh

Add the JAVA_HOME setting:

export JAVA_HOME=/home/hadoop/jdk1.8.0_144

(2) Edit the mapred-site.xml file

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the following:

    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
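
One caveat worth noting: on Hadoop 3.x, MapReduce jobs submitted to YARN commonly also need HADOOP_MAPRED_HOME in the task environment, or they fail with a "Could not find or load main class ... MRAppMaster" error. The properties below are a commonly used addition to mapred-site.xml (an assumption, not part of the original walkthrough):

    <!-- Assumed addition: point YARN-launched MR processes at the Hadoop install -->
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop-3.1.3</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop-3.1.3</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop-3.1.3</value>
    </property>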

3.4.5 Log Server Configuration

(1) History server configuration

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the following:

    <!-- History server RPC address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop181:10020</value>
    </property>

    <!-- History server web UI address -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop181:19888</value>
    </property>

(2) Log aggregation

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the following:

    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>

    <!-- Retain aggregated logs for 7 days (604800 seconds) -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>

3.4.6 Distribute the Files

[hadoop@hadoop181 ~]$ xsync .bashrc
[hadoop@hadoop181 ~]$ xsync hadoop-3.1.3

3.5 Format the Cluster

# Run the format command on the NameNode host (hadoop181)
[hadoop@hadoop181 ~]$ hdfs namenode -format
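
If the cluster ever needs re-formatting, first delete the data directory configured in hadoop.tmp.dir on every node, or DataNodes will fail to register because of a clusterID mismatch:

[hadoop@hadoop181 ~]$ xssh rm -rf /home/hadoop/hadoop-3.1.3/data/tmp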

3.6 Start Services Manually

(1) Start the NameNode

Option 1

hadoop-daemon.sh start namenode

Option 2

hdfs --daemon start namenode

(2) Start the DataNode

Option 1

hadoop-daemon.sh start datanode

Option 2

hdfs --daemon start datanode

(3) Check that the processes started

xssh jps -l
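
On hadoop181, for example, the output should contain entries like these (PIDs will differ):

12345 org.apache.hadoop.hdfs.server.namenode.NameNode
12346 org.apache.hadoop.hdfs.server.datanode.DataNode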

(4) Start the ResourceManager (on hadoop182, per the service plan)

Option 1

yarn-daemon.sh start resourcemanager

Option 2

yarn --daemon start resourcemanager

(5) Start the NodeManager

Option 1

yarn-daemon.sh start nodemanager

Option 2

yarn --daemon start nodemanager

(6) Start the history server (on hadoop181, per the service plan)

Option 1

mr-jobhistory-daemon.sh start historyserver

Option 2

mapred --daemon start historyserver

(7) Start the Secondary NameNode (on hadoop183, per the service plan)

Option 1

hadoop-daemon.sh start secondarynamenode

Option 2

hdfs --daemon start secondarynamenode

3.7 Start the Cluster with the Batch Scripts

(1) Add the workers configuration

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/etc/hadoop/workers

Add the following content (it must be distributed to all machines; see the command after the list):

hadoop181
hadoop182
hadoop183
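
Distribute the file with the helper script:

[hadoop@hadoop181 ~]$ xsync $HADOOP_HOME/etc/hadoop/workers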

(2) Batch start/stop HDFS

start-dfs.sh
stop-dfs.sh

(3) Batch start/stop YARN

start-yarn.sh
stop-yarn.sh

(4) Start everything, stop everything

start-all.sh
stop-all.sh
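
After start-all.sh completes, a quick smoke test against HDFS (the path /tmp/test is just an example):

[hadoop@hadoop181 ~]$ hdfs dfs -mkdir -p /tmp/test
[hadoop@hadoop181 ~]$ hdfs dfs -put .bashrc /tmp/test/
[hadoop@hadoop181 ~]$ hdfs dfs -ls /tmp/test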

3.8 Inspect the Cluster

(1) View HDFS in the browser

Visit http://hadoop181:9870/
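
(9870 is the NameNode web UI port in Hadoop 3.x; in 2.x it was 50070.)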

(2) Check the Secondary NameNode status

http://hadoop183:50090/status.html

NOTE:
The Secondary NameNode status page may render empty; editing the dfs-dust.js file fixes this:

[hadoop@hadoop181 ~]$ vim $HADOOP_HOME/share/hadoop/hdfs/webapps/static/dfs-dust.js
[hadoop@hadoop181 ~]$ xsync $HADOOP_HOME/share/hadoop/hdfs/webapps/static/dfs-dust.js

Make the following change: comment out the line that formats the timestamp with moment (typically in the date_tostring helper) and add a line that returns the date directly instead, e.g. new Date(Number(v)).toLocaleString().
