[Ops] Hadoop 3.0.3 Cluster Installation (1): Multi-Node Setup

一.Purpose

This document describes how to install and configure Hadoop clusters ranging from a few nodes to extremely large clusters with thousands of nodes.
This document does not cover advanced topics such as Security or High Availability.


 

二. Prerequisites

  • Java 8
  • A stable Hadoop release; this guide uses Hadoop 3.0.3

 

三. Installation

Typically one machine in the cluster is designated as the NameNode and another machine as the ResourceManager, exclusively. These are the masters.
Other services (such as Web App Proxy Server and MapReduce Job History server) are usually run either on dedicated hardware or on shared infrastructure, depending upon the load.
The rest of the machines in the cluster act as both DataNode and NodeManager. These are the slaves.

  • Master nodes: typically one machine in the cluster is designated as the NameNode and another as the ResourceManager.
  • Worker nodes: the remaining machines act as both DataNode and NodeManager.
  • Other services (such as the Web App Proxy Server and the MapReduce Job History Server) usually run on dedicated hardware or on shared infrastructure, depending on load; here I place them on the non-master node.

 

1. Node Planning

Following the guidance above, I plan the components across two nodes:

Node              HDFS components               YARN components
10.xxx (node1)    NameNode, DataNode            ResourceManager, NodeManager
10.xxx (node2)    SecondaryNameNode, DataNode   NodeManager, JobHistory Server

 

2. Configuring Hadoop in Non-Secure Mode

HDFS daemons are NameNode, SecondaryNameNode, and DataNode. YARN daemons are ResourceManager, NodeManager, and WebAppProxy. If MapReduce is to be used, then the MapReduce Job History Server will also be running. For large installations, these are generally running on separate hosts.

HDFS daemons: NameNode, SecondaryNameNode, DataNode.
YARN daemons: ResourceManager, NodeManager, and WebAppProxy (WebAppProxy is not planned here).
If MapReduce jobs will run, the MapReduce Job History Server is also needed.

Note:

In large installations, these components are spread across separate machines.

 

3. Preparation

Run on every node (node1 and node2):

mkdir -p /home/user/hadoop
cd /home/user/hadoop
tar -zxvf hadoop-3.0.3.tar.gz
ln -s hadoop-3.0.3 hadoop

 

Set the environment variables:

vim ~/.bashrc 

# add the following
export HADOOP_HOME=/home/user/hadoop/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_CONF_DIR=/home/user/hadoop/hadoop/etc/hadoop


# then apply it
source ~/.bashrc 
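A quick sanity check that the variables took effect (a sketch; the paths assume the layout created above, and on a configured node `hadoop version` should then work as well):

```shell
# Re-apply the settings from ~/.bashrc and confirm the Hadoop bin/ directory
# ended up on PATH.
export HADOOP_HOME=/home/user/hadoop/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "PATH ok" ;;
  *)                      echo "PATH missing $HADOOP_HOME/bin" ;;
esac
```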

 

4. Configuration

All of the following files live under /{user_home}/hadoop/hadoop/etc/hadoop/.

core-site.xml


<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://namenodeIp:9000</value>
        <description>
        Replace namenodeIp with the IP of the NameNode host.
        </description>
    </property>
</configuration>
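After editing, the effective value can be read back with `hdfs getconf` (this assumes the environment from the preparation step is already in place on the node):

```shell
# Print the configured default filesystem; should echo hdfs://namenodeIp:9000
hdfs getconf -confKey fs.defaultFS
```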

 

hdfs-site.xml

<!-- =========== namenode =========== -->
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/data/hdfs/namenode,/opt/data02/hdfs/namenode</value>
    <description>
        Path on the local filesystem where the NameNode stores the namespace
        and transaction logs persistently. If this is a comma-delimited list
        of directories, the name table is replicated in all of the
        directories, for redundancy.
    </description>
</property>

<!-- =========== datanode =========== -->
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/data/hdfs/data,/opt/data02/hdfs/data</value>
    <description>
        If this is a comma-delimited list of directories, then data will be
        stored in all named directories, typically on different devices.
    </description>
</property>

 

yarn-site.xml

  
<!-- Configurations for ResourceManager -->
<property>
    <name>yarn.resourcemanager.address</name>
    <value>node1:8832</value>
</property>

<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node1:8830</value>
</property>

<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node1:8831</value>
</property>

<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>node1:8833</value>
</property>

<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>node1:8888</value>
</property>

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rmhostname</value>
</property>

<!-- Configurations for NodeManager -->
<property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/data/yarn/nm-local-dir,/data02/yarn/nm-local-dir</value>
</property>

<property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/home/taiyi/hadoop/yarn/userlogs</value>
</property>

<property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/home/taiyi/hadoop/yarn/containerlogs</value>
</property>

<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>61440</value>
    <description>Set according to the node's physical memory (check with free -h).</description>
</property>
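The 61440 above fits this machine; a rough sizing sketch for other nodes (the 75% headroom factor is my assumption, not from the Hadoop docs):

```shell
# Suggest a value for yarn.nodemanager.resource.memory-mb from physical RAM,
# leaving ~25% for the OS, DataNode, and other daemons.
total_mb=$(free -m | awk '/^Mem:/ {print $2}')
yarn_mb=$(( total_mb * 3 / 4 ))
echo "suggested yarn.nodemanager.resource.memory-mb = ${yarn_mb}"
```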

 

mapred-site.xml

<!-- Configurations for MapReduce JobHistory Server -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>node2:10020</value>
</property>

<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node2:19888</value>
</property>

 

workers

List the worker nodes, one per line:

node1
node2

 

5. Distribute the Configuration and Create Directories

Copy the configuration to the other node:

scp -r   \
/home/user/hadoop/hadoop/etc/hadoop/  \
root@node2hostname:/home/user/hadoop/hadoop/etc/

Create the storage directories on all nodes:

mkdir -p /data/yarn/nm-local-dir /data02/yarn/nm-local-dir
chown -R user:user /data/yarn /data02/yarn

mkdir -p /opt/data/hdfs/namenode /opt/data02/hdfs/namenode /opt/data/hdfs/data /opt/data02/hdfs/data
chown -R user:user /opt/data /opt/data02
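The per-mount-point commands above can be collapsed into one loop that also verifies write access (PREFIX is an assumption added so the sketch can be dry-run outside the cluster; leave it empty on the real nodes, and run the chown commands afterwards if the directories were created as root):

```shell
# Create every local storage directory from the configs and check writability.
PREFIX=${PREFIX:-}
for d in /data/yarn/nm-local-dir /data02/yarn/nm-local-dir \
         /opt/data/hdfs/namenode /opt/data02/hdfs/namenode \
         /opt/data/hdfs/data /opt/data02/hdfs/data; do
  mkdir -p "$PREFIX$d"
  [ -w "$PREFIX$d" ] && echo "ok: $d" || echo "NOT writable: $d"
done
```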

 

6. Format the NameNode

Run on the node hosting the NameNode:

hdfs namenode -format

Formatting succeeded if you see messages like the following:

2022-08-12 17:43:11,039 INFO common.Storage: Storage directory /Users/lianggao/MyWorkSpace/002install/hadoop-3.3.1/hadoop_repo/dfs/name 
has been successfully formatted.

2022-08-12 17:43:11,069 INFO namenode.FSImageFormatProtobuf: Saving image file /Users/lianggao/MyWorkSpace/002install/hadoop-3.3.1/hadoop_repo/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2022-08-12 17:43:11,200 INFO namenode.FSImageFormatProtobuf: Image file /Users/lianggao/MyWorkSpace/002install/hadoop-3.3.1/hadoop_repo/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 403 bytes saved in 0 seconds .

If formatting fails, delete the NameNode metadata directories first: formatting creates the NameNode's storage directories (common.Storage: Storage directory /data/hadoopdata/name has been successfully formatted.), and leftovers from a previous attempt can cause the failure.
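A cleanup sketch for that failure case (destructive; double-check that the paths match dfs.namenode.name.dir before running):

```shell
# Remove leftover NameNode metadata from a previous format attempt, then retry.
rm -rf /opt/data/hdfs/namenode/* /opt/data02/hdfs/namenode/*
hdfs namenode -format
```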

 

7. Managing the Daemons

7.1. HDFS

Start

node1

hdfs --daemon start namenode
hdfs --daemon start datanode

node2

hdfs --daemon start secondarynamenode
hdfs --daemon start datanode

Stop (on the node running the daemon)

hdfs --daemon stop namenode
hdfs --daemon stop secondarynamenode
hdfs --daemon stop datanode
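With the HDFS daemons up, a minimal smoke test (assumes the PATH setup from the preparation step; /tmp_smoke.txt is just a scratch file name):

```shell
# jps should list NameNode/DataNode on node1 and SecondaryNameNode/DataNode
# on node2; dfsadmin should report two live datanodes.
jps
hdfs dfsadmin -report | grep -i 'live datanodes'
echo hello | hdfs dfs -put - /tmp_smoke.txt
hdfs dfs -cat /tmp_smoke.txt
hdfs dfs -rm /tmp_smoke.txt
```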

 

7.2. YARN

Start

node1

yarn --daemon start resourcemanager
yarn --daemon start nodemanager

node2

mapred --daemon start historyserver
yarn --daemon start nodemanager

Stop (on the node running the daemon)

yarn --daemon stop resourcemanager
yarn --daemon stop nodemanager
mapred --daemon stop historyserver

 

8. Web UIs

NameNode UI:          http://node1:9870/
ResourceManager UI:   http://node1:8888/  (per yarn.resourcemanager.webapp.address above)
JobHistory Server UI: http://node2:19888/
