Scripted deployment of a fully distributed Hadoop cluster (CentOS 7 Minimal)

Prerequisite: the network is already configured

Initialize the environment

yum install -y epel-release
yum -y install wget
cd /etc/yum.repos.d
mv ./CentOS-Base.repo ./CentOS-Base.repo.bak
wget http://mirrors.163.com/.help/CentOS7-Base-163.repo
mv CentOS7-Base-163.repo /etc/yum.repos.d/CentOS-Base.repo
yum clean all; yum makecache
yum -y update

yum groups install Development\ Tools -y
yum install -y ntp vim lrzsz lsof pcre pcre-devel zlib zlib-devel ruby unzip zip net-tools gcc-c++

Disable the firewall


sudo systemctl stop firewalld
sudo systemctl disable firewalld
setenforce 0   # temporary; set SELINUX=disabled in /etc/selinux/config to persist across reboots

Install the JDK

  1. Download the RPM: https://www.aliyundrive.com/s/V4kVwG9MTPn

  2. rpm -ivh jdk-8u301-linux-x64.rpm

  3. No need to configure Java environment variables; the RPM registers the java command system-wide

Install MySQL

wget -c http://dev.mysql.com/get/mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql57-community-release-el7-10.noarch.rpm 
yum -y install mysql-community-server
service mysqld start
systemctl enable mysqld
 
Get the temporary MySQL root password
grep 'temporary password' /var/log/mysqld.log
Log in to MySQL
mysql -uroot -p
 
Relax the password policy and reset the root password
set global validate_password_policy=0; 
set global validate_password_length=1;
ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
create database hive default character set utf8 default collate utf8_general_ci;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY '123456' WITH GRANT OPTION;
FLUSH  PRIVILEGES;
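The first login above can also be scripted. The sketch below pulls the generated password out of the log so it can be passed to `mysql --connect-expired-password` non-interactively; `get_mysql_tmp_pass` is a hypothetical helper, and the log path is the MySQL 5.7 default on CentOS 7:

```shell
# Sketch: extract the temporary root password from the MySQL log.
# get_mysql_tmp_pass is an illustration-only helper, not part of MySQL.
get_mysql_tmp_pass() {
    local log=${1:-/var/log/mysqld.log}
    # the password is the last field of the last "temporary password" line
    grep 'temporary password' "$log" | tail -n 1 | awk '{print $NF}'
}
```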

Create the hadoop user

useradd hadoop

Set its password

passwd hadoop

Grant full sudo privileges

Either edit /etc/sudoers with visudo, or append the rule directly as root:

echo "hadoop   ALL=(ALL)       NOPASSWD:ALL">>/etc/sudoers
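For a repeatable deployment script, the append can be made idempotent so re-running it never duplicates the rule. `grant_nopasswd` and the overridable `SUDOERS` variable are illustration-only assumptions:

```shell
# Sketch: add the sudo rule only if it is not already present.
# SUDOERS defaults to the real file but can be overridden for dry runs.
grant_nopasswd() {
    local user=$1
    local sudoers=${SUDOERS:-/etc/sudoers}
    local rule="$user   ALL=(ALL)       NOPASSWD:ALL"
    grep -qxF "$rule" "$sudoers" || echo "$rule" >> "$sudoers"
}
```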

Create the user's environment variable file

vi  /etc/profile.d/hadoop.env

Add the environment variables (JAVA_HOME, HADOOP_HOME, PATH)

Log in again so the variables take effect; because the file does not end in .sh it is not auto-sourced from /etc/profile.d/, so hook it into the shell profile:

echo "source /etc/profile.d/hadoop.env">>~/.bashrc
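The variables to add might look like the following sketch. The JDK path matches what the 8u301 RPM installs by default and the Hadoop path matches the /opt/module layout used below, but both are assumptions to verify on your machines:

```shell
# Example contents for /etc/profile.d/hadoop.env (paths are assumptions)
export JAVA_HOME=/usr/java/jdk1.8.0_301-amd64
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```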

Upload the xsync script to /usr/bin/

Use auto_ssh_host.sh to configure passwordless SSH and /etc/hosts automatically
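The xsync script itself is not shown in this post; it is typically a thin rsync wrapper along these lines. The host list and the DRY_RUN switch are assumptions for illustration, not the author's exact script:

```shell
# Sketch: push a file or directory to the same path on every other node.
xsync() {
    local hosts=${HOSTS:-"node1 node2"}   # every node except the local one
    local host path dir name
    [ $# -ge 1 ] || { echo "usage: xsync <file-or-dir>..." >&2; return 1; }
    for host in $hosts; do
        for path in "$@"; do
            dir=$(cd "$(dirname "$path")" && pwd)   # resolve to an absolute dir
            name=$(basename "$path")
            if [ -n "$DRY_RUN" ]; then
                echo "rsync -av --delete $dir/$name $host:$dir/"
            else
                rsync -av --delete "$dir/$name" "$host:$dir/"
            fi
        done
    done
}
```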

Step 1

Place the software under /opt/module

Make sure every cluster node is configured correctly

Make sure the environment variables are loaded: source /etc/profile.d/hadoop.env

Make sure the directory ownership is correct: chown -R hadoop:hadoop /opt/module

Distribute the base directory tree to all nodes:

 xsync /opt/module/hadoop-3.1.3/

Configure the Hadoop files

core-site.xml

<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://master:9820</value>
		<description>Default file system</description>
	</property>
	
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/opt/module/tmp/hadoop</value>
		<description>Base directory for temporary files</description>
	</property>
	
	<property>
		<name>io.file.buffer.size</name>
		<value>131072</value>
		<description>128 KB buffer for stream I/O</description>
	</property>
	
	<property>
		<name>hadoop.http.staticuser.user</name>
		<value>hadoop</value>
		<description>Static user for the web UIs; should match the cluster user</description>
	</property>
</configuration>

hdfs-site.xml

<configuration>

	<!--nn web-->
	<property>
		<name>dfs.namenode.http-address</name>
		<value>master:9870</value>
		<description>namenode web</description>
	</property>

	<!--2nn web-->
	<property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>node2:9868</value>
		<description>secondary namenode web</description>
	</property>

	<property>
		<name>dfs.replication</name>
		<value>3</value>
		<description>Number of block replicas</description>
	</property>
</configuration>

mapred-site.xml

<configuration>
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
		<description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
	</property>
	
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>master:10020</value>
	</property>
	
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>master:19888</value>
	</property>

</configuration>

yarn-site.xml

<configuration>

	<property>
		<name>yarn.log-aggregation-enable</name>
		<value>true</value>
		<description>Enable log aggregation</description>
	</property>

	<!-- ResourceManager host -->
	<property>
		<name>yarn.resourcemanager.hostname</name>
		<value>node1</value>
		<description>ResourceManager host</description>
	</property>

	<property>
		<name>yarn.log.server.url</name>
		<value>http://master:19888/jobhistory/logs</value>
		<description>Log server web address</description>
	</property>

	<!-- Site specific YARN configuration properties -->
	<!-- route MapReduce through the shuffle auxiliary service -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>

	<!-- environment variable inheritance -->
	<property>
		<name>yarn.nodemanager.env-whitelist</name>
		<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,HADOOP_MAPRED_HOME</value>
		<description>Environment variables that containers may inherit from the NodeManager</description>
	</property>

	<property>
		<name>yarn.log-aggregation.retain-seconds</name>
		<value>604800</value>
		<description>Keep aggregated logs for 7 days</description>
	</property>


</configuration>

xsync /opt/module/hadoop-3.1.3/etc/hadoop

Start the cluster

Format the NameNode on the first startup. If the data directories become corrupted or are lost, delete the data/ and logs/ directories on every node, then reformat

hdfs namenode -format
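The reset described above can be scripted. The directory paths below follow this post's layout (hadoop.tmp.dir and the Hadoop install dir), and the host names and DRY_RUN switch are illustration-only assumptions:

```shell
# Sketch: wipe the data and log directories on every node before a reformat.
# DESTRUCTIVE -- this deletes all HDFS data on the cluster.
reset_cluster_state() {
    local hosts=${HOSTS:-"master node1 node2"}
    local dirs="/opt/module/tmp/hadoop /opt/module/hadoop-3.1.3/logs"
    local host
    for host in $hosts; do
        if [ -n "$DRY_RUN" ]; then
            echo "ssh $host rm -rf $dirs"   # print instead of running
        else
            ssh "$host" "rm -rf $dirs"
        fi
    done
}
```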

start-dfs.sh

start-yarn.sh

mapred --daemon start historyserver

Start the ResourceManager on node1

yarn --daemon start resourcemanager

        master          node1           node2
HDFS    nn                              2nn
        dn              dn              dn
YARN    historyserver   rm
        nm              nm              nm

Key:

  • nn:NameNode
  • dn:DataNode
  • nm:NodeManager
  • 2nn:SecondaryNameNode
  • rm:ResourceManager
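The role layout in the table above can be verified per host by parsing `jps` output. `check_daemons` is a hypothetical helper; its second argument exists only so the parsing can be exercised without a live cluster:

```shell
# Sketch: confirm the expected daemons appear in `jps` output on this host.
check_daemons() {
    local expected=$1                              # e.g. "NameNode DataNode"
    local running=${2:-$(jps | awk '{print $2}')}  # default: live jps output
    local d missing=0
    for d in $expected; do
        echo "$running" | grep -qx "$d" || { echo "missing: $d"; missing=1; }
    done
    return $missing
}
```

Run it on master as, for example, `check_daemons "NameNode DataNode JobHistoryServer NodeManager"`.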

Access URLs

HDFS web UI: http://master:9870/

JobHistory: http://master:19888/jobhistory
YARN: http://node1:8088/cluster
