[Big Data Learning] Hadoop Environment Setup

First, install the JDK; that won't be covered in detail here.
The installation is done as the hdfs user, so first create the hdfs user on all five machines:

useradd hdfs

Set a password (I used 000000):

passwd hdfs
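Creating the user and setting the password on five machines by hand is tedious. Here is a minimal sketch that does it from one shell, assuming you can already SSH to each node as root (the hostnames are not in /etc/hosts yet, so it uses the IPs; chpasswd is one way to set the password non-interactively):

#!/bin/bash
# Create the hdfs user with password 000000 on all five nodes.
# Assumes root SSH access to each node; you will be prompted for
# each node's root password.
for host in 192.168.66.{161..165}; do
    ssh root@"$host" 'useradd hdfs && echo "hdfs:000000" | chpasswd'
done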

I. Modify the /etc/hosts file

First, configure sudo (on all five machines):

vi /etc/sudoers

Below the line root ALL=(ALL) ALL, add:

hdfs    ALL=(ALL)       NOPASSWD: ALL

Configure the hosts file (on all five machines): vi /etc/hosts

192.168.66.161   node01
192.168.66.162   node02
192.168.66.163   node03
192.168.66.164   node04
192.168.66.165   node05

II. Configure passwordless SSH (on all five machines, as the hdfs user)

su hdfs
ssh-keygen

The following has to be run 25 times in total (each of the five nodes copies its key to all five hosts), so it is better written as a script; see the sketch below.

ssh-copy-id node01    # repeat for node02 through node05
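A minimal loop for that script, to be run on each of the five nodes as the hdfs user (ssh-copy-id will prompt for the hdfs password once per target host):

#!/bin/bash
# Copy this node's public key to every node, including itself.
for host in node01 node02 node03 node04 node05; do
    ssh-copy-id "$host"
done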

III. Install ZooKeeper

ZooKeeper goes on node02, node03, and node04. Installation is done with root privileges (via sudo):

sudo mkdir -p /opt/software

Upload the tarball to /opt/tools, then extract it:

tar -zxvf /opt/tools/zookeeper-3.4.10.tar.gz -C /opt/software/

The following needs to be done on node02 only:

mkdir -p /opt/software/zookeeper-3.4.10/zkData
mv /opt/software/zookeeper-3.4.10/conf/zoo_sample.cfg /opt/software/zookeeper-3.4.10/conf/zoo.cfg
vi /opt/software/zookeeper-3.4.10/conf/zoo.cfg

Change dataDir and append the server list (the sample config already sets clientPort=2181):

dataDir=/opt/software/zookeeper-3.4.10/zkData
server.2=node02:2888:3888
server.3=node03:2888:3888
server.4=node04:2888:3888

Create the myid file containing this node's server id:

vi /opt/software/zookeeper-3.4.10/zkData/myid
2

Distribute to node03 and node04:

scp -r /opt/software/zookeeper-3.4.10/ node03:/opt/software/
scp -r /opt/software/zookeeper-3.4.10/ node04:/opt/software/

On node03 and node04, change the myid file to 3 and 4 respectively:

vi /opt/software/zookeeper-3.4.10/zkData/myid
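Rather than opening vi on each node, the two files can also be written remotely from node02 (a small sketch, run as a user with write access on the target nodes, i.e. root, or hdfs after the chown below):

ssh node03 'echo 3 > /opt/software/zookeeper-3.4.10/zkData/myid'
ssh node04 'echo 4 > /opt/software/zookeeper-3.4.10/zkData/myid'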

Hand the tree over to the hdfs user and create a startup script:

chown -R hdfs:hdfs /opt/software/
mkdir /home/hdfs/bin
vi /home/hdfs/bin/zk.sh

The ZooKeeper control script, zk.sh:

#!/bin/bash
# Run zkServer.sh start/stop/status across the three ensemble nodes.
case $1 in
"start"|"stop"|"status")
	for i in node02 node03 node04
	do
		ssh $i "source /etc/profile; /opt/software/zookeeper-3.4.10/bin/zkServer.sh $1"
	done
;;
esac
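Make the script executable before first use; after that the whole ensemble can be driven from one shell:

chmod +x /home/hdfs/bin/zk.sh
/home/hdfs/bin/zk.sh start
/home/hdfs/bin/zk.sh status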


Of course, you can also start and stop ZooKeeper on node02, node03, and node04 individually:

/opt/software/zookeeper-3.4.10/bin/zkServer.sh start
/opt/software/zookeeper-3.4.10/bin/zkServer.sh stop

IV. Install Hadoop

node01 and node05 act as NameNodes; node02, node03, and node04 act as DataNodes (matching the slaves file in step 7).

1. Extract the tarball:

tar -zxvf /opt/tools/hadoop-2.7.2.tar.gz -C /opt/software/

2. Configure JAVA_HOME in hadoop-env.sh, mapred-env.sh, and yarn-env.sh (under /opt/software/hadoop-2.7.2/etc/hadoop/):

export JAVA_HOME=/opt/software/jdk1.8.0_144

3. Edit hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>node01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>node05:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>node01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>node05:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node02:8485;node03:8485;node04:8485/mycluster</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hdfs/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- JournalNode storage directory -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/software/hadoop-2.7.2/jn</value>
    </property>
    <!-- Disable permission checking (note the property name is dfs.permissions.enabled) -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>

4. Edit core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/software/hadoop-2.7.2/ha</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>node02:2181,node03:2181,node04:2181</value>
    </property>
</configuration>

5. Edit mapred-site.xml (in 2.7.2 it ships as mapred-site.xml.template; copy it to mapred-site.xml first):

<configuration>
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
</configuration>

6. Edit yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Declare the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node01</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node05</value>
    </property>
    <!-- ZooKeeper quorum address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>node02:2181,node03:2181,node04:2181</value>
    </property>
    <!-- Enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <!-- Store ResourceManager state in the ZooKeeper cluster -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

7. Edit the slaves file:

node02
node03
node04

8. Configure environment variables (e.g., append to /etc/profile):

##HADOOP_HOME
export HADOOP_HOME=/opt/software/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
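The steps above only touch one machine, but the NameNodes, JournalNodes, and DataNodes span all five, so the configured Hadoop directory has to be copied out before formatting (the environment variables from step 8 are needed on each node as well). A minimal distribution sketch, assuming the configuration was done on node01:

for host in node02 node03 node04 node05; do
    scp -r /opt/software/hadoop-2.7.2/ "$host":/opt/software/
done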

9. Format the cluster
The JournalNodes must be started first, on node02, node03, and node04 (the three nodes named in the qjournal URI above):

hadoop-daemon.sh start journalnode

On node01, format the NameNode:

hdfs namenode -format

Still on node01, start it:

hadoop-daemon.sh start namenode 

On node05, sync nn1's metadata (bootstrap the standby NameNode):

hdfs namenode -bootstrapStandby

On node01, format the ZKFC state in ZooKeeper (the ZooKeeper ensemble must be running):

hdfs zkfc -formatZK
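If you want to confirm the format succeeded, the hadoop-ha znode should now exist in ZooKeeper (a quick check, assuming the ensemble is up):

/opt/software/zookeeper-3.4.10/bin/zkCli.sh -server node02:2181
ls /    # typed inside the zkCli shell; should list hadoop-ha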

10. Start the cluster

On node01:

start-dfs.sh
start-yarn.sh

On node05, start the second ResourceManager (start-yarn.sh only starts the one on the local node):

yarn-daemon.sh start resourcemanager

Web UIs (the standby NameNode serves a matching UI on node05:50070):
http://192.168.66.161:50070/dfshealth.html#tab-overview
http://192.168.66.161:8088/cluster
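The active/standby roles can also be checked from the command line with the stock Hadoop 2.x HA admin tools:

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2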

A jps script to check the daemons on every node:

#!/bin/bash
# Run jps on all five nodes to list the Java daemons on each.
# (The original loop stopped at node04, missing node05's NameNode and
# ResourceManager.)
for i in node01 node02 node03 node04 node05
do
	echo ===================== $i ===========================
	ssh $i /opt/software/jdk1.8.0_144/bin/jps
done

That completes the installation.
