Hadoop installation (prerequisite: JDK 8 already configured)
Three machines: hadoop01, hadoop02, hadoop03
vim /etc/hosts  # on all three machines, replacing the IPs with your own
- 192.168.57.134 hadoop01
- 192.168.57.135 hadoop02
- 192.168.57.136 hadoop03
hostnamectl set-hostname hadoop01  # on hadoop01
hostnamectl set-hostname hadoop02  # on hadoop02
hostnamectl set-hostname hadoop03  # on hadoop03
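To confirm the hostname and the /etc/hosts mappings took effect, a quick check on each node:
hostname            # should print this node's own name, e.g. hadoop01
ping -c 1 hadoop02  # each peer should resolve and respond
ping -c 1 hadoop03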
Passwordless SSH setup
Passwordless SSH serves two purposes here: the nodes can reach each other without typing passwords, and because SSH encrypts the traffic, data moving between nodes stays secure. It is also required for cluster operation: when Hadoop starts, the master node launches the daemons on the worker nodes over SSH, and those logins must not prompt for a password. Run the following steps on all three machines so that any node can ssh to any other password-free.
Check whether the sshd process is running (it is enabled by default on most systems):
ps -e | grep sshd
If it is missing, install and enable it:
sudo yum install -y openssh-server
sudo systemctl start sshd
sudo systemctl enable sshd
ssh-keygen -t rsa
Press Enter three times to accept the defaults (default file location, empty passphrase).
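If you prefer to skip the prompts, the same key can be generated non-interactively (empty passphrase, default location):
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa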
Copy the public keys
Once each machine has generated its key pair, its public key must be copied to all three machines (itself included). Run the following on all three machines:
ssh-copy-id hadoop01
ssh-copy-id hadoop02
ssh-copy-id hadoop03
Each command pauses to ask for confirmation: type "yes", press Enter, then enter the target machine's password.
Test passwordless login
# from hadoop01
ssh hadoop02
# from hadoop03
ssh hadoop01
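To check every connection at once from one node, BatchMode makes ssh fail instead of prompting, so any remaining password prompt surfaces as an error:
for h in hadoop01 hadoop02 hadoop03; do ssh -o BatchMode=yes $h hostname; done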
Disable the firewall (on all three machines):
sudo systemctl stop firewalld
sudo systemctl disable firewalld
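To confirm it is stopped on each node:
sudo firewall-cmd --state   # prints "not running" once firewalld is down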
Download from the Apache archive: Index of /dist/hadoop/common (https://archive.apache.org/dist/hadoop/common/)
hadoop-3.1.3.tar.gz
mkdir -p /export/servers
tar -zxvf hadoop-3.1.3.tar.gz -C /export/servers/
Hadoop environment configuration
vim /etc/profile
export HADOOP_HOME=/export/servers/hadoop-3.1.3
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
source /etc/profile
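Verify that the variables took effect:
echo $HADOOP_HOME   # should print /export/servers/hadoop-3.1.3
hadoop version      # should report Hadoop 3.1.3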
cd /export/servers/hadoop-3.1.3/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/export/servers/jdk  # point this at your JDK 8 install path
vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/export/servers/hadoop-3.1.3/tmp</value>
</property>
</configuration>
Do not change fs.defaultFS on hadoop02 or hadoop03: it names the NameNode and must point at hadoop01 on every node. The configuration is distributed to all nodes unchanged in the scp step below.
vim hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop02:50090</value>
</property>
</configuration>
dfs.namenode.secondary.http-address designates hadoop02 as the SecondaryNameNode; keep the file identical on all nodes when it is distributed, since the entry is harmless on the others.
vim mapred-site.xml
<configuration>
<!-- Run MapReduce on YARN; the default is local -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
vim yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
hadoop classpath
Copy the classpath string this prints; it goes into yarn-site.xml so that YARN containers can locate the Hadoop jars, as sketched below:
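A minimal sketch of the extra yarn-site.xml property, using the standard yarn.application.classpath key (paste your own hadoop classpath output in place of the placeholder):
<property>
    <name>yarn.application.classpath</name>
    <!-- paste the output of `hadoop classpath` here -->
    <value>...</value>
</property>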
vim workers
Delete the default localhost entry and add the following:
hadoop01
hadoop02
hadoop03
Distribute the master node's configuration to the other nodes:
scp /etc/profile hadoop02:/etc/profile
scp /etc/profile hadoop03:/etc/profile
scp -r /export/ hadoop02:/
scp -r /export/ hadoop03:/
After the copy finishes, run source /etc/profile on both hadoop02 and hadoop03.
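A quick probe from hadoop01 confirms the distribution worked (the profile is sourced explicitly because non-interactive ssh does not load it):
ssh hadoop02 "source /etc/profile && hadoop version"
ssh hadoop03 "source /etc/profile && hadoop version"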
Format the filesystem (run this only once: re-formatting generates a new cluster ID, and the DataNodes will refuse to join until their data directories are cleared)
# on hadoop01
hdfs namenode -format
Start the cluster on the master node hadoop01:
start-dfs.sh
start-yarn.sh
Then run jps on each of the three machines to check the running processes:
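With the layout configured above (NameNode and ResourceManager on hadoop01, SecondaryNameNode on hadoop02, all three hosts listed in workers), jps should show roughly:
# hadoop01: NameNode, DataNode, ResourceManager, NodeManager
# hadoop02: DataNode, SecondaryNameNode, NodeManager
# hadoop03: DataNode, NodeManager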
View the Hadoop cluster in a browser
Add the IP mappings on Windows
Edit the hosts file under C:\Windows\System32\drivers\etc and add the following (use the same IPs as in /etc/hosts above):
192.168.57.134 hadoop01
192.168.57.135 hadoop02
192.168.57.136 hadoop03
Access the HDFS web UI at http://hadoop01:9870 and the YARN web UI at http://hadoop01:8088 (the Hadoop 3.x default ports).
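As a final smoke test, a few commands against HDFS plus a run of the bundled MapReduce example (the jar path below matches the 3.1.3 layout):
hdfs dfs -mkdir -p /test
hdfs dfs -ls /
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 2 10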