Hadoop Cluster Setup Tutorial
1. Prerequisites
What you need: a JDK install package
and a Hadoop install package (this tutorial uses jdk-8u191-linux-x64.tar.gz and hadoop-2.8.5.tar.gz)
2. First, create one virtual machine
The creation process itself is not shown here.
3. Disable the firewall
1) Check the firewall status: service iptables status
2) Stop the firewall (temporary): service iptables stop
3) Disable the firewall (permanent): chkconfig iptables off
Note: the permanent setting only takes effect after a reboot (reboot command: init 6)
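These service/chkconfig commands target CentOS 6 style init scripts; a minimal sequence to disable the firewall and verify the result, assuming that environment:
service iptables stop
chkconfig iptables off
chkconfig --list iptables    # every runlevel should now show "off"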
4. Change the IP address
4.1. Command to edit the IP settings: vi /etc/sysconfig/network-scripts/ifcfg-eth0
4.2. Check whether the machine already has an IP address (run ifconfig)
4.3. Check that the file contains the entries below (lines starting with "#" are comments); add any that are missing, filling in your own IP values
#Interface name
DEVICE=eth0
#Type
TYPE=Ethernet
#Unique identifier
UUID=8844c7b1-4962-4020-9b01-6dff388fc44c
#Bring the connection up automatically at boot
ONBOOT=yes
#Whether the interface is managed by NetworkManager
NM_CONTROLLED=yes
#none disables DHCP, static enables a static IP, dhcp enables DHCP
BOOTPROTO=none
#IP address
IPADDR=192.168.80.101
#Subnet mask
NETMASK=255.255.255.0
#Subnet mask as a 24-bit prefix
PREFIX=24
#Gateway
GATEWAY=192.168.80.2
IPV4_FAILURE_FATAL=yes
#Disable IPv6
IPV6INIT=no
DNS1=192.168.80.2
DEFROUTE=yes
NAME="System eth0"
HWADDR=00:0C:29:59:F5:71
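The original does not mention it, but on CentOS 6 the edited settings normally take effect only after the network service restarts; a hedged follow-up:
service network restart
ifconfig eth0    # confirm the new IPADDR shows up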
4.4. If the VM has no network access, check the following settings:
In your Windows network settings, find the virtual machine's network adapter and open its Properties,
then double-click "Internet Protocol Version 4" to set it up.
4.5. Check that all four values there (IP address, subnet mask, gateway, DNS) are filled in, then run ifconfig to view the IP.
5. Configure hosts
Command: vi /etc/hosts
Map each hostname to its IP address so the three machines can reach each other by name (a sketch follows).
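A minimal sketch of the entries, using the hostnames and IP addresses that appear later in this tutorial (keep the existing localhost lines, and adjust the addresses to your own):
192.168.80.97 net1
192.168.80.98 net2
192.168.80.99 net3
Add the same three lines on all three machines.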
6. Clone
1) Clone 2 more virtual machines from the primary machine
2) Change the IP and MAC address on the 2 clones (command: vi /etc/sysconfig/network-scripts/ifcfg-eth0)
Note: after setting them, check whether each VM can still reach the network
3) Change the hostname on the 2 clones (command: vi /etc/sysconfig/network), setting HOSTNAME to that machine's name; see the sketch after this list
Note: the hostname change only takes effect after a reboot
4) ping www.baidu.com (to check that the network works)
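A minimal sketch of the hostname file for the second machine, assuming the net1/net2/net3 names used throughout this tutorial:
# /etc/sysconfig/network on the second clone
NETWORKING=yes
HOSTNAME=net2
Use HOSTNAME=net3 on the third clone, then reboot (init 6) so the change takes effect.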
7. Set up passwordless SSH login
1) Generate a key pair (run on all 3 machines; pressing Enter at every prompt accepts the defaults):
[root@net2 ~]# ssh-keygen -t rsa
2) Create the authorized_keys file with touch authorized_keys (this must be done inside the .ssh directory)
First machine
Enter the .ssh directory:
[root@net1 ~]# cd /root/.ssh/
Create the authorized_keys file:
[root@net1 .ssh]# touch authorized_keys
Append the public key to authorized_keys:
[root@net1 .ssh]# cat id_rsa.pub >> authorized_keys
Send the authorized_keys file to the next virtual machine, net2 (writing its IP address also works):
[root@net1 .ssh]# scp authorized_keys net2:/root/.ssh/
The authenticity of host 'net2 (192.168.80.98)' can't be established.
RSA key fingerprint is 8f:90:02:e7:96:b1:40:90:8c:05:bf:74:63:04:37:da.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'net2,192.168.80.98' (RSA) to the list of known hosts.
root@net2's password:
authorized_ 100% 391 0.4KB/s 00:00
[root@net1 .ssh]# (the transfer succeeded at this point)
Second machine
Enter the .ssh directory:
[root@net2 ~]# cd /root/.ssh
Append this machine's public key to authorized_keys:
[root@net2 .ssh]# cat id_rsa.pub >> authorized_keys
Send authorized_keys to the third machine:
[root@net2 .ssh]# scp authorized_keys net3:/root/.ssh/
The authenticity of host 'net3 (192.168.80.99)' can't be established.
RSA key fingerprint is 8f:90:02:e7:96:b1:40:90:8c:05:bf:74:63:04:37:da.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'net3,192.168.80.99' (RSA) to the list of known hosts.
root@net3's password:
authorized_ 100% 782 0.8KB/s 00:00
[root@net2 .ssh]# (the transfer succeeded at this point)
Third machine
Enter the .ssh directory:
[root@net3 ~]# cd /root/.ssh
Append this machine's public key to authorized_keys:
[root@net3 .ssh]# cat id_rsa.pub >> authorized_keys
Send authorized_keys back to both net1 and net2:
[root@net3 .ssh]# scp authorized_keys net1:/root/.ssh/
The authenticity of host 'net1 (192.168.80.97)' can't be established.
RSA key fingerprint is 8f:90:02:e7:96:b1:40:90:8c:05:bf:74:63:04:37:da.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'net1,192.168.80.97' (RSA) to the list of known hosts.
root@net1's password:
authorized_ 100% 1173 1.2KB/s 00:00
[root@net3 .ssh]# (the transfer is complete at this point)
[root@net3 .ssh]# scp authorized_keys net2:/root/.ssh/
The authenticity of host 'net2 (192.168.80.98)' can't be established.
RSA key fingerprint is 8f:90:02:e7:96:b1:40:90:8c:05:bf:74:63:04:37:da.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'net2,192.168.80.98' (RSA) to the list of known hosts.
root@net2's password:
authorized_ 100% 1173 1.2KB/s 00:00
[root@net3 .ssh]# (the transfer is complete at this point)
After this relay, the authorized_keys file on every machine contains all three public keys, so each host can log in to the others without a password. Test it from the first machine:
[root@net1 .ssh]# ssh net2
(also try ssh net3)
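A quick loop to confirm passwordless login works from any one machine (a hedged check, assuming the hostnames configured above):
for h in net1 net2 net3; do ssh $h hostname; done
Each hostname should print without a password prompt.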
8. Unpack Hadoop and the JDK
1) This is where I usually install them; feel free to use a location of your own choosing:
[root@net1 ~]# cd /home
[root@net1 home]# mkdir soft
[root@net1 home]# cd soft
[root@net1 soft]# mkdir hadoop
[root@net1 soft]# mkdir JDK
2) To upload the installation packages, first install lrzsz:
[root@net1 soft]# yum install lrzsz
Then run rz and press Enter to select and upload the files:
[root@net1 hadoop]# rz
[root@net1 JDK]# rz
Unpack hadoop and the jdk:
[root@net1 hadoop]# tar -zxvf hadoop-2.8.5.tar.gz
[root@net1 JDK]# tar -zxvf jdk-8u191-linux-x64.tar.gz
9. Configure Hadoop and JDK environment variables
[root@net1 JDK]# vi /etc/profile
Go to the end of the file and add the environment variable entries (the exact lines are not reproduced here; a sketch follows).
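A minimal sketch of the lines to append, based on the install paths used in this tutorial (/home/soft/JDK/jdk1.8.0_191 and /home/soft/hadoop/hadoop-2.8.5):
export JAVA_HOME=/home/soft/JDK/jdk1.8.0_191
export HADOOP_HOME=/home/soft/hadoop/hadoop-2.8.5
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Putting $HADOOP_HOME/sbin on the PATH is what lets start-all.sh run from any directory in step 11.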
Reload the file so the changes take effect:
[root@net1 JDK]# source /etc/profile
Then run java or hadoop to check that the commands work:
[root@net1 JDK]# java
[root@net1 JDK]# hadoop
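For a cleaner confirmation (a hedged suggestion), the version flags print the installed versions directly:
[root@net1 JDK]# java -version
[root@net1 JDK]# hadoop version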
With that, the Hadoop and JDK setup on net1 is complete.
10. Configure the Hadoop configuration files
[root@net1 home]# cd /home/soft/hadoop/hadoop-2.8.5/etc/hadoop/
[root@net1 hadoop]# ll
1. Configure core-site.xml
[root@net1 hadoop]# vi core-site.xml
<!-- Add the following inside <configuration></configuration> -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://net1:8020</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/soft/tmp</value>
</property>
2. Configure hadoop-env.sh
[root@net1 hadoop]# vi hadoop-env.sh
Change JAVA_HOME to the following:
export JAVA_HOME="/home/soft/JDK/jdk1.8.0_191"
3. Configure hdfs-site.xml
[root@net1 hadoop]# vi hdfs-site.xml
<!-- Add the following inside <configuration></configuration> -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/soft/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/soft/data</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>file:///home/soft/cname</value>
</property>
<property>
<name>fs.checkpoint.edits.dir</name>
<value>file:///home/soft/cname</value>
</property>
<property>
<name>dfs.http.address</name>
<value>net1:50070</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>net2:50090</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
4. Configure yarn-site.xml
[root@net1 hadoop]# vi yarn-site.xml
<!-- Add the following inside <configuration></configuration> -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>net1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>net1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>net1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>net1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>net1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>net1:8088</value>
</property>
5. Configure slaves (this file lists the hosts that will run the DataNode and NodeManager daemons)
[root@net1 hadoop]# vi slaves
net1
net2
net3
6. Configure mapred-site.xml
[root@net1 hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@net1 hadoop]# vi mapred-site.xml
<!-- Add the following inside <configuration></configuration> -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<final>true</final>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>net1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>net1:19888</value>
</property>
For net2 and net3 there is no need to configure Hadoop machine by machine; just send the first machine's directory over.
1. Send it to net2
[root@net1 home]# scp -r soft net2:/home/
2. Send it to net3
[root@net1 home]# scp -r soft net3:/home/
3. Then set up the environment variables on net2 and net3 and you are done (a shortcut sketch follows).
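A hedged shortcut (an assumption, not shown in the original): copy the already edited /etc/profile from net1 instead of re-editing it by hand:
[root@net1 home]# scp /etc/profile net2:/etc/profile
[root@net1 home]# scp /etc/profile net3:/etc/profile
Then run source /etc/profile once on net2 and on net3.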
11. Start Hadoop
1) Format the NameNode before the first start (on later starts just start directly; re-formatting an existing cluster wipes its HDFS metadata):
[root@net1 home]# hdfs namenode -format
2) Start Hadoop:
[root@net1 home]# start-all.sh
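In Hadoop 2.x, start-all.sh still works but is deprecated; the equivalent split commands are:
start-dfs.sh
start-yarn.sh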
3) Check with jps
If the processes below appear on the three machines, the startup succeeded.
[root@net1 home]# jps
3185 NameNode
3316 DataNode
3558 ResourceManager
3975 Jps
3657 NodeManager
[root@net1 home]# ssh net2
Last login: Sun Jan 6 00:12:53 2019 from net1
[root@net2 ~]# jps
3088 SecondaryNameNode
2998 DataNode
3302 Jps
3159 NodeManager
[root@net2 ~]# ssh net3
Last login: Sun Jan 6 00:16:27 2019 from net1
[root@net3 ~]# jps
3476 Jps
3332 NodeManager
3237 DataNode
[root@net3 ~]#
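Back on net1 (any node works), one more hedged check asks HDFS itself how many DataNodes registered:
[root@net1 ~]# hdfs dfsadmin -report
The report should list 3 live datanodes.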
4) Finally, test through the web UI.
Open http://192.168.80.97:50070 in a browser (if all the processes are up but the page will not load, check whether the firewall is really off).
Note: the IP here is whatever you assigned to your master machine (mine is 192.168.80.97).
Also open http://192.168.80.97:8088 for the ResourceManager web UI.