I. Environment Setup Requirements
This guide builds a single-node Hadoop environment on CentOS 7 with JDK 1.8 and Hadoop 2.7.7. Prepare a 64-bit CentOS 7 image, then install a virtual machine in VMware.
1. Create the CentOS 7 system
Reference article: [link]
2. Configure a static IP and log in remotely with FinalShell
Reference article: [link]
3. Disable the firewall
systemctl status firewalld.service    # check the firewall status
systemctl stop firewalld.service      # stop the firewall
systemctl disable firewalld.service   # disable firewall autostart on boot
II. Environment Configuration
1. Change the hostname
Via command
Run hostnamectl set-hostname <hostname> to change the hostname; a reboot makes it take effect everywhere. (Do not wrap the name in curly quotes copied from a web page; they would become part of the hostname.)
[root@localhost ~]# hostnamectl set-hostname hadoop
[root@localhost ~]# reboot
Via file
Edit /etc/hostname, delete the original contents, write the hostname hadoop, then save and exit.
[root@localhost ~]# vim /etc/hostname
[root@hadoop ~]# cat /etc/hostname
hadoop
2. IP mapping
Edit the hosts file
vim /etc/hosts
Append the IP-to-hostname mapping: 192.168.3.100 hadoop
[root@hadoop ~]# vim /etc/hosts
[root@hadoop ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.3.100 hadoop
Restart the network
systemctl restart network
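The hosts edit above can be scripted so that re-running the setup never duplicates the line. The helper name add_mapping is my own; the IP and hostname are the ones assumed throughout this guide.

```shell
# add_mapping FILE IP HOST : append "IP HOST" to FILE only if no such line exists yet.
add_mapping() {
  local file=$1 ip=$2 host=$3
  grep -qE "^${ip}[[:space:]]+${host}([[:space:]]|$)" "$file" || echo "${ip} ${host}" >> "$file"
}

# Demonstrated on a scratch file; on the real machine pass /etc/hosts instead.
tmp=$(mktemp)
add_mapping "$tmp" 192.168.3.100 hadoop
add_mapping "$tmp" 192.168.3.100 hadoop   # second call is a no-op
cat "$tmp"                                # the mapping appears exactly once
rm -f "$tmp"
```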
3. Passwordless SSH login
- Press Enter at every prompt
[root@hadoop ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:mVpnpT1Yg107cJ4YX8Jx+8zDgoukRFSK6QVtrbCkjJM root@hadoop
The key's randomart image is:
+---[RSA 2048]----+
| ...o. o.=.o|
| o=o.. o O.*.|
| + oo++. . * *. |
| E o..o. o *...+.|
| . . .S.=.o. o+|
| .ooo. ... .|
| .. . . |
| |
| |
+----[SHA256]-----+
- Type yes here, then enter the root password
[root@hadoop ~]# ssh-copy-id localhost
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:Sd/GXoJN+lri1JsFgCOJ0msEDLxUn0weUKSxetiRsT4.
ECDSA key fingerprint is MD5:bb:a4:1f:43:e2:0f:00:6d:c0:9d:55:07:e7:80:0f:9f.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@localhost's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'localhost'"
and check to make sure that only the key(s) you wanted were added.
- Verify that the hostname hadoop works in place of the IP address
[root@hadoop ~]# ssh hadoop
The authenticity of host 'hadoop (192.168.3.100)' can't be established.
ECDSA key fingerprint is SHA256:Sd/GXoJN+lri1JsFgCOJ0msEDLxUn0weUKSxetiRsT4.
ECDSA key fingerprint is MD5:bb:a4:1f:43:e2:0f:00:6d:c0:9d:55:07:e7:80:0f:9f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop,192.168.3.100' (ECDSA) to the list of known hosts.
Last login: Sun Nov 19 22:46:53 2023 from 192.168.3.1
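If you rerun the setup, ssh-copy-id can be skipped when the key is already installed. The helper name key_installed is my own; it compares the base64 key body (field 2 of the public-key file) against authorized_keys.

```shell
# key_installed PUBKEY AUTH_KEYS : succeed if the public key is already present
# in the authorized_keys file (compares the base64 key body, field 2).
key_installed() {
  local body
  body=$(awk '{print $2}' "$1")
  [ -f "$2" ] && grep -qF "$body" "$2"
}

# Typical use on this machine:
if key_installed ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys; then
  echo "key already installed -- ssh-copy-id not needed"
else
  echo "run ssh-copy-id localhost"
fi
```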
III. Environment Variable Configuration
1. Upload the JDK and Hadoop archives
- Upload the two archives with your remote tool (e.g. FinalShell's file-transfer panel)
- If you do not have them, search for the downloads online
2. Extract the JDK and Hadoop archives and configure environment variables
- Extract both archives into /opt/
[root@hadoop hadoop]# ls
hadoop-2.7.7.tar.gz jdk-8u212-linux-x64.tar.gz
tar -zxvf jdk-8u212-linux-x64.tar.gz -C /opt/
tar -zxvf hadoop-2.7.7.tar.gz -C /opt/
- Rename the directories
[root@hadoop opt]# ls
hadoop-2.7.7 jdk1.8.0_212 rh
[root@hadoop opt]# mv /opt/hadoop-2.7.7/ hadoop
[root@hadoop opt]# mv /opt/jdk1.8.0_212/ jdk
[root@hadoop opt]# ls
hadoop  jdk  rh
- Edit /etc/profile
vim /etc/profile
- Append the JDK and Hadoop environment variables at the end of the file
export JAVA_HOME=/opt/jdk
export HADOOP_HOME=/opt/hadoop
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$PATH
- Reload the file
source /etc/profile
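A quick sanity check that the exports took effect can be scripted; the helper name check_env is my own, and the paths match the exports above.

```shell
# check_env : verify that JAVA_HOME and HADOOP_HOME are set and that both bin
# directories are on PATH; prints "env ok" or names the first problem found.
check_env() {
  [ -n "$JAVA_HOME" ]   || { echo "JAVA_HOME not set"; return 1; }
  [ -n "$HADOOP_HOME" ] || { echo "HADOOP_HOME not set"; return 1; }
  case ":$PATH:" in
    *":$JAVA_HOME/bin:"*) ;;
    *) echo "JAVA_HOME/bin not on PATH"; return 1 ;;
  esac
  case ":$PATH:" in
    *":$HADOOP_HOME/bin:"*) ;;
    *) echo "HADOOP_HOME/bin not on PATH"; return 1 ;;
  esac
  echo "env ok"
}

# Shown here with the values from /etc/profile passed explicitly; after
# "source /etc/profile" on the real machine, a bare "check_env" does the same.
JAVA_HOME=/opt/jdk HADOOP_HOME=/opt/hadoop \
  PATH="/opt/hadoop/bin:/opt/hadoop/sbin:/opt/jdk/bin:$PATH" check_env
```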
IV. Hadoop Configuration Files
The files are located in /opt/hadoop/etc/hadoop; change into that directory: cd /opt/hadoop/etc/hadoop
[root@hadoop hadoop]# ls
capacity-scheduler.xml hadoop-policy.xml kms-log4j.properties ssl-client.xml.example
configuration.xsl hdfs-site.xml kms-site.xml ssl-server.xml.example
container-executor.cfg httpfs-env.sh log4j.properties yarn-env.cmd
core-site.xml httpfs-log4j.properties mapred-env.cmd yarn-env.sh
hadoop-env.cmd httpfs-signature.secret mapred-env.sh yarn-site.xml
hadoop-env.sh httpfs-site.xml mapred-queues.xml.template
hadoop-metrics2.properties kms-acls.xml mapred-site.xml.template
hadoop-metrics.properties kms-env.sh slaves
1. Configure hadoop-env.sh
vim hadoop-env.sh
Comment out the original JAVA_HOME line and rewrite it as:
export JAVA_HOME=/opt/jdk
2. Configure core-site.xml
vim core-site.xml
Add the following property inside the existing <configuration> element:
<property>
<!-- Hostname and port of the NameNode -->
<name>fs.defaultFS</name>
<value>hdfs://hadoop:8020</value>
</property>
3. Configure hdfs-site.xml
vim hdfs-site.xml
Create the storage directories first:
mkdir -p /opt/hd_space/hdfs/name
mkdir -p /opt/hd_space/hdfs/data
<property>
<!-- Storage path for HDFS metadata (NameNode) files -->
<name>dfs.namenode.name.dir</name>
<value>/opt/hd_space/hdfs/name</value>
</property>
<property>
<!-- Storage path for HDFS data (DataNode) files -->
<name>dfs.datanode.data.dir</name>
<value>/opt/hd_space/hdfs/data</value>
</property>
<property>
<!-- Number of HDFS block replicas -->
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<!-- Disable HDFS permission checks so operations by other users are not rejected -->
<name>dfs.permissions</name>
<value>false</value>
</property>
4. Configure mapred-site.xml
Only the template ships by default; create the file from it first:
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
5. Configure yarn-site.xml
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
6. Configure slaves
Change localhost to hadoop
[root@hadoop hadoop]# vim slaves
[root@hadoop hadoop]# cat slaves
hadoop
V. Format HDFS
- Change into /opt/hadoop/bin/
- Run:
./hdfs namenode -format
[root@hadoop hadoop]# cd /opt/hadoop/bin/
[root@hadoop bin]# ls
container-executor hadoop hadoop.cmd hdfs hdfs.cmd mapred mapred.cmd rcc test-container-executor yarn yarn.cmd
[root@hadoop bin]# ./hdfs namenode -format
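A successful format writes a current/VERSION file with a clusterID into the metadata directory configured in dfs.namenode.name.dir above. The helper name check_format is my own; this is a sketch, not part of the Hadoop tooling.

```shell
# check_format NAME_DIR : succeed if NAME_DIR holds a freshly formatted NameNode
# image, i.e. a current/VERSION file containing a clusterID line.
check_format() {
  grep -q '^clusterID=' "$1/current/VERSION" 2>/dev/null
}

check_format /opt/hd_space/hdfs/name && echo "format ok" || echo "not formatted yet"
```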
VI. Startup
1. Change into /opt/hadoop/sbin
[root@hadoop bin]# cd /opt/hadoop/sbin/
[root@hadoop sbin]# ls
distribute-exclude.sh httpfs.sh start-all.cmd start-secure-dns.sh stop-balancer.sh stop-yarn.sh
hadoop-daemon.sh kms.sh start-all.sh start-yarn.cmd stop-dfs.cmd yarn-daemon.sh
hadoop-daemons.sh mr-jobhistory-daemon.sh start-balancer.sh start-yarn.sh stop-dfs.sh yarn-daemons.sh
hdfs-config.cmd refresh-namenodes.sh start-dfs.cmd stop-all.cmd stop-secure-dns.sh
hdfs-config.sh slaves.sh start-dfs.sh stop-all.sh stop-yarn.cmd
2. Start Hadoop
Run the startup script: ./start-all.sh
- Check the started processes with the jps command; if NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager all appear, the Hadoop platform has started successfully.
[root@hadoop sbin]# jps
1120 Jps
116946 ResourceManager
115910 NameNode
116104 DataNode
116682 SecondaryNameNode
117070 NodeManager
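The jps check above can be automated; check_daemons is a hypothetical helper of mine that names any of the five expected daemons missing from the jps output.

```shell
# check_daemons JPS_OUTPUT : print every expected daemon missing from the given
# jps output, or "all daemons running" if all five processes are present.
check_daemons() {
  local missing=0 d
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # match " NameNode" at end of line so SecondaryNameNode is not a false hit
    if ! printf '%s\n' "$1" | grep -q " ${d}\$"; then
      echo "missing: $d"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "all daemons running"
  fi
}

check_daemons "$(jps 2>/dev/null)"
```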
- Verify in a browser: the HDFS web UI is at http://hadoop:50070 and the YARN web UI at http://hadoop:8088
VII. Summary
1. Check every configuration file after editing; characters can be lost when copying and pasting.
2. Spell the Hadoop configuration file names exactly; a misspelled name makes vim silently create a new, useless file.
3. Do not overlook any detail in the commands and configuration.