Deploying pseudo-distributed Hadoop (single node) on CentOS 7
1. Preparation
1. Download the Hadoop tarball (download links for the individual Hadoop versions are available).
Pick the version you need; this guide uses hadoop-2.8.2.tar.gz.
2. Install a matching JDK.
See "CentOS 7 JDK installation steps".
3. Turn off the firewall.
See "CentOS 7 starting and stopping the firewall".
2. Installing Hadoop
- Upload the tarball to the server
- Extract Hadoop
[root@syq-jtj-jzjxyth-yycx3 software]# tar -zxvf hadoop-2.8.2.tar.gz
- Move the extracted directory to /opt/ and rename it to hadoop
[root@syq-jtj-jzjxyth-yycx3 software]# mv hadoop-2.8.2 /opt/hadoop
[root@syq-jtj-jzjxyth-yycx3 software]#
- Configure the environment variables
[root@syq-jtj-jzjxyth-yycx3 software]# vim /etc/profile
Scroll to the end of the file and add the lines below, adjusting the path to match your own layout:
#Hadoop
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
Source the file so the changes take effect:
[root@syq-jtj-jzjxyth-yycx3 software]# source /etc/profile
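The profile edit above can also be scripted so that re-running it does not duplicate the entries. A minimal sketch, assuming HADOOP_HOME is /opt/hadoop as in this guide; the `add_hadoop_env` function name and its optional file argument are illustrative only:

```shell
# Append the Hadoop environment variables to a profile file only if
# they are not already there, so the script is safe to re-run.
# add_hadoop_env is a hypothetical helper; the target file defaults
# to /etc/profile but can be overridden for a dry run.
add_hadoop_env() {
    profile=${1:-/etc/profile}
    if ! grep -q 'HADOOP_HOME=/opt/hadoop' "$profile" 2>/dev/null; then
        cat >> "$profile" <<'EOF'
#Hadoop
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
EOF
    fi
}
```

Remember to `source` the file afterwards, exactly as above.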
3. Configuring hadoop-env.sh
1. Change into the /opt/hadoop/etc/hadoop/ directory:
[root@syq-jtj-jzjxyth-yycx3 opt]# cd /opt/hadoop/etc/hadoop/
[root@syq-jtj-jzjxyth-yycx3 hadoop]# ls
capacity-scheduler.xml hadoop-env.sh httpfs-env.sh kms-env.sh mapred-env.sh ssl-server.xml.example
configuration.xsl hadoop-metrics2.properties httpfs-log4j.properties kms-log4j.properties mapred-queues.xml.template yarn-env.cmd
container-executor.cfg hadoop-metrics.properties httpfs-signature.secret kms-site.xml mapred-site.xml.template yarn-env.sh
core-site.xml hadoop-policy.xml httpfs-site.xml log4j.properties slaves yarn-site.xml
hadoop-env.cmd hdfs-site.xml kms-acls.xml mapred-env.cmd ssl-client.xml.example
2. Open hadoop-env.sh with vim:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# vim hadoop-env.sh
Find the following lines:
# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}
Replace ${JAVA_HOME} with the JDK path on your system, for example:
export JAVA_HOME=/usr/jdk1.8
Save and quit.
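If you are unsure where your JDK lives, the value can usually be derived from the path of the java binary itself. A small sketch; `java_home_from_binary` is a hypothetical helper, and /usr/jdk1.8 is just the example path used in this guide:

```shell
# Derive a JAVA_HOME value from the full path of the java binary,
# assuming the usual <JAVA_HOME>/bin/java layout: strip the last two
# path components.
java_home_from_binary() {
    dirname "$(dirname "$1")"
}

# On a live system, resolve symlinks first, e.g.:
#   java_home_from_binary "$(readlink -f "$(command -v java)")"
java_home_from_binary /usr/jdk1.8/bin/java   # prints /usr/jdk1.8
```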
4. Configure core-site.xml, adding the following:
<configuration>
    <!-- Address of the NameNode, the HDFS master -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://your-hostname-or-ip:9000</value>
    </property>
    <!-- Directory for files Hadoop creates at runtime; create this path yourself beforehand -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
    </property>
</configuration>
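As noted in the comment, the hadoop.tmp.dir directory is not created automatically. A sketch for creating it up front, assuming the /usr/hadoop/tmp path from the config above; `make_hadoop_tmp` is a hypothetical helper:

```shell
# Create the directory that hadoop.tmp.dir points to before formatting
# the NameNode; mkdir -p is a no-op if it already exists.
make_hadoop_tmp() {
    mkdir -p "${1:-/usr/hadoop/tmp}"
}
```

Run `make_hadoop_tmp` (as root) before the format step in section 7.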
5. Configure hdfs-site.xml, adding the following:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value> <!-- HDFS replication factor; 1 in pseudo-distributed mode -->
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value> <!-- HDFS permission checking; false lets any user operate on HDFS files -->
    </property>
</configuration>
6. Passwordless SSH login
Run ssh-keygen -t rsa and press Enter through all the prompts:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# ssh-keygen -t rsa
This generates the key pair under /root/.ssh/: id_rsa (private key) and id_rsa.pub (public key).
[root@syq-jtj-jzjxyth-yycx3 /]# cd root/.ssh/
[root@syq-jtj-jzjxyth-yycx3 .ssh]# ls
id_rsa id_rsa.pub
[root@syq-jtj-jzjxyth-yycx3 .ssh]#
Run ssh-copy-id <your-hostname> (localhost or the VM's hostname also works). The first run asks for your password once. Afterwards, ls shows a new file, authorized_keys, containing the public key; from then on, ssh to this host no longer needs a password.
[root@syq-jtj-jzjxyth-yycx3 .ssh]# ssh-copy-id <your-hostname>
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'syq-jtj-jzjxyth-yycx3 (10.75.8.44)' can't be established.
ECDSA key fingerprint is SHA256:8oEESzJaJeJiywAraYPiYGKz8lBJwK8jTuQHsqgAJa4.
ECDSA key fingerprint is MD5:78:77:ac:1f:69:bf:29:5f:33:68:e0:46:5b:2e:f1:bc.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@syq-jtj-jzjxyth-yycx3's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'syq-jtj-jzjxyth-yycx3'"
and check to make sure that only the key(s) you wanted were added.
Verify that passwordless login works:
[root@syq-jtj-jzjxyth-yycx3 ~]# ssh syq-jtj-jzjxyth-yycx3
Last login: Sat Apr 2 05:06:09 2022 from localhost
[root@syq-jtj-jzjxyth-yycx3 ~]#
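The login above can also be checked non-interactively. A sketch using BatchMode, which makes ssh fail instead of prompting for a password; `check_passwordless` is a hypothetical helper:

```shell
# Return 0 only if key-based (passwordless) SSH to the host works;
# BatchMode=yes forbids password prompts, so a prompt becomes a failure.
check_passwordless() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null; then
        echo "passwordless SSH to $1: OK"
    else
        echo "passwordless SSH to $1: FAILED"
        return 1
    fi
}
```

For example: `check_passwordless syq-jtj-jzjxyth-yycx3`.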
7. Starting and stopping HDFS
HDFS must be formatted before its first start; later starts skip this step:
[root@syq-jtj-jzjxyth-yycx3 ~]# cd /opt/hadoop/
Format the NameNode:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# ./bin/hdfs namenode -format
Start HDFS:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# ./sbin/start-dfs.sh
Starting namenodes on [syq-jtj-jzjxyth-yycx3]
syq-jtj-jzjxyth-yycx3: starting namenode, logging to /opt/hadoop/logs/hadoop-root-namenode-syq-jtj-jzjxyth-yycx3.out
localhost: starting datanode, logging to /opt/hadoop/logs/hadoop-root-datanode-syq-jtj-jzjxyth-yycx3.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:8oEESzJaJeJiywAraYPiYGKz8lBJwK8jTuQHsqgAJa4.
ECDSA key fingerprint is MD5:78:77:ac:1f:69:bf:29:5f:33:68:e0:46:5b:2e:f1:bc.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-root-secondarynamenode-syq-jtj-jzjxyth-yycx3.out
Check that the daemons are running:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# jps
22130 NameNode
22468 DataNode
24357 Jps
63398 nacos-server.jar
23864 SecondaryNameNode
[root@syq-jtj-jzjxyth-yycx3 hadoop]#
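A quick scripted version of this check verifies that all expected HDFS daemons appear in the jps output. `check_daemons` is a hypothetical helper that reads a jps-style listing on stdin:

```shell
# Report each expected HDFS daemon as OK or MISSING based on a
# jps-style listing read from stdin; exits non-zero if any is missing.
check_daemons() {
    listing=$(cat)
    status=0
    for d in NameNode DataNode SecondaryNameNode; do
        # Match " <name>" at end of a line so NameNode does not
        # false-match inside SecondaryNameNode.
        if printf '%s\n' "$listing" | grep -q " ${d}\$"; then
            echo "$d OK"
        else
            echo "$d MISSING"
            status=1
        fi
    done
    return $status
}
```

Usage on a live cluster: `jps | check_daemons`.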
Open http://<server-address>:50070 in a browser; be sure the firewall is off, or the page may be unreachable. If the NameNode overview page loads, the deployment succeeded.
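If no browser can reach the machine, the web UIs can be probed from the shell instead. A sketch assuming the Hadoop 2.x default ports (50070 for the NameNode UI, 8088 for the YARN ResourceManager UI configured later) and that curl is installed; `ui_status` is a hypothetical helper:

```shell
# Print the HTTP status code of a web UI; 200 means it is up,
# 000 usually means the connection was refused or timed out.
ui_status() {
    curl -s -o /dev/null -m 5 -w '%{http_code}' "http://$1:$2/"
}

# Usage:
#   ui_status localhost 50070   # NameNode UI
#   ui_status localhost 8088    # YARN ResourceManager UI
```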
8. Configuring YARN
Configure /opt/hadoop/etc/hadoop/mapred-site.xml.
Note that by default Hadoop ships only a mapred-site.xml.template file.
To enable YARN, create mapred-site.xml from that template;
to run without YARN, remove mapred-site.xml again.
Create mapred-site.xml from the template:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# cp mapred-site.xml.template mapred-site.xml
Then edit mapred-site.xml and add the following:
<configuration>
    <property>
        <!-- Run MapReduce on YARN -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Configure /opt/hadoop/etc/hadoop/yarn-site.xml, adding the following:
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <!-- Address of the ResourceManager, the YARN master -->
        <name>yarn.resourcemanager.hostname</name>
        <value>your-hostname-or-ip</value>
    </property>
    <property>
        <!-- Auxiliary service NodeManagers run for the MapReduce shuffle -->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
9. Starting and stopping YARN
Start YARN:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/logs/yarn-root-resourcemanager-syq-jtj-jzjxyth-yycx3.out
localhost: starting nodemanager, logging to /opt/hadoop/logs/yarn-root-nodemanager-syq-jtj-jzjxyth-yycx3.out
jps now shows two new processes, ResourceManager and NodeManager, which means YARN started successfully:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# jps
48929 ResourceManager
22130 NameNode
22468 DataNode
63398 nacos-server.jar
49094 NodeManager
49718 Jps
23864 SecondaryNameNode
[root@syq-jtj-jzjxyth-yycx3 hadoop]#
Stop YARN:
[root@syq-jtj-jzjxyth-yycx3 hadoop]# ./sbin/stop-yarn.sh
Check the YARN web UI at http://<server-address>:8088.
This completes the single-node, pseudo-distributed Hadoop deployment.