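Installing a Single-Node Pseudo-Distributed Hadoop Cluster
Operating system: Ubuntu Server 20.04
Reference documentation: http://apache.github.io/hadoop/hadoop-project-dist/hadoop-common/SingleCluster.html
System Preparation
Enable SSH
The system must support remote login over SSH.
If SSH is not installed yet, install it with the following command: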
sudo apt-get install ssh
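Before running the install command, it can help to check whether SSH is already present. The following is a minimal sketch, not part of the original guide: `has_command` and `ssh_status` are hypothetical helper names, and the check assumes that both the `ssh` client and the `sshd` server binary are reachable through PATH.

```shell
#!/bin/sh
# has_command (hypothetical helper): succeeds when "$1" is found on PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# ssh_status (hypothetical helper): prints "installed" when both the SSH
# client and the SSH server binary are found, otherwise "missing".
ssh_status() {
    if has_command ssh && has_command sshd; then
        echo "installed"
    else
        echo "missing"
    fi
}

ssh_status
```

If the script prints `missing`, the `apt-get install ssh` command above is the next step.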
Install the JDK
Reference: https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions
The official documentation states:
- Apache Hadoop 3.3 and upper supports Java 8 and Java 11 (runtime only)
- Please compile Hadoop with Java 8. Compiling Hadoop with Java 11 is not supported (see HADOOP-16795, "Java 11 compile support", status: open)
- Apache Hadoop from 3.0.x to 3.2.x now supports only Java 8
- Apache Hadoop from 2.7.x to 2.10.x support both Java 7 and 8
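According to the official documentation, JDK 8 and JDK 11 are currently supported at runtime, so installing JDK 8 is sufficient.
The steps are as follows: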
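The version list quoted above can be written down as a small lookup, which makes the choice of JDK 8 easy to verify for any Hadoop release. This is only a sketch: `supported_java` is a hypothetical helper, it assumes a numeric version string of the form MAJOR.MINOR[.PATCH], and it encodes nothing beyond the four bullet points quoted from the wiki page.

```shell
#!/bin/sh
# supported_java (hypothetical helper): prints the Java versions that the
# quoted wiki list gives for a Hadoop version such as "3.3.4".
# Assumes numeric input in the form MAJOR.MINOR[.PATCH].
supported_java() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # minor version number
    if [ "$major" -eq 3 ] && [ "$minor" -ge 3 ]; then
        echo "8 11"     # 3.3 and later: Java 8 and Java 11 (runtime only)
    elif [ "$major" -eq 3 ]; then
        echo "8"        # 3.0.x to 3.2.x: Java 8 only
    elif [ "$major" -eq 2 ] && [ "$minor" -ge 7 ] && [ "$minor" -le 10 ]; then
        echo "7 8"      # 2.7.x to 2.10.x: Java 7 and Java 8
    else
        echo "unknown"  # not covered by the quoted list
    fi
}

supported_java 3.3.4
```

Java 8 appears in every row of the list, which is why it is the safe choice here.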
# Update the package index and upgrade installed packages
sudo apt-get update
sudo apt-get upgrade
# Install JDK 8
sudo apt-get install openjdk-8-jdk
# Verify that the installation succeeded
java -version
# Output
openjdk version "1.8.0_362"
OpenJDK Runtime Environment (build 1.8.0_362-8u372-ga~us1-0ubuntu1~22.04-b09)
OpenJDK 64-Bit Server VM (build 25.362-b09, mixed mode)
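The version check can also be scripted so that it reports only the major Java version. The sketch below is an illustration rather than part of the original steps: `java_major` is a hypothetical helper, and it assumes the first line of `java -version` has the form shown in the output above (a quoted version string after the word `version`). Note that `java -version` writes to standard error, hence the `2>&1`.

```shell
#!/bin/sh
# java_major (hypothetical helper): extracts the major Java version from a
# line such as: openjdk version "1.8.0_362"
java_major() {
    # Pull out the text between the double quotes after "version".
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) ver=${ver#1.}; echo "${ver%%.*}" ;;  # legacy scheme: 1.8.x means 8
        *)   echo "${ver%%.*}" ;;                 # newer scheme: 11.0.x means 11
    esac
}

# Only query the JVM when a java binary is actually on PATH.
if command -v java >/dev/null 2>&1; then
    java_major "$(java -version 2>&1 | head -n 1)"
fi
```

For the output shown above, the helper would report `8`.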
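View the Java installation details
# Find the location of the java command
which java
# Output: /usr/bin/java
# Show where that path points (ll is Ubuntu's default alias for ls -alF)
ll /usr/bin/java
# Output: ... /usr/bin/java -> /etc/alternatives/java*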
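Because `/usr/bin/java` points to `/etc/alternatives/java`, which is itself a symbolic link, one more step is needed to reach the real binary. The sketch below follows the whole chain with `readlink -f` (GNU coreutils) and derives a candidate installation directory from the result. `java_home_from_binary` is a hypothetical helper; the assumption that the installation root is the path with a trailing `/bin/java` (and an optional `/jre`) removed should be checked against the actual system.

```shell
#!/bin/sh
# java_home_from_binary (hypothetical helper): strips a trailing /bin/java and
# then an optional trailing /jre from the resolved path of the java binary.
java_home_from_binary() {
    home=${1%/bin/java}
    home=${home%/jre}
    echo "$home"
}

# Only resolve the link chain when a java binary is actually on PATH.
if command -v java >/dev/null 2>&1; then
    # readlink -f follows every symbolic link until it reaches a real file.
    real_java=$(readlink -f "$(command -v java)")
    echo "java binary: $real_java"
    echo "candidate installation directory: $(java_home_from_binary "$real_java")"
fi
```

For a made-up path such as `/opt/example-jdk/jre/bin/java`, the helper prints `/opt/example-jdk`.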
Set JAVA_HOME
# List the Java versions available on the system
update-alternatives --config java
# Output:
# There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-8-ope