Installing Hadoop in Pseudo-Distributed Mode on Linux
1. Update the package list
hadoop@jeff:~$ sudo apt update
2. Install the vim editor
hadoop@jeff:/home/jeff$ sudo apt install -y vim
3. Install the SSH service and configure passwordless login
Install the SSH service
hadoop@jeff:/home/jeff$ sudo apt install -y openssh-server openssh-client
Configure passwordless login
hadoop@jeff:~$ cd ~/.ssh/
hadoop@jeff:~/.ssh$ ssh-keygen -t rsa
hadoop@jeff:~/.ssh$ cat id_rsa.pub >> authorized_keys
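The commands above assume that ~/.ssh already exists, and sshd with its default StrictModes setting may ignore an authorized_keys file whose permissions are too open. A minimal sketch of a helper that prepares the directory first; the name prepare_ssh_dir is hypothetical and not part of OpenSSH:

```shell
# prepare_ssh_dir DIR: hypothetical helper, not an OpenSSH command.
# Creates DIR and DIR/authorized_keys if missing and restricts both to the owner.
prepare_ssh_dir() {
  dir="$1"
  mkdir -p "$dir" || return 1
  chmod 700 "$dir"                    # only the owner may enter the directory
  touch "$dir/authorized_keys"        # keeps existing content, creates the file if absent
  chmod 600 "$dir/authorized_keys"    # only the owner may read or write the key list
}
```

It could be called as prepare_ssh_dir ~/.ssh before generating the key. Afterwards, ssh localhost should log in without asking for a password.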
4. Install the JDK
Download the JDK
hadoop@jeff:~$ sudo wget https://download.oracle.com/java/18/latest/jdk-18_linux-x64_bin.tar.gz
Extract the JDK into /usr/local/lib
hadoop@jeff:~$ sudo tar -zxvf jdk-18_linux-x64_bin.tar.gz -C /usr/local/lib/
Rename the JDK directory
hadoop@jeff:/usr/local/lib$ sudo mv jdk-18 jdk
Configure the environment variables
hadoop@jeff:/usr/local/lib$ sudo vim /etc/profile
Append the following at the end of the file
JAVA_HOME=/usr/local/lib/jdk
CLASSPATH=$JAVA_HOME/lib
PATH=$PATH:$JAVA_HOME/bin
export PATH JAVA_HOME CLASSPATH
Run source /etc/profile to apply the changes
hadoop@jeff:/usr/local/lib$ source /etc/profile
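Shell assignments must not have spaces around the = sign, and every time /etc/profile is sourced in the same shell the PATH line appends the same directory again. A small sketch of one way to avoid the duplicates; append_path is a hypothetical helper name, not something the JDK or Hadoop provides:

```shell
# append_path DIR: hypothetical helper.
# Appends DIR to PATH only when it is not already one of the entries.
append_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already listed, leave PATH unchanged
    *) PATH="$PATH:$1" ;;     # not listed yet, append it
  esac
}

JAVA_HOME=/usr/local/lib/jdk
append_path "$JAVA_HOME/bin"
export JAVA_HOME PATH
```

With this in /etc/profile, sourcing the file repeatedly leaves PATH with a single copy of the JDK bin directory.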
Check the Java version
hadoop@jeff:/usr/local/lib$ java -version
java version "18" 2022-03-22
Java(TM) SE Runtime Environment (build 18+36-2087)
Java HotSpot(TM) 64-Bit Server VM (build 18+36-2087, mixed mode, sharing)
5. Install Hadoop
Download Hadoop
hadoop@jeff:~$ sudo wget https://dlcdn.apache.org/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz
Extract Hadoop into /usr/local
hadoop@jeff:~$ sudo tar -zxvf hadoop-3.2.3.tar.gz -C /usr/local/
Rename the Hadoop directory
hadoop@jeff:/usr/local$ sudo mv hadoop-3.2.3 hadoop
Change the owner of the Hadoop directory to the hadoop user
hadoop@jeff:/usr/local/hadoop$ sudo chown -R hadoop /usr/local/hadoop/
Add the environment variables
hadoop@jeff:/usr/local/hadoop$ sudo vim /etc/profile
Append the following at the end of the file
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
Run source /etc/profile to apply the changes
hadoop@jeff:/usr/local/hadoop$ source /etc/profile
Check the Hadoop version
hadoop@jeff:/usr/local/hadoop$ hadoop version
Hadoop 3.2.3
Source code repository https://github.com/apache/hadoop -r abe5358143720085498613d399be3bbf01e0f131
Compiled by ubuntu on 2022-03-20T01:18Z
Compiled with protoc 2.5.0
From source with checksum 39bb14faec14b3aa25388a6d7c345fe8
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-3.2.3.jar
6. Configure Hadoop for pseudo-distributed mode
Configure core-site.xml as follows
hadoop@jeff:/usr/local/hadoop/etc/hadoop$ sudo vim core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Configure hdfs-site.xml as follows
hadoop@jeff:/usr/local/hadoop/etc/hadoop$ sudo vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50070</value>
  </property>
</configuration>
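A mistyped property name is silently ignored by Hadoop, so it can help to read the values back before starting anything. The following is a rough sketch only: get_property is a hypothetical helper, and it assumes each tag sits on its own line as in the two files above. It is not a general XML parser.

```shell
# get_property FILE NAME: hypothetical helper, not a Hadoop command.
# Prints the <value> that follows <name>NAME</name> in a Hadoop-style XML file.
get_property() {
  awk -v key="$2" '
    /<name>/ {
      line = $0
      sub(/.*<name>[ \t]*/, "", line)      # strip everything up to the name text
      sub(/[ \t]*<\/name>.*/, "", line)    # strip the closing tag and the rest
      found = (line == key)                # remember whether this is the wanted key
    }
    found && /<value>/ {
      line = $0
      sub(/.*<value>[ \t]*/, "", line)
      sub(/[ \t]*<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$1"
}
```

For example, get_property /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS should print hdfs://localhost:9000 for the file shown above. Hadoop's own hdfs getconf -confKey fs.defaultFS reports the value the installation actually resolves.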
Format the NameNode
hadoop@jeff:/usr/local/hadoop$ ./bin/hdfs namenode -format
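Formatting is meant to be done once. Re-formatting a NameNode directory that is already in use discards the existing file system metadata. A minimal guard, assuming the name directory is the one configured in dfs.namenode.name.dir above and that a formatted directory contains a current/VERSION file; both function names are hypothetical:

```shell
# namenode_is_formatted DIR: hypothetical helper.
# Succeeds if DIR already holds NameNode metadata (a current/VERSION file).
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# format_if_needed DIR: hypothetical wrapper.
# Runs the format command only when DIR has not been formatted yet.
format_if_needed() {
  if namenode_is_formatted "$1"; then
    echo "NameNode directory $1 is already formatted; skipping"
  else
    /usr/local/hadoop/bin/hdfs namenode -format
  fi
}
```

With the configuration from this guide it would be invoked as format_if_needed /usr/local/hadoop/tmp/dfs/name.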
7. Start Hadoop
hadoop@jeff:/usr/local/hadoop$ ./sbin/start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [jeff]
8. Check the running processes
hadoop@jeff:/usr/local/hadoop$ jps
106993 DataNode
107211 SecondaryNameNode
106813 NameNode
107342 Jps
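The jps listing above shows the three HDFS daemons expected after start-dfs.sh. As a sketch, the check can be scripted; missing_daemons is a hypothetical helper that compares whole process names, because NameNode is also a substring of SecondaryNameNode and a plain grep would match both:

```shell
# missing_daemons: hypothetical helper.
# Reads jps output on stdin and prints each expected HDFS daemon that is absent.
missing_daemons() {
  awk '
    { seen[$2] = 1 }                       # second column is the process name
    END {
      n = split("NameNode DataNode SecondaryNameNode", want, " ")
      for (i = 1; i <= n; i++)
        if (!(want[i] in seen)) print want[i]
    }
  '
}
```

Running jps | missing_daemons prints nothing when all three daemons are up, and otherwise names the ones that did not start.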
9. Stop Hadoop
hadoop@jeff:/usr/local/hadoop$ ./sbin/stop-dfs.sh
Stopping namenodes on [localhost]
Stopping datanodes
Stopping secondary namenodes [jeff]