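A fully distributed Hadoop deployment places each NameNode and DataNode on its own server and joins them into a cluster. A pseudo-distributed deployment runs all NameNode and DataNode processes on a single machine.
Pseudo-distributed deployment is an option when a company's data volume is modest (roughly one to two million rows) or when there is no budget for a cluster.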
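I. Installing Hadoop
System and runtime environment: Linux, with a JDK installed
Step 1: Download Hadoop and the JDK
Download them directly on the Linux server, or download them elsewhere and upload them to the server over FTP.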
Hadoop download URL: https://dlcdn.apache.org/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz
Baidu Netdisk link: https://pan.baidu.com/s/1n8PvXQH1bltM7lEQePrKIg
Extraction code: 77oz
JDK download URL: https://download.oracle.com/otn/java/jdk/8u321-b07/df5ad55fdd604472a86a45a217032c7d/jdk-8u321-linux-x64.tar.gz
Baidu Netdisk link: https://pan.baidu.com/s/1W072PXamWljiotCB2d1CVw
Extraction code: wmiu
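For downloading directly on the server, the following is a minimal sketch rather than part of the original steps. It assumes `wget` is installed; the helper name `fetch_if_missing` is hypothetical. It writes to a temporary `.part` file and renames it only on success, so an interrupted download is not mistaken for a complete archive.

```shell
# fetch_if_missing URL
# Hypothetical helper: download URL into the current directory
# unless a file with the same name is already there.
fetch_if_missing() {
    url="$1"
    file="${url##*/}"    # file name = last path segment of the URL
    if [ -f "$file" ]; then
        echo "$file already present, skipping download"
        return 0
    fi
    # Download to a temporary name, rename only if wget succeeds
    wget -O "$file.part" "$url" && mv "$file.part" "$file"
}

# Intended usage with the Hadoop URL from this step:
#   fetch_if_missing https://dlcdn.apache.org/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz
```

The Oracle JDK URL may require a signed-in browser session, in which case downloading it locally and uploading it over FTP is the simpler route.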
Step 2: Extract the tar.gz archives
//Switch to the root user first
[abc@Hadoop_001 ~]$ su - root
//Create the target directories for extraction first
[root@Hadoop_001 ~]# mkdir /app
[root@Hadoop_001 ~]# mkdir /app/jdk
[root@Hadoop_001 ~]# mkdir /app/hadoop
//Extract the JDK
[root@Hadoop_001 ~]# tar xvf jdk-8u321-linux-x64.tar.gz -C /app/jdk
[root@Hadoop_001 ~]# ls /app/jdk
jdk1.8.0_321
//Configure the JDK environment variables: append the three export lines below to /etc/profile
[root@Hadoop_001 ~]# vim /etc/profile
export JAVA_HOME=/app/jdk/jdk1.8.0_321
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
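//Verify the JDK installation. Run "source /etc/profile" first so the new variables take effect.
//The output captured below reports the system OpenJDK 1.7.0_45 rather than the newly installed 1.8.0_321, which suggests the profile had not yet been reloaded when it was recorded.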
[root@Hadoop_001 ~]# java -version
java version "1.7.0_45"
OpenJDK Runtime Environment (rhel-2.4.3.3.el6-x86_64 u45-b15)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
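The profile edit above can also be scripted instead of done by hand in vim. This is a minimal sketch under stated assumptions: the helper name `append_java_env` is hypothetical, and it skips the append if a `JAVA_HOME` export is already present so that running it twice does not duplicate the lines.

```shell
# append_java_env PROFILE_FILE JAVA_HOME_DIR
# Hypothetical helper: append the three JDK export lines to PROFILE_FILE,
# unless a JAVA_HOME export is already there.
append_java_env() {
    profile_file="$1"
    java_home="$2"
    if grep -q '^export JAVA_HOME=' "$profile_file" 2>/dev/null; then
        return 0    # already configured, nothing to do
    fi
    # \$ keeps the variable references literal in the written file
    cat >> "$profile_file" <<EOF
export JAVA_HOME=$java_home
export PATH=\$JAVA_HOME/bin:\$PATH
export CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar
EOF
}

# Intended usage as root, followed by a reload and a version check:
#   append_java_env /etc/profile /app/jdk/jdk1.8.0_321
#   source /etc/profile
#   java -version
```

After the reload, `java -version` should report the JDK under `/app/jdk` rather than any preinstalled one, because `$JAVA_HOME/bin` is placed at the front of `PATH`.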
//Extract Hadoop
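The original does not show the command for this step. The following is a minimal sketch by analogy with the JDK extraction, assuming the archive `hadoop-2.10.1.tar.gz` from Step 1 is in the current directory and `/app/hadoop` is the target directory created earlier. The helper name `extract_to` is hypothetical.

```shell
# extract_to ARCHIVE DEST
# Hypothetical helper: create DEST if needed, then unpack a .tar.gz into it.
extract_to() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Intended usage as root, mirroring the JDK step:
#   extract_to hadoop-2.10.1.tar.gz /app/hadoop
#   ls /app/hadoop
```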
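//Configure passwordless SSH login. Pseudo-distributed mode has only one machine, so configure it with this machine's own IP
//Generate the key pair first
[root@Hadoop_001 ~]# ssh-keygen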
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
29: