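Base environment
Downloading Hadoop
1. Official site: download hadoop-2.7.1.tar.gz from the official Hadoop website
2. Older releases (the official archive): the archive download page
3. Tsinghua University Open Source Software Mirror (faster, but carries only recent releases): Tsinghua University Open Source Software Mirror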
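As a rough sketch of fetching an older release from the archive, the helper below builds the tarball URL for a given version. The function name `hadoop_archive_url` is made up for illustration, and the URL layout is assumed from the public Apache archive's directory structure, so check it in a browser before relying on it.

```shell
# Hypothetical helper: print the assumed Apache archive URL for a Hadoop release.
hadoop_archive_url() {
  printf 'https://archive.apache.org/dist/hadoop/common/hadoop-%s/hadoop-%s.tar.gz\n' "$1" "$1"
}

# Show the URL for the release used in this post.
hadoop_archive_url 2.7.1
```

The printed URL can then be passed to a downloader, for example `wget "$(hadoop_archive_url 2.7.1)"`.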
Downloading the JDK
mkdir /usr/apps
# Extract the archives
tar -xf jdk-8u201-linux-x64.tar.gz -C /usr/apps/
tar -xf hadoop-2.7.1.tar.gz -C /usr/apps/
# Add the Java environment variables to /etc/profile
export JAVA_HOME=/usr/apps/jdk1.8.0_201
export PATH=$PATH:$JAVA_HOME/bin
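# Add the Hadoop environment variables to /etc/profile
export HADOOP_HOME=/usr/apps/hadoop-2.7.1
export PATH=$PATH:$HADOOP_HOME/bin
# Reload the profile so the variables take effect in the current shell
source /etc/profile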
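Editing /etc/profile by hand makes it easy to add the same export twice. One possible way to script it is sketched below. `append_once` and the `PROFILE` variable are names invented for this example, and `PROFILE` defaults to a temporary file so the sketch can be tried without touching the real profile.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already there.
append_once() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# PROFILE is an assumption for safe experimentation; set PROFILE=/etc/profile
# only once the result looks right.
PROFILE="${PROFILE:-$(mktemp)}"

append_once "$PROFILE" 'export JAVA_HOME=/usr/apps/jdk1.8.0_201'
append_once "$PROFILE" 'export PATH=$PATH:$JAVA_HOME/bin'
append_once "$PROFILE" 'export HADOOP_HOME=/usr/apps/hadoop-2.7.1'
append_once "$PROFILE" 'export PATH=$PATH:$HADOOP_HOME/bin'
```

Running the script a second time leaves the file unchanged, because every line is already present.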
# Set JAVA_HOME in Hadoop's environment script, /usr/apps/hadoop-2.7.1/etc/hadoop/hadoop-env.sh
cd /usr/apps/hadoop-2.7.1
vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/apps/jdk1.8.0_201
# Generate an SSH key pair and enable passwordless login to localhost
ssh-keygen -t rsa
ssh-copy-id localhost
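The start scripts used later in this post connect over SSH to `master`, `localhost` and `0.0.0.0`, so it may be worth checking each of them before starting the daemons. The sketch below is one way to do that; `can_ssh_without_password` is a made-up name, and the host list is taken from the startup log further down rather than from any Hadoop setting.

```shell
# Hypothetical helper: succeed only if SSH login works without any interactive prompt.
# BatchMode=yes makes ssh fail instead of asking for a password; note that a host
# whose key is not yet in known_hosts also fails this check.
can_ssh_without_password() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

for host in localhost master 0.0.0.0; do
  if can_ssh_without_password "$host"; then
    echo "$host: passwordless SSH works"
  else
    echo "$host: needs setup (try: ssh-copy-id $host)"
  fi
done
```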
Configure the four Hadoop configuration files (under /usr/apps/hadoop-2.7.1/etc/hadoop/)
core-site.xml
<configuration>
<!-- The file system scheme (URI) that Hadoop uses, i.e. the address of the HDFS master (NameNode) -->
<property>
<name>fs.defaultFS</name>
<!-- Clients connect to the host named "master", so the Hadoop instance on that host acts as the NameNode -->
<value>hdfs://master:9000</value>
</property>
<!-- Directory where Hadoop stores the files it generates at runtime; create a data directory under the Hadoop directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/apps/hadoop-2.7.1/data</value>
</property>
</configuration>
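`fs.defaultFS` refers to the host name `master`, so that name has to resolve on this machine. A minimal sketch of a check against a hosts-format file follows; `has_host_entry` is a hypothetical helper, and the address 10.0.0.170 in the hint is simply the one that appears in the startup log later in this post.

```shell
# Hypothetical helper: succeed if a hosts-format file maps some address to the given name.
has_host_entry() {
  awk -v name="$2" '
    { sub(/#.*/, "") }                                   # drop comments
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit !found }
  ' "$1" 2>/dev/null
}

if has_host_entry /etc/hosts master; then
  echo "master is present in /etc/hosts"
else
  echo "master is missing; add a line such as: 10.0.0.170 master"
fi
```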
hdfs-site.xml
<configuration>
<!-- Number of HDFS block replicas -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml (not shipped by default in 2.7.1; copy it from mapred-site.xml.template first)
<configuration>
<!-- Run MapReduce on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<!-- The host of the YARN master (ResourceManager) -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<!-- How reducers fetch map output: the MapReduce shuffle auxiliary service -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
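After editing the four files it can help to read a value back and confirm it is what was intended. The sketch below does this with awk; `get_property` and `CONF_DIR` are names invented here, and the parsing assumes the one-tag-per-line layout shown above, so it is not a general XML parser.

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a *-site.xml file.
# Assumes <name> and <value> each sit on their own line, as in this post.
get_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { want = 1; next }
    want && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
      print; exit
    }
  ' "$1"
}

# CONF_DIR is an assumption matching the install path used in this post.
CONF_DIR="${CONF_DIR:-/usr/apps/hadoop-2.7.1/etc/hadoop}"
if [ -f "$CONF_DIR/core-site.xml" ]; then
  get_property "$CONF_DIR/core-site.xml" fs.defaultFS
fi
```

Hadoop can also report the value it actually loaded: `hdfs getconf -confKey fs.defaultFS`.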
Format the HDFS NameNode
hdfs namenode -format
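Formatting is meant to be done once. Formatting again gives the NameNode a new cluster ID, and a DataNode that still holds data from the old one will then refuse to start. A small guard like the sketch below can help avoid that. `namenode_is_formatted` is a hypothetical helper, and the directory assumes the `hadoop.tmp.dir` set above together with the default `dfs/name` location underneath it.

```shell
# Hypothetical helper: succeed if a NameNode metadata directory already holds a formatted image.
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Assumed location: hadoop.tmp.dir from core-site.xml plus the default dfs/name suffix.
NAME_DIR="${NAME_DIR:-/usr/apps/hadoop-2.7.1/data/dfs/name}"

if namenode_is_formatted "$NAME_DIR"; then
  echo "NameNode already formatted at $NAME_DIR; skipping format"
else
  echo "NameNode not formatted yet; run the format command"
fi
```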
Start HDFS with start-dfs.sh
[14:07:50 root@master ~]# /usr/apps/hadoop-2.7.1/sbin/start-dfs.sh
Starting namenodes on [master]
The authenticity of host 'master (10.0.0.170)' can't be established.
ECDSA key fingerprint is SHA256:SCnJR0oM4EO0cKPNegjrE4jZnDLZr5i+xfyFK6AXBQU.
ECDSA key fingerprint is MD5:e2:c8:30:08:c9:09:5d:b6:62:5f:99:be:b9:9b:db:b6.
Are you sure you want to continue connecting (yes/no)? yes
master: Warning: Permanently added 'master,10.0.0.170' (ECDSA) to the list of known hosts.
root@master's password:
master: starting namenode, logging to /usr/apps/hadoop-2.7.1/logs/hadoop-root-namenode-centos7.out
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:SCnJR0oM4EO0cKPNegjrE4jZnDLZr5i+xfyFK6AXBQU.
ECDSA key fingerprint is MD5:e2:c8:30:08:c9:09:5d:b6:62:5f:99:be:b9:9b:db:b6.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
root@localhost's password:
localhost: starting datanode, logging to /usr/apps/hadoop-2.7.1/logs/hadoop-root-datanode-centos7.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:SCnJR0oM4EO0cKPNegjrE4jZnDLZr5i+xfyFK6AXBQU.
ECDSA key fingerprint is MD5:e2:c8:30:08:c9:09:5d:b6:62:5f:99:be:b9:9b:db:b6.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
root@0.0.0.0's password:
0.0.0.0: starting secondarynamenode, logging to /usr/apps/hadoop-2.7.1/logs/hadoop-root-secondarynamenode-centos7.out
# List the Java processes
[14:05:38 root@centos7 hadoop-2.7.1]#jps
11448 SecondaryNameNode
11178 NameNode
11295 DataNode
11567 Jps
Start YARN with start-yarn.sh
[14:10:27 root@master hadoop-2.7.1]# /usr/apps/hadoop-2.7.1/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/apps/hadoop-2.7.1/logs/yarn-root-resourcemanager-master.out
root@localhost's password:
localhost: starting nodemanager, logging to /usr/apps/hadoop-2.7.1/logs/yarn-root-nodemanager-master.out
![Image](https://img-blog.csdnimg.cn/direct/c8d1b1c971c24522af9a17466311eaf7.png#pic_center)
# List the processes
[14:12:40 root@master hadoop-2.7.1]#jps
12034 Jps
11653 ResourceManager
11448 SecondaryNameNode
11178 NameNode
11295 DataNode
11935 NodeManager
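Rather than reading the `jps` listing by eye, the check can be scripted. The sketch below reads `jps` output and prints any of the five daemons seen above that are absent; `missing_daemons` is a name made up for this example, and the expected list simply mirrors the processes shown in this post.

```shell
# Hypothetical helper: read jps output on stdin and print each expected daemon that is absent.
missing_daemons() {
  awk '
    BEGIN { n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ") }
    { seen[$2] = 1 }      # the second column of jps output is the process name
    END { for (i = 1; i <= n; i++) if (!(want[i] in seen)) print want[i] }
  '
}

# Only run the live check when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```

No output means all five daemons are running; any name printed points to the matching log file under /usr/apps/hadoop-2.7.1/logs.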