Enterprise Project Practice, Hadoop Series: Hadoop Installation and Deployment (Part 1)

1. Introduction to Hadoop

HDFS is a highly fault-tolerant distributed file system that can be widely deployed on inexpensive commodity PCs. It provides streaming access to application data, which greatly increases overall data throughput, making it well suited to applications with very large data sets.
The HDFS architecture is shown in the figure below. HDFS uses a master/slave architecture: a typical cluster contains one NameNode and multiple DataNodes. The NameNode stores and manages the metadata of the entire file system, and usually only one machine in the cluster runs a NameNode instance; the DataNodes store the actual file data, with each of the remaining machines running a DataNode instance. DataNodes communicate with the NameNode periodically through a heartbeat mechanism.

(Figure: HDFS master/slave architecture)
Hadoop consists of three core components:
- HDFS (Hadoop Distributed File System): a distributed storage system providing highly reliable, highly scalable, high-throughput data storage.
- MapReduce: a distributed computing framework ("moving computation to the data") that is easy to program, fault-tolerant, and scalable.
- YARN (Yet Another Resource Negotiator): a distributed resource-management framework responsible for cluster resource management and scheduling.

2. Installing Hadoop

Prepare the Hadoop tarball and the supporting JDK package:

[root@server1 ~]# ls
anaconda-ks.cfg  hadoop-3.2.1.tar.gz  jdk-8u181-linux-x64.rpm

Create a hadoop user and move the packages into the hadoop user's home directory:

[root@server1 ~]# useradd hadoop
[root@server1 ~]# mv * /home/hadoop/
[root@server1 ~]# su - hadoop 
[hadoop@server1 ~]$ ls
anaconda-ks.cfg  hadoop-3.2.1.tar.gz  jdk-8u181-linux-x64.rpm

Extract the archives and create java and hadoop symbolic links:

[hadoop@server1 ~]$ ls
anaconda-ks.cfg  hadoop-3.2.1.tar.gz  jdk-8u181-linux-x64.tar.gz
[hadoop@server1 ~]$ tar zxf jdk-8u181-linux-x64.tar.gz 
[hadoop@server1 ~]$ ls
anaconda-ks.cfg  hadoop-3.2.1.tar.gz  jdk1.8.0_181  jdk-8u181-linux-x64.tar.gz
[hadoop@server1 ~]$ ln -s jdk1.8.0_181/ java
[hadoop@server1 ~]$ ls
anaconda-ks.cfg      java          jdk-8u181-linux-x64.tar.gz
hadoop-3.2.1.tar.gz  jdk1.8.0_181
[hadoop@server1 ~]$ tar zxf hadoop-3.2.1.tar.gz 
[hadoop@server1 ~]$ ls
anaconda-ks.cfg  hadoop-3.2.1.tar.gz  jdk1.8.0_181
hadoop-3.2.1     java                 jdk-8u181-linux-x64.tar.gz
[hadoop@server1 ~]$ ln -s hadoop-3.2.1 hadoop
[hadoop@server1 ~]$ ls
anaconda-ks.cfg  hadoop-3.2.1         java          jdk-8u181-linux-x64.tar.gz
hadoop           hadoop-3.2.1.tar.gz  jdk1.8.0_181

Enter the hadoop directory and configure the Hadoop environment, specifying the Hadoop and Java locations:

[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ cd etc/hadoop/
[hadoop@server1 hadoop]$ vim hadoop-env.sh 
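The relevant edit amounts to something like the following (a sketch; the paths assume the java and hadoop symlinks created above under /home/hadoop):

```shell
# hadoop-env.sh: tell Hadoop where Java and Hadoop live.
# Paths assume the symlinks created earlier under /home/hadoop.
export JAVA_HOME=/home/hadoop/java
export HADOOP_HOME=/home/hadoop/hadoop
```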

Create an input directory and copy the etc/hadoop/*.xml files into it.
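The step above can be sketched as (run from the Hadoop install directory, ~/hadoop here; input holds the files the example job will read):

```shell
# Create a local input directory and copy the sample XML config files into it.
mkdir -p input
cp etc/hadoop/*.xml input/ 2>/dev/null || true
```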
Run a test job using the hadoop-mapreduce-examples jar.
Test command: grep the files in the input directory for strings matching dfs[a-z.]+ and save the results to the output directory:

[hadoop@server1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar  grep input output 'dfs[a-z.]+'

View the contents of output.

3. Deploying pseudo-distributed Hadoop

Set the worker to localhost:

[hadoop@server1 hadoop]$ cat workers 
localhost

Edit core-site.xml so that HDFS is accessed at localhost:9000:

[hadoop@server1 hadoop]$ cd ~/hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ vim core-site.xml 
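The change can be sketched as follows (a minimal core-site.xml using the standard Hadoop 3.x property name; the screenshot's exact contents may differ):

```xml
<!-- core-site.xml: point the default filesystem at the local NameNode -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```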


Edit hdfs-site.xml and set the replication factor to 1:

[hadoop@server1 hadoop]$ vim hdfs-site.xml 
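A minimal sketch of the edit (standard Hadoop property; the screenshot's exact contents may differ):

```xml
<!-- hdfs-site.xml: single-node setup, so keep only one replica per block -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
```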

Generate an SSH key pair:

[hadoop@server1 hadoop]$ ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:ROXTsvFYZX+cgN+n5imW+Jtc75WXm0FIjDPZCuaZGkU hadoop@server1
The key's randomart image is:
+---[RSA 2048]----+
|        .E. ..o  |
|       ... o=o.o.|
|        .+==++..+|
|       .+ +O=...o|
|       .S+o... o.|
|        o     + o|
|       .   . +.=o|
|          ..+oo.*|
|           o=o +o|
+----[SHA256]-----+

Set the hadoop user's password (westos here):

[root@server1 ~]# passwd hadoop
Changing password for user hadoop.
New password: 
BAD PASSWORD: The password is shorter than 8 characters
Retype new password: 
passwd: all authentication tokens updated successfully.

Set up passwordless login to localhost (note the first attempt below fails because of a typo in the hostname):

[hadoop@server1 ~]$ ssh-copy-id  loaclhost
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/bin/ssh-copy-id: ERROR: ssh: Could not resolve hostname loaclhost: Name or service not known

[hadoop@server1 ~]$ ssh-copy-id  localhost
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:Qwz6cDDE7GvLYqOWEwNiW4Wf8PBLrLVAYYuHmU8d9Ds.
ECDSA key fingerprint is MD5:f7:84:ee:41:4e:97:1b:f3:28:d7:f5:63:71:d0:6b:06.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@localhost's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'localhost'"
and check to make sure that only the key(s) you wanted were added.

Test passwordless login:

[hadoop@server1 ~]$ ssh localhost
Last login: Sat Aug 14 23:13:02 2021
[hadoop@server1 ~]$ logout
Connection to localhost closed.

Initialize Hadoop (format the NameNode):

[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
WARNING: /home/hadoop/hadoop/logs does not exist. Creating.
2021-08-14 23:14:41,913 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = server1/172.25.3.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.2.1
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at server1/172.25.3.1
************************************************************/

Run the HDFS startup script:

[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [server1]
server1: Warning: Permanently added 'server1,172.25.3.1' (ECDSA) to the list of known hosts.

Add the jps command to the PATH, then run jps to check the deployed Hadoop processes:

[hadoop@server1 ~]$ vim .bash_profile 
[hadoop@server1 ~]$ source .bash_profile
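The .bash_profile addition can be sketched as (the path assumes the java symlink created earlier; jps ships in the JDK's bin directory):

```shell
# .bash_profile addition: put the JDK's bin directory (which contains jps)
# on the PATH.
export PATH=$PATH:/home/hadoop/java/bin
```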


Visit 172.25.3.1:9870, the NameNode web UI.

The node information can be viewed there.
Test: create the HDFS user directory /user/hadoop and upload the input directory:

[hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir /user
[hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir /user/hadoop
[hadoop@server1 hadoop]$ id
uid=1000(hadoop) gid=1000(hadoop) groups=1000(hadoop)
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls
[hadoop@server1 hadoop]$ bin/hdfs dfs -put input/
2021-08-14 23:24:16,978 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
...(the same INFO line repeats for each file transferred)

View the uploaded directory:

[hadoop@server1 hadoop]$ bin/hdfs dfs -ls input/
Found 9 items
-rw-r--r--   1 hadoop supergroup       8260 2021-08-14 23:24 input/capacity-scheduler.xml
-rw-r--r--   1 hadoop supergroup        774 2021-08-14 23:24 input/core-site.xml
-rw-r--r--   1 hadoop supergroup      11392 2021-08-14 23:24 input/hadoop-policy.xml
-rw-r--r--   1 hadoop supergroup        775 2021-08-14 23:24 input/hdfs-site.xml
-rw-r--r--   1 hadoop supergroup        620 2021-08-14 23:24 input/httpfs-site.xml
-rw-r--r--   1 hadoop supergroup       3518 2021-08-14 23:24 input/kms-acls.xml
-rw-r--r--   1 hadoop supergroup        682 2021-08-14 23:24 input/kms-site.xml
-rw-r--r--   1 hadoop supergroup        758 2021-08-14 23:24 input/mapred-site.xml
-rw-r--r--   1 hadoop supergroup        690 2021-08-14 23:24 input/yarn-site.xml


Run the wordcount example as a test:

[hadoop@server1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar  wordcount input output
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls output
Found 2 items
-rw-r--r--   1 hadoop supergroup          0 2021-08-14 23:29 output/_SUCCESS
-rw-r--r--   1 hadoop supergroup       9351 2021-08-14 23:29 output/part-r-00000


4. Deploying fully distributed Hadoop

Run the stop script to shut down HDFS first:

[hadoop@server1 hadoop]$ sbin/stop-dfs.sh 

Install nfs-utils on server1/2/3:

[hadoop@server1 hadoop]$ yum install -y nfs-utils

Create the hadoop user on server2/3.

On server1, export /home/hadoop as a network share:

[root@server1 ~]# cat /etc/exports
/home/hadoop    *(rw,anonuid=1000,anongid=1000)


[root@server1 ~]# vim /etc/exports
[root@server1 ~]# systemctl  enable  --now nfs
Created symlink from /etc/systemd/system/multi-user.target.wants/nfs-server.service to /usr/lib/systemd/system/nfs-server.service.
[root@server1 ~]# showmount -e
Export list for server1:
/home/hadoop *

Mount 172.25.3.1:/home/hadoop/ onto the hadoop user's home directory on each node:

[root@server5 ~]# mount 172.25.3.1:/home/hadoop/ /home/hadoop/


Once server2/3 have mounted the NFS share they share the same home directory (and therefore the same SSH keys), so passwordless login works. Test it:

[hadoop@server1 hadoop]$ ssh server2
The authenticity of host 'server2 (172.25.3.2)' can't be established.
ECDSA key fingerprint is SHA256:Qwz6cDDE7GvLYqOWEwNiW4Wf8PBLrLVAYYuHmU8d9Ds.
ECDSA key fingerprint is MD5:f7:84:ee:41:4e:97:1b:f3:28:d7:f5:63:71:d0:6b:06.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server2,172.25.3.2' (ECDSA) to the list of known hosts.
Last login: Sat Aug 14 23:58:51 2021
[hadoop@server2 ~]$ logout
Connection to server2 closed.
[hadoop@server1 hadoop]$ ssh server3
The authenticity of host 'server3 (172.25.3.3)' can't be established.
ECDSA key fingerprint is SHA256:Qwz6cDDE7GvLYqOWEwNiW4Wf8PBLrLVAYYuHmU8d9Ds.
ECDSA key fingerprint is MD5:f7:84:ee:41:4e:97:1b:f3:28:d7:f5:63:71:d0:6b:06.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server3,172.25.3.3' (ECDSA) to the list of known hosts.
Last login: Sat Aug 14 23:59:08 2021
[hadoop@server3 ~]$ logout
Connection to server3 closed.

Modify the Hadoop configuration files.

Set the HDFS master address to 172.25.3.1:9000:

[hadoop@server1 hadoop]$ vim core-site.xml 
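The edit can be sketched as (same property as the pseudo-distributed setup, now pointing at server1's address):

```xml
<!-- core-site.xml: point the default filesystem at the NameNode on server1 -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.25.3.1:9000</value>
    </property>
</configuration>
```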

Set the replication factor to 2:

[hadoop@server1 hadoop]$ vim hdfs-site.xml 
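A minimal sketch of the change (two DataNodes, so two replicas per block):

```xml
<!-- hdfs-site.xml: replicate each block across the two DataNodes -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
```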

Point the workers file at server2/3:

[hadoop@server1 hadoop]$ cat etc/hadoop/workers 
server2
server3


With configuration complete, reinitialize Hadoop:

[hadoop@server1 hadoop]$ bin/hdfs namenode -format
2021-08-15 00:01:51,404 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = server1/172.25.3.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.2.1
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at server1/172.25.3.1
************************************************************/

Run the Hadoop startup script:

[hadoop@server1 hadoop]$ sbin/start-dfs.sh 
Starting namenodes on [server1]
Starting datanodes
Starting secondary namenodes [server1]

Check server1 (NameNode):

[hadoop@server1 hadoop]$ jps
5155 SecondaryNameNode
4959 NameNode
5295 Jps

Check server2/3 (DataNode):

[hadoop@server2 ~]$ jps
3856 Jps
3793 DataNode

View the node information in the Firefox browser.
Scaling out: add server4 to the cluster.

Create the hadoop user and install nfs-utils:

[root@server4 ~]# useradd hadoop
[root@server4 ~]# id had
id: had: no such user
[root@server4 ~]# id hadoop
uid=1000(hadoop) gid=1000(hadoop) groups=1000(hadoop)
[root@server4 ~]# yum install  -y nfs-utils.x86_64 

Mount the NFS share:

[root@server4 ~]#  mount 172.25.3.1:/home/hadoop/ /home/hadoop/
[root@server4 ~]# df
Filesystem              1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhel-root    28289540 1162220  27127320   5% /
devtmpfs                   495424       0    495424   0% /dev
tmpfs                      507448       0    507448   0% /dev/shm
tmpfs                      507448    6968    500480   2% /run
tmpfs                      507448       0    507448   0% /sys/fs/cgroup
/dev/vda1                 1038336  135088    903248  14% /boot
tmpfs                      101492       0    101492   0% /run/user/0
172.25.3.1:/home/hadoop  28289792 2997248  25292544  11% /home/hadoop

On server1, add server4 to the workers file:

[hadoop@server1 hadoop]$ cat etc/hadoop/workers 
server2
server3
server4


On server4, start the DataNode daemon to join the cluster:

[hadoop@server4 hadoop]$ bin/hdfs --daemon start datanode
[hadoop@server4 hadoop]$ jps
4220 Jps
4190 DataNode

server4 was added successfully.

5. Deploying the distributed resource-management framework YARN

Edit yarn-site.xml and add the YARN settings:

[hadoop@server1 hadoop]$ vim etc/hadoop/yarn-site.xml 
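The screenshot's exact contents are not recoverable; a minimal sketch, using the properties the standard Hadoop cluster-setup guide uses (the ResourceManager hostname of server1 is an assumption based on the jps output below):

```xml
<!-- yarn-site.xml (sketch): enable the MapReduce shuffle service and
     point NodeManagers at the ResourceManager on server1 (assumed). -->
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>server1</value>
    </property>
</configuration>
```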

Check mapred-site.xml:

[hadoop@server1 hadoop]$ vim etc/hadoop/mapred-site.xml 
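A minimal sketch of what this file needs so that MapReduce jobs run on YARN (the screenshot's exact contents may differ):

```xml
<!-- mapred-site.xml (sketch): use YARN as the MapReduce execution framework -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```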


Add the required environment variables:

[hadoop@server1 hadoop]$ vim  etc/hadoop/hadoop-env.sh 
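A sketch of the addition (the path assumes the hadoop symlink created earlier; HADOOP_MAPRED_HOME is the variable MapReduce-on-YARN setups commonly export here):

```shell
# hadoop-env.sh addition for YARN/MapReduce: the Hadoop install that holds
# the MapReduce jars; path assumes the hadoop symlink created above.
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop
```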

Run the YARN startup script:

[hadoop@server1 hadoop]$ sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers
server4: Warning: Permanently added 'server4,172.25.3.4' (ECDSA) to the list of known hosts.


Check server1 (ResourceManager):

[hadoop@server1 hadoop]$ jps
15667 Jps
4695 SecondaryNameNode
4506 NameNode
15563 ResourceManager

Check server2/3/4 (NodeManager):

[hadoop@server3 ~]$ jps
4323 NodeManager
4595 Jps

Visit 172.25.3.1:8088 to view the YARN management page.
