Learning Big Data with Hadoop: HDFS Deployment

1. What "Hadoop" refers to

  • Broad sense: the ecosystem built around the Hadoop software
  • Narrow sense: the Hadoop software itself

2. Apache

The Apache Foundation hosts many open-source projects, including hadoop, spark, hive, and flink. Each project's site follows the pattern xxx.apache.org (for example, hadoop.apache.org).

3. The Hadoop software

Hadoop has three release lines: 1.x, 2.x, and 3.x. The 2.x line is the most widely used; 3.x is still new and its pitfalls are not yet well known, so enterprises do not consider it for production and stick with 2.x.
The three core Hadoop modules:

  • hdfs: the distributed file system that stores big data
  • mapreduce: the distributed computing framework used for computation, split into a map side and a reduce side; its usage is covered in detail later
  • yarn: the resource scheduler, managing resources and job scheduling

We use the CDH distribution: Cloudera takes the Apache Hadoop 2.6.0 source, fixes bugs, adds features, and compiles it into a ready-to-use build. There is no real difference in usage between the two; production simply runs the CDH build.

4. HDFS deployment

4.1 Create a hadoop user with passwordless sudo

Create a dedicated hadoop user for administration with useradd hadoop, then create the app, source, and software directories in its home directory.

[root@hadoop001 ~]# useradd hadoop
[root@hadoop001 ~]# vi /etc/sudoers
Add the following line:
hadoop  ALL=(ALL)  NOPASSWD:ALL

[root@hadoop001 ~]# su - hadoop
[hadoop@hadoop001 ~]$ ll
total 0
[hadoop@hadoop001 ~]$ mkdir app
[hadoop@hadoop001 ~]$ mkdir software
[hadoop@hadoop001 ~]$ mkdir source
[hadoop@hadoop001 ~]$ ll
total 12
drwxrwxr-x. 2 hadoop hadoop 4096 Sep 17 14:53 app
drwxrwxr-x. 2 hadoop hadoop 4096 Sep 17 14:54 software
drwxrwxr-x. 2 hadoop hadoop 4096 Sep 17 14:54 source
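The three mkdir calls can also be collapsed into one. A throwaway sketch, run under /tmp so it does not touch the real home directory:

```shell
# demo directory standing in for /home/hadoop
base=/tmp/hadoop-home.demo
# a single mkdir call creates all three working directories
mkdir -p "$base/app" "$base/software" "$base/source"
ls "$base"
```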

4.2 Download the Hadoop tarball

Switch to the app directory and download the Hadoop binary package with wget; once it finishes, unpack it with tar -xzvf <tarball>.
wget http://archive-primary.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz

Unpack:

[hadoop@hadoop001 app]$ tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz

4.3 Deploy JDK 1.7

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ ll /usr/java/
total 319160
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.7.0_80
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.8.0_45
-rw-r--r-- 1 root root 153530841 Jul  8  2015 jdk-7u80-linux-x64.tar.gz
-rw-r--r-- 1 root root 173271626 Sep 19 11:49 jdk-8u45-linux-x64.gz
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ echo $JAVA_HOME
/usr/java/jdk1.7.0_80
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ 


[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ which java
/usr/java/jdk1.7.0_80/bin/java
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ 

4.4 Preparation

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ cd etc/hadoop
[hadoop@hadoop001 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ bin/hadoop
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:

The three run modes

  • Local (Standalone) Mode: single machine, no daemons; not used
  • Pseudo-Distributed Mode: one machine running the daemons; for learning
  • Fully-Distributed Mode: a cluster running the daemons; for production

4.5 Configuration files

[hadoop@hadoop001 hadoop]$ vi core-site.xml
Add the following:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9000</value>
    </property>
</configuration>   

[hadoop@hadoop001 hadoop]$ vi hdfs-site.xml
Add the following:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

4.6 Passwordless SSH

Pay close attention to which user and which directory each command is run from.

[hadoop@hadoop001 hadoop]$ cd
[hadoop@hadoop001 ~]$ rm -rf .ssh
[hadoop@hadoop001 ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
10:b2:0c:fa:24:e7:81:b9:94:0f:f9:53:51:c9:7d:9e hadoop@hadoop001
The key's randomart image is:
+--[ DSA 1024]----+
|  . ..+.o        |
| +oo o.+ . .     |
|==+ o..   o .    |
|.B+..  .   E     |
|. o+    S        |
|    .            |
|                 |
|                 |
|                 |
+-----------------+
[hadoop@hadoop001 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop001 ~]$ cd .ssh
[hadoop@hadoop001 .ssh]$ ll
total 12
-rw-rw-r--. 1 hadoop hadoop 606 Sep 17 17:06 authorized_keys
-rw-------. 1 hadoop hadoop 668 Sep 17 17:06 id_dsa
-rw-r--r--. 1 hadoop hadoop 606 Sep 17 17:06 id_dsa.pub
[hadoop@hadoop001 .ssh]$ chmod 600 authorized_keys
[hadoop@hadoop001 .ssh]$ ssh hadoop001
The authenticity of host 'hadoop001 (192.168.137.190)' can't be established.
RSA key fingerprint is 09:9f:45:4e:60:17:91:57:95:f7:a4:1e:3b:2a:a9:bd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop001,192.168.137.190' (RSA) to the list of known hosts.
Last login: Mon Sep 17 20:31:59 2018 from hadoop001
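A note on the chmod 600 step above: sshd refuses to use an authorized_keys file that is writable by group or others, so without it the login would still prompt for a password. A scratch demonstration of the permission check (the path below is a throwaway stand-in, not the real ~/.ssh/authorized_keys):

```shell
# throwaway file standing in for ~/.ssh/authorized_keys
demo=/tmp/authorized_keys.demo
touch "$demo"
chmod 600 "$demo"
# owner read/write only, as sshd expects
stat -c '%a' "$demo"   # prints: 600
```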

4.7 Environment configuration

[hadoop@hadoop001 ~]$ vi .bash_profile
Add the following to the per-user environment file .bash_profile:

export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_PREFIX/bin:$JAVA_HOME/bin:$PATH

Save and exit, then source the file so it takes effect:
[hadoop@hadoop001 ~]$ source .bash_profile

Check that passwordless ssh works:

[hadoop@hadoop001 ~]$ ssh hadoop001
Last login: Tue Sep 18 06:38:19 2018 from hadoop001
[hadoop@hadoop001 ~]$ which hdfs
~/app/hadoop-2.6.0-cdh5.7.0/bin/hdfs
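The exports can also be sanity-checked in a fresh shell before relying on them; a minimal sketch (paths as in this post — adjust if your layout differs):

```shell
# the same three lines added to .bash_profile
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_PREFIX/bin:$JAVA_HOME/bin:$PATH

# confirm hadoop's bin directory now leads the PATH
case ":$PATH:" in
  *":$HADOOP_PREFIX/bin:"*) echo "hadoop bin on PATH" ;;
  *) echo "hadoop bin missing from PATH" ;;
esac
```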

Configure slaves

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ cd etc/hadoop
[hadoop@hadoop001 hadoop]$ vi slaves

Change localhost to hadoop001.
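The edit can also be done with sed; the sketch below works on a scratch copy so it does not touch the real etc/hadoop/slaves:

```shell
# scratch copy standing in for etc/hadoop/slaves, which ships containing "localhost"
printf 'localhost\n' > /tmp/slaves.demo
# the same substitution you would apply to the real file
sed -i 's/^localhost$/hadoop001/' /tmp/slaves.demo
cat /tmp/slaves.demo   # prints: hadoop001
```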

Format the NameNode

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format

Seeing Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted means the format succeeded. (Note the storage directory defaults to a path under /tmp because hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}; data there may not survive a reboot, so production setups relocate it.)
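Rather than eyeballing the log, the success marker can be grepped for. A sketch using a sample log line (on a real run you would pipe the output of bin/hdfs namenode -format 2>&1 into the grep):

```shell
# sample of the line the format command logs on success
msg="INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted."
echo "$msg" | grep -q "successfully formatted" && echo "format OK"
```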

Start HDFS

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh

Check whether startup succeeded:

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ jps
24398 Jps
24110 DataNode
24300 SecondaryNameNode
24013 NameNode

If you see output like the above, the DataNode, SecondaryNameNode, and NameNode processes are all running.
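Checking for the three daemons can also be scripted; a sketch that parses sample jps output (on a live node, replace the inlined sample with jps_out="$(jps)"):

```shell
# sample jps output from above, inlined; on a live node use: jps_out="$(jps)"
jps_out="24398 Jps
24110 DataNode
24300 SecondaryNameNode
24013 NameNode"

# verify each of the three HDFS daemons appears
for d in NameNode DataNode SecondaryNameNode; do
  if echo "$jps_out" | grep -q " $d\$"; then
    echo "$d running"
  else
    echo "$d MISSING"
  fi
done
```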
