Setting Up HDFS for Hadoop

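I. Hadoop

1.1 Narrow sense: the open-source Apache Hadoop software itself.
1.2 Broad sense: the ecosystem built around Apache Hadoop (Hive, Sqoop, Flume, Spark, Flink, HBase, ...).
Apache big data projects are usually found at <component>.apache.org: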
hadoop.apache.org
xxx.apache.org
spark.apache.org
kafka.apache.org

II. The Hadoop software

Apache Hadoop releases:
1.x  rarely used any more
2.x  the mainstream choice in production ==> CDH 5.x series
3.x  being trialled ==> CDH 6.x series

http://archive.cloudera.com/cdh5/cdh/5/
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz

hadoop-2.6.0-cdh5.16.2.tar.gz
Apache Hadoop 2.6.0 plus later patches, roughly comparable to Apache Hadoop 2.9 (2.6.0 with fixes backported)
CDH5.14.0 hadoop-2.6.0
CDH5.16.2 hadoop-2.6.0
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2-changes.log
http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.16.2-changes.log
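Cloudera
Advantage of choosing CDH: the component versions are released together, so compatibility between them is not something you have to work out yourself.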
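
The naming scheme of the CDH archives above (component, upstream Apache version, CDH release) can be unpacked with plain shell parameter expansion. A minimal sketch; `split_cdh_name` is a made-up helper name, not something shipped with Hadoop or CDH.

```shell
# split_cdh_name is a hypothetical helper for illustration only.
# It splits a name like hadoop-2.6.0-cdh5.16.2.tar.gz into its three parts.
split_cdh_name() {
    local file="$1"
    local base="${file%.tar.gz}"     # hadoop-2.6.0-cdh5.16.2
    local cdh="${base##*-cdh}"       # 5.16.2
    local rest="${base%-cdh*}"       # hadoop-2.6.0
    local component="${rest%-*}"     # hadoop
    local upstream="${rest##*-}"     # 2.6.0
    echo "component=${component} upstream=${upstream} cdh=${cdh}"
}

split_cdh_name hadoop-2.6.0-cdh5.16.2.tar.gz
split_cdh_name hbase-1.2.0-cdh5.16.2.tar.gz
```

Both archives carry the same CDH release number even though their upstream versions differ, which is the point of picking components from one CDH release.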

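III. Components of Hadoop

HDFS       storage
MapReduce  computation: jobs that extract value from the data, e.g. data mining ==>
           it is hard to develop for, verbose, difficult to maintain, and slow,
           so MR is rarely written directly; Hive SQL, Spark, or Flink are used instead
YARN       resources (memory, vcores) + job scheduling

Massive data volumes, e.g. 1000 machines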

hadoop-2.6.0-cdh5.16.2.tar.gz

wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz

IV. Deployment

Three deployment modes

Now you are ready to start your Hadoop cluster in one of the three supported modes:
Local (Standalone) Mode      local mode; not used here
Pseudo-Distributed Mode      learning and testing, 1 machine
Fully-Distributed Mode       cluster mode, production

Environment requirements

Required Software:
a. Java
mkdir /usr/java
cd /usr/java
[root@pxj31 /root]#cd /usr/java/
[root@pxj31 /usr/java]#ll
total 0
drwxr-xr-x. 8 root root 255 Nov 16 15:42 jdk1.8.0_121
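
Before going further it can help to confirm that the JDK directory really contains a usable `java` binary. A small sketch, assuming the JDK path from this walkthrough; `check_java_home` is a hypothetical helper name.

```shell
# check_java_home is a hypothetical helper: it only verifies that the given
# directory contains an executable bin/java.
check_java_home() {
    local dir="$1"
    if [ -x "${dir}/bin/java" ]; then
        echo "ok: ${dir}/bin/java"
    else
        echo "missing or not executable: ${dir}/bin/java" >&2
        return 1
    fi
}

# The path is the one used in this post; adjust it to your own install.
check_java_home /usr/java/jdk1.8.0_121 || echo "install the JDK before continuing"
```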

Create the user and directories

[root@pxj31 /root]#useradd pxj
[root@pxj31 /root]#id pxj
uid=1000(pxj) gid=1000(pxj) groups=1000(pxj)
[root@pxj31 /root]#su - pxj
[pxj@pxj31 /home/pxj]$mkdir app software sourcecode log tmp data lib
[pxj@pxj31 /home/pxj]$ll
total 0
drwxrwxr-x. 2 pxj pxj 6 Dec  1 00:05 app          extracted packages and symlinks
drwxrwxr-x. 2 pxj pxj 6 Dec  1 00:05 data         data
drwxrwxr-x. 2 pxj pxj 6 Dec  1 00:05 lib          third-party jars
drwxrwxr-x. 2 pxj pxj 6 Dec  1 00:05 log          log files
drwxrwxr-x. 2 pxj pxj 6 Dec  1 00:05 software     archives (tarballs)
drwxrwxr-x. 2 pxj pxj 6 Dec  1 00:05 sourcecode   source code builds
drwxrwxr-x. 2 pxj pxj 6 Dec  1 00:05 tmp          temporary files (used instead of the system /tmp)
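
The same layout can be created by a small idempotent function, which is handy when several machines need it. A sketch only; `make_layout` is a hypothetical name, and the demo runs in a scratch directory rather than in the real home directory.

```shell
# make_layout is a hypothetical helper that creates the directory layout used
# in this post under a base directory. mkdir -p makes repeated runs harmless.
make_layout() {
    local base="$1" d
    for d in app software sourcecode log tmp data lib; do
        mkdir -p "${base}/${d}" || return 1
    done
}

# Demo on a scratch directory; on the real host the base would be $HOME.
scratch="$(mktemp -d)"
make_layout "$scratch" && ls "$scratch"
rm -rf "$scratch"
```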

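Upload the archive

[pxj@pxj31 /home/pxj/software]$ll
total 424176
-rw-r--r--. 1 root root 434354462 Nov 30 23:44 hadoop-2.6.0-cdh5.16.2.tar.gz

The archive was uploaded as root, so hand it over to pxj. A non-root user cannot chown a root-owned file, so this step runs as root:
[root@pxj31 /root]#chown pxj:pxj -R /home/pxj/software/*

Extract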

[pxj@pxj31 /home/pxj/software]$tar -zxvf hadoop-2.6.0-cdh5.16.2.tar.gz  -C ../app/
hadoop-2.6.0-cdh5.16.2/
hadoop-2.6.0-cdh5.16.2/share/
hadoop-2.6.0-cdh5.16.2/share/hadoop/
hadoop-2.6.0-cdh5.16.2/share/hadoop/common/
hadoop-2.6.0-cdh5.16.2/share/hadoop/common/sources/
hadoop-2.6.0-cdh5.16.2/share/hadoop/common/sources/hadoop-common-2.6.0-cdh5.16.2-sources.jar
[pxj@pxj31 /home/pxj/app]$ll
total 0
drwxr-xr-x. 3 pxj pxj 19 Jun  3 19:11 hadoop-2.6.0-cdh5.16.2

Create a symlink

[pxj@pxj31 /home/pxj/app]$ln -s hadoop-2.6.0-cdh5.16.2 hadoop
[pxj@pxj31 /home/pxj/app]$ll
total 0
lrwxrwxrwx. 1 pxj pxj 22 Dec  1 00:22 hadoop -> hadoop-2.6.0-cdh5.16.2
drwxr-xr-x. 3 pxj pxj 19 Jun  3 19:11 hadoop-2.6.0-cdh5.16.2
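
The reason for the symlink is that scripts and environment variables can refer to the stable name `hadoop` while the versioned directory behind it changes. A sketch of switching the link later, with a made-up helper `point_link` and a placeholder directory `hadoop-next` standing in for some future version:

```shell
# point_link is a hypothetical helper: it (re)points a stable name at a
# versioned directory. -f replaces an existing link, and -n stops ln from
# following the old link and creating the new one inside the old directory.
point_link() {
    local target="$1" link="$2"
    ln -sfn "$target" "$link"
}

# Demo in a scratch directory; hadoop-next is only a placeholder name.
scratch="$(mktemp -d)"
mkdir "$scratch/hadoop-2.6.0-cdh5.16.2" "$scratch/hadoop-next"
( cd "$scratch" && point_link hadoop-2.6.0-cdh5.16.2 hadoop && point_link hadoop-next hadoop )
readlink "$scratch/hadoop"
rm -rf "$scratch"
```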

Official reference documentation:
https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/SingleCluster.html
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2/hadoop-project-dist/hadoop-common/SingleCluster.html

Set JAVA_HOME explicitly

[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$vi hadoop-env.sh 
export JAVA_HOME=/usr/java/jdk1.8.0_121
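
If the edit has to be repeated on several machines, the same change can be scripted instead of made in vi. A minimal sketch, assuming the file either already has an `export JAVA_HOME=` line or has none; `set_java_home` is a hypothetical helper, and the demo works on a scratch file rather than the real hadoop-env.sh.

```shell
# set_java_home is a hypothetical helper. It rewrites an existing
# "export JAVA_HOME=..." line in the given file, or appends one if none exists.
set_java_home() {
    local file="$1" jdk="$2" tmp
    tmp="$(mktemp)" || return 1
    if grep -q '^export JAVA_HOME=' "$file"; then
        awk -v jdk="$jdk" '
            /^export JAVA_HOME=/ { print "export JAVA_HOME=" jdk; next }
            { print }
        ' "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        echo "export JAVA_HOME=${jdk}" >> "$tmp"
    fi
    cat "$tmp" > "$file"    # write back in place so file permissions are kept
    rm -f "$tmp"
}

# Demo on a scratch file standing in for hadoop-env.sh.
scratch="$(mktemp)"
echo 'export JAVA_HOME=${JAVA_HOME}' > "$scratch"
set_java_home "$scratch" /usr/java/jdk1.8.0_121
cat "$scratch"
rm -f "$scratch"
```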

Configure the IP address (/etc/hosts)

[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.25.31 pxj31
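
A quick way to confirm which address a hostname maps to in a hosts-format file. This is a sketch that only reads the file, not a full resolver; `host_ip` is a hypothetical helper name.

```shell
# host_ip is a hypothetical helper: it prints the first address that a
# hosts-format file maps to the given name, and fails if there is none.
host_ip() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; found = 1; exit } }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# On the machine from this post, "host_ip pxj31" should print 192.168.25.31.
host_ip "$(uname -n)" || echo "$(uname -n) has no entry in /etc/hosts"
```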

Passwordless SSH trust

1. Generate the public and private key pair
[pxj@pxj31 /home/pxj]$ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/pxj/.ssh/id_rsa): 
Created directory '/home/pxj/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/pxj/.ssh/id_rsa.
Your public key has been saved in /home/pxj/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Y6ePUjG4QGO+wPvlT7Sp2+cD7poMsbPOeaPWppYVuzQ pxj@pxj31
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|    +            |
| . + . .         |
|  o o o o        |
|   o.o +So.      |
|  . .oEoo*       |
|   .+B ==.       |
|   .=*O=.oo      |
|   +B=*B=oo.     |
+----[SHA256]-----+
2. Append the public key to the authorized keys file
[pxj@pxj31 /home/pxj]$cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
3. Permissions
[pxj@pxj31 /home/pxj]$chmod 600 ~/.ssh/authorized_keys
4. Test
[pxj@pxj31 /home/pxj]$ssh pxj31 date
Sun Dec  1 01:07:55 CST 2019
Reference articles:
http://blog.itpub.net/30089851/viewspace-2127102/  troubleshooting
http://blog.itpub.net/30089851/viewspace-1992210/  SSH across multiple hosts, pitfalls
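
Steps 2 and 3 above can be folded into one function that is safe to run repeatedly: a plain `>>` appends the key again on every run. A sketch with a hypothetical helper `authorize_key`; the key in the demo is a dummy string, not a real key.

```shell
# authorize_key is a hypothetical helper: it appends a public key to an
# authorized_keys file only if that exact line is not already there, then
# applies the permissions used in this post (700 on the directory, 600 on the file).
authorize_key() {
    local pub="$1" auth="$2" dir
    dir="$(dirname "$auth")"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    touch "$auth" || return 1
    grep -qxF -- "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}

# Demo in a scratch directory with a placeholder key line.
scratch="$(mktemp -d)"
echo "ssh-rsa DUMMYKEY pxj@pxj31" > "$scratch/id_rsa.pub"
authorize_key "$scratch/id_rsa.pub" "$scratch/.ssh/authorized_keys"
authorize_key "$scratch/id_rsa.pub" "$scratch/.ssh/authorized_keys"   # second run adds nothing
wc -l < "$scratch/.ssh/authorized_keys"
rm -rf "$scratch"
```

On the real host the call would be `authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys`.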
 

## Configure user environment variables

```shell
[pxj@pxj31 /home/pxj/app/hadoop]$vim ~/.bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=
# User specific aliases and functions
export HADOOP_HOME=/home/pxj/app/hadoop
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
[pxj@pxj31 /home/pxj/app/hadoop]$source ~/.bashrc
[pxj@pxj31 /home/pxj]$which hadoop
~/app/hadoop/bin/hadoop
```

Edit the configuration files

[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$vim core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://pxj31:9000</value>
    </property>
</configuration>
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
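
For repeatable setups the two files above can be generated from parameters instead of typed by hand. A sketch under the assumption that nothing else needs to live in these files yet; `write_hdfs_conf` is a hypothetical helper and overwrites whatever is in the target directory.

```shell
# write_hdfs_conf is a hypothetical helper, not a Hadoop tool. It writes a
# minimal core-site.xml and hdfs-site.xml for a single-node setup.
write_hdfs_conf() {
    local dir="$1" host="$2" port="${3:-9000}" repl="${4:-1}"
    mkdir -p "$dir" || return 1
    cat > "${dir}/core-site.xml" <<EOF
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://${host}:${port}</value>
    </property>
</configuration>
EOF
    cat > "${dir}/hdfs-site.xml" <<EOF
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>${repl}</value>
    </property>
</configuration>
EOF
}

# Demo: write into a scratch directory instead of the live etc/hadoop.
scratch="$(mktemp -d)"
write_hdfs_conf "$scratch" pxj31
cat "$scratch/core-site.xml"
rm -rf "$scratch"
```

A replication factor of 1 fits a single DataNode; with only one node there is nowhere to place a second copy.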

Format the NameNode

[pxj@pxj31 /home/pxj]$hdfs namenode -format
has been successfully formatted.
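
Formatting is a one-time step: running it again discards the existing NameNode metadata. A small guard can check for an existing format first. This sketch assumes the default storage location, since the configuration above does not set `dfs.namenode.name.dir` or `hadoop.tmp.dir`; `is_formatted` is a hypothetical helper.

```shell
# is_formatted is a hypothetical helper. A formatted NameNode directory
# contains current/VERSION, so its presence is used as the signal here.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Assumption: the defaults are in effect, which place the NameNode directory
# at /tmp/hadoop-<user>/dfs/name.
name_dir="/tmp/hadoop-$(id -un)/dfs/name"
if is_formatted "$name_dir"; then
    echo "already formatted: $name_dir"
else
    echo "not formatted yet: run 'hdfs namenode -format' once"
fi
```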

First start

[pxj@pxj31 /home/pxj]$start-dfs.sh
19/12/01 01:29:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [pxj31]
pxj31: starting namenode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-namenode-pxj31.out
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:wBnpszXvvVt7NH/NuxDRgLHkCXU1CStTXPflPsQw1AI.
ECDSA key fingerprint is MD5:7e:92:4e:a6:a7:65:93:43:b6:b2:53:a3:48:14:0a:ae.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: starting datanode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-datanode-pxj31.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:wBnpszXvvVt7NH/NuxDRgLHkCXU1CStTXPflPsQw1AI.
ECDSA key fingerprint is MD5:7e:92:4e:a6:a7:65:93:43:b6:b2:53:a3:48:14:0a:ae.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-secondarynamenode-pxj31.out
19/12/01 01:30:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[pxj@pxj31 /home/pxj]$jps
13955 SecondaryNameNode
14071 Jps
13641 NameNode
13773 DataNode
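
Reading the `jps` output by eye works, but a script can report exactly which daemon is missing. A sketch with a hypothetical helper `missing_daemons`; it compares the whole second column, so SecondaryNameNode is not mistaken for NameNode.

```shell
# missing_daemons is a hypothetical helper: it reads jps output on stdin and
# prints every expected HDFS daemon that is absent. Exit status is 1 if any
# daemon is missing.
missing_daemons() {
    awk '
        { seen[$2] = 1 }
        END {
            n = split("NameNode DataNode SecondaryNameNode", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) { print want[i]; bad = 1 }
            exit bad + 0
        }
    '
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons || echo "the daemons listed above are not running"
else
    echo "jps not found; is the JDK bin directory on PATH?"
fi
```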

Pitfall:

[pxj@pxj31 /home/pxj/.ssh]$cat known_hosts
pxj31,192.168.25.31 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCuE4mM++/m5+KufqPqfoulxSKCvNQu5obqsULglJD5aGgDapf/61g16DqiHdlqUYFjiey7dRTFrO+qkT+IXMA0=
localhost ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCuE4mM++/m5+KufqPqfoulxSKCvNQu5obqsULglJD5aGgDapf/61g16DqiHdlqUYFjiey7dRTFrO+qkT+IXMA0=
0.0.0.0 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCuE4mM++/m5+KufqPqfoulxSKCvNQu5obqsULglJD5aGgDapf/61g16DqiHdlqUYFjiey7dRTFrO+qkT+IXMA0=
The DN and SNN should also start under the hostname pxj31, but the startup log shows three different names:

pxj31: starting namenode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-namenode-pxj31.out
localhost: starting datanode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-datanode-pxj31.out
0.0.0.0: starting secondarynamenode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-secondarynamenode-pxj31.out
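NN: pxj31, controlled by fs.defaultFS
DN: localhost, controlled by the slaves file
SNN: 0.0.0.0, controlled by dfs.namenode.secondary.http-address (and its https counterpart)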
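
The host each daemon started under can be pulled out of the start-dfs.sh output mechanically. A sketch that relies only on the "host: starting daemon," line shape seen in the log above; `daemon_hosts` is a hypothetical helper.

```shell
# daemon_hosts is a hypothetical helper: it reads start-dfs.sh output on stdin
# and prints "<daemon> <host>" for each "host: starting daemon, ..." line.
daemon_hosts() {
    awk '$2 == "starting" {
        host = $1; sub(/:$/, "", host)
        d = $3;    sub(/,$/, "", d)
        print d, host
    }'
}

# Demo with the three lines from the log in this post.
daemon_hosts <<'EOF'
pxj31: starting namenode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-namenode-pxj31.out
localhost: starting datanode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-datanode-pxj31.out
0.0.0.0: starting secondarynamenode, logging to /home/pxj/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-pxj-secondarynamenode-pxj31.out
EOF
```

On a live system the same function can be fed directly: `start-dfs.sh 2>&1 | daemon_hosts`.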

Fix

Note: the two dfs.namenode.secondary.* properties are HDFS settings and are conventionally kept in hdfs-site.xml; this walkthrough adds them to core-site.xml.

[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$vim core-site.xml 
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://pxj31:9000</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>pxj31:50090</value>
     </property>
     <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>pxj31:50091</value>
    </property>
</configuration>
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$vim slaves
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$cat slaves 
pxj31
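
After editing, it is worth reading the values back rather than trusting the editor session. On a host with Hadoop on the PATH, `hdfs getconf -confKey <name>` prints the value Hadoop resolves. Where that is not available, a rough file-level check is possible. This sketch is not a real XML parser and assumes the one-tag-per-line layout used in this post; `get_prop` is a hypothetical helper.

```shell
# get_prop is a hypothetical helper for the simple layout used in this post,
# where <name> and <value> each sit on their own line.
get_prop() {
    local key="$1" file="$2"
    awk -v key="$key" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == key) { print v; found = 1; exit } }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Demo on a scratch copy of the relevant properties.
scratch="$(mktemp)"
cat > "$scratch" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://pxj31:9000</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>pxj31:50090</value>
    </property>
</configuration>
EOF
get_prop dfs.namenode.secondary.http-address "$scratch"
rm -f "$scratch"
```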

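NameNode           the master node (the boss): read and write requests go through it first
DataNode           a worker node (the underling): stores and retrieves data
SecondaryNameNode  the second in command: checkpoints the NameNode metadata about once an hour (h+1)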

Common hadoop commands

Create a directory
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$hadoop fs -mkdir /a
19/12/01 01:47:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
List
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$hadoop fs -ls /
19/12/01 01:47:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x - pxj supergroup 0 2019-12-01 01:47 /a
Upload a file
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$hadoop fs -put slaves /a
19/12/01 01:48:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$hadoop fs -ls /a
19/12/01 01:48:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 pxj supergroup          6 2019-12-01 01:48 /a/slaves
Download a file
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$hadoop  fs -get /a/slaves /home/pxj/slaves1
19/12/01 01:49:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[pxj@pxj31 /home/pxj/app/hadoop/etc/hadoop]$
[pxj@pxj31 /home/pxj]$ll
total 4
drwxrwxr-x. 3 pxj pxj 50 Dec  1 00:40 app
drwxrwxr-x. 2 pxj pxj  6 Dec  1 00:05 data
drwxrwxr-x. 2 pxj pxj  6 Dec  1 00:05 lib
drwxrwxr-x. 2 pxj pxj  6 Dec  1 00:05 log
-rw-r--r--. 1 pxj pxj  6 Dec  1 01:49 slaves1
drwxrwxr-x. 2 pxj pxj 43 Dec  1 00:37 software
drwxrwxr-x. 2 pxj pxj  6 Dec  1 00:05 sourcecode
drwxrwxr-x. 2 pxj pxj  6 Dec  1 00:05 tmp
Delete a file
[pxj@pxj31 /home/pxj]$hadoop fs -rm /a/slaves
19/12/01 01:51:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleted /a/slaves
Delete a directory
[pxj@pxj31 /home/pxj]$hadoop fs -rmdir /a
19/12/01 01:54:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[pxj@pxj31 /home/pxj]$
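
The commands above can be strung together into a quick smoke test: upload a file, download it again, compare the two copies, and clean up. A sketch only; `hdfs_roundtrip` and the HDFS directory name `/roundtrip-check` are made up for illustration, and the script skips itself when `hadoop` is not on the PATH.

```shell
# hdfs_roundtrip is a hypothetical helper: put a local file into HDFS, get it
# back, and compare it with the original. Returns 0 only if the copies match.
hdfs_roundtrip() {
    local src="$1" dir="${2:-/roundtrip-check}" back rc
    back="$(mktemp -d)" || return 1
    hadoop fs -mkdir -p "$dir" &&
        hadoop fs -put -f "$src" "$dir/" &&
        hadoop fs -get "$dir/$(basename "$src")" "$back/" &&
        cmp -s "$src" "$back/$(basename "$src")"
    rc=$?
    hadoop fs -rm -r -f "$dir" >/dev/null 2>&1   # removes the directory and its contents
    rm -rf "$back"
    return "$rc"
}

if command -v hadoop >/dev/null 2>&1; then
    hdfs_roundtrip /etc/hosts && echo "round trip ok"
else
    echo "hadoop not on PATH; skipping"
fi
```

Unlike `-rmdir`, which only removes an empty directory, `-rm -r` deletes a directory together with whatever is still in it, so use it with care.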