Hadoop Single-Node Installation and Deployment
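1. Hadoop
Broad sense: the ecosystem built around the Apache Hadoop software (Hive, Sqoop, Flume, Spark, Flink, HBase)
Narrow sense: the Apache Hadoop software itself, which is open source
Apache Hadoop versions:
1.x  hardly used any more
2.x  the enterprise mainstream; CDH 5.x series
3.x  being trialled; CDH 6.x series
Download location: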
http://archive.cloudera.com/cdh5/cdh/5/
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz
Advantage of choosing CDH: you do not have to worry about version compatibility between components.
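2. The Hadoop software
hdfs: storage
mapreduce: computation, i.e. mining valuable information out of the data. Because it is hard to develop, verbose, hard to maintain, and slow,
almost nobody writes MR directly any more; people use Hive SQL, Spark, or Flink instead.
yarn: resources (memory, vcores) plus job scheduling
Designed for massive data sets, on the order of 1000 machines
The tarball can also be downloaded directly on the cloud server: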
wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz
Alternatively, take the copy provided on Baidu Cloud and upload it with rz.
3. Deployment
3.1 Create the user and directories
[root@ruozedata001 ~]# useradd ruoze
[root@ruozedata001 ~]# su - ruoze
[ruoze@ruozedata001 ~]$ mkdir app software sourcecode log tmp data lib
[ruoze@ruozedata001 ~]$ ll
total 0
drwxrwxr-x 2 ruoze ruoze 6 Nov 27 21:32 app         # extracted packages and symlinks
drwxrwxr-x 2 ruoze ruoze 6 Nov 27 21:32 data        # data
drwxrwxr-x 2 ruoze ruoze 6 Nov 27 21:32 lib         # third-party jars
drwxrwxr-x 2 ruoze ruoze 6 Nov 27 21:32 log         # log files
drwxrwxr-x 2 ruoze ruoze 6 Nov 27 21:32 software    # tarballs
drwxrwxr-x 2 ruoze ruoze 6 Nov 27 21:32 sourcecode  # source code to compile
drwxrwxr-x 2 ruoze ruoze 6 Nov 27 21:32 tmp         # temporary files ???/tmp
[ruoze@ruozedata001 ~]$
3.2 Upload the tarball
[root@ruozedata001 ~]# mv /tmp/hadoop-2.6.0-cdh5.16.2.tar.gz /home/ruoze/software/
[root@ruozedata001 ~]# chown ruoze:ruoze /home/ruoze/software/*
[root@ruozedata001 ~]# ll /home/ruoze/software/
total 424176
-rw-r--r-- 1 ruoze ruoze 434354462 Jun 18 21:14 hadoop-2.6.0-cdh5.16.2.tar.gz
[ruoze@ruozedata001 software]$ ll
total 424176
-rw-r--r-- 1 ruoze ruoze 434354462 Jun 18 21:14 hadoop-2.6.0-cdh5.16.2.tar.gz
[ruoze@ruozedata001 software]$
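An upload with rz or an interrupted wget can leave a truncated file, so it may be worth checking the archive before extracting it. The sketch below is only an illustration: tarball_ok is a hypothetical helper name, and the demo runs on a tiny scratch archive rather than on the real Hadoop tarball.

```shell
# tarball_ok FILE: succeed if FILE can be listed as a gzip-compressed tar archive.
tarball_ok() {
  tar -tzf "$1" >/dev/null 2>&1
}

# Self-contained demo on a small scratch archive.
demo=$(mktemp -d)
echo "payload" > "$demo/file.txt"
tar -czf "$demo/good.tar.gz" -C "$demo" file.txt

# Simulate an upload that was cut off after the first 20 bytes.
head -c 20 "$demo/good.tar.gz" > "$demo/truncated.tar.gz"

if tarball_ok "$demo/good.tar.gz"; then echo "good.tar.gz: ok"; fi
if ! tarball_ok "$demo/truncated.tar.gz"; then echo "truncated.tar.gz: damaged"; fi
```

On the real node the same check would be tarball_ok /home/ruoze/software/hadoop-2.6.0-cdh5.16.2.tar.gz.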
3.3 Extract the tarball
[ruoze@ruozedata001 software]$ tar -xzvf hadoop-2.6.0-cdh5.16.2.tar.gz -C ../app/
[ruoze@ruozedata001 software]$ cd ../
[ruoze@ruozedata001 ~]$ cd app
[ruoze@ruozedata001 app]$ ll
total 4
drwxr-xr-x 14 ruoze ruoze 4096 Jun 3 19:11 hadoop-2.6.0-cdh5.16.2
[ruoze@ruozedata001 app]$ ln -s hadoop-2.6.0-cdh5.16.2 hadoop
[ruoze@ruozedata001 app]$ ll
total 4
lrwxrwxrwx 1 ruoze ruoze 22 Nov 27 21:37 hadoop -> hadoop-2.6.0-cdh5.16.2
drwxr-xr-x 14 ruoze ruoze 4096 Jun 3 19:11 hadoop-2.6.0-cdh5.16.2
[ruoze@ruozedata001 app]$
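The hadoop symlink keeps every later path (HADOOP_HOME, scripts) independent of the version number. A minimal sketch of how such a link can be repointed during an upgrade, run in a scratch directory; the second directory name, hadoop-9.9.9-example, is made up for the demo.

```shell
# Scratch directory standing in for ~/app.
demo=$(mktemp -d)
mkdir "$demo/hadoop-2.6.0-cdh5.16.2" "$demo/hadoop-9.9.9-example"

# Same link as in the transcript above.
ln -s hadoop-2.6.0-cdh5.16.2 "$demo/hadoop"
readlink "$demo/hadoop"

# Repoint the link. -f replaces the existing link, and -n stops ln from
# following it into the old directory and creating the new link inside.
ln -sfn hadoop-9.9.9-example "$demo/hadoop"
readlink "$demo/hadoop"
```

Anything that refers to ~/app/hadoop keeps working after the switch without being edited.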
https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/SingleCluster.html
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2/hadoop-project-dist/hadoop-common/SingleCluster.html
3.4 Prerequisites
Required Software:
a. Java
mkdir /usr/java
cd /usr/java
Upload the jdk-8u45-linux-x64.gz package with rz
[root@ruozedata001 java]# tar -xzvf jdk-8u45-linux-x64.gz
[root@ruozedata001 java]# ll
total 352420
drwxr-xr-x 8 root root 89 Nov 18 14:04 jdk-11.0.1
drwxr-xr-x 8 10 143 4096 Apr 11 2015 jdk1.8.0_45
-rw-r--r-- 1 root root 173271626 Nov 16 13:10 jdk-8u45-linux-x64.gz
-rw-r--r-- 1 root root 187599951 Nov 18 14:00 openjdk-11.0.1_linux-x64_bin.tar.gz
[root@ruozedata001 java]# chown -R root:root jdk1.8.0_45
[root@ruozedata001 java]# ll
total 352420
drwxr-xr-x 8 root root 89 Nov 18 14:04 jdk-11.0.1
drwxr-xr-x 8 root root 4096 Apr 11 2015 jdk1.8.0_45
-rw-r--r-- 1 root root 173271626 Nov 16 13:10 jdk-8u45-linux-x64.gz
-rw-r--r-- 1 root root 187599951 Nov 18 14:00 openjdk-11.0.1_linux-x64_bin.tar.gz
[root@ruozedata001 java]#
vi /etc/profile
#env
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH
[root@ruozedata001 java]# source /etc/profile
[root@ruozedata001 java]# which java
/usr/java/jdk1.8.0_45/bin/java
b. ssh must be installed
3.5 Set JAVA_HOME explicitly
[ruoze@ruozedata001 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_45
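The same edit can be scripted instead of made in vi. This is a sketch only: set_java_home is a hypothetical helper, it assumes GNU sed (for -i), and the demo edits a scratch file whose single line imitates the stock export line rather than the real hadoop-env.sh.

```shell
# set_java_home FILE JAVA_HOME: rewrite the export line, or append one if absent.
set_java_home() {
  file=$1
  java_home=$2
  if grep -q '^export JAVA_HOME=' "$file"; then
    sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$file"
  else
    echo "export JAVA_HOME=${java_home}" >> "$file"
  fi
}

# Demo on a scratch copy.
env_file=$(mktemp)
echo 'export JAVA_HOME=${JAVA_HOME}' > "$env_file"
set_java_home "$env_file" /usr/java/jdk1.8.0_45
grep '^export JAVA_HOME=' "$env_file"
```

On the real node the target file would be /home/ruoze/app/hadoop/etc/hadoop/hadoop-env.sh.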
Now you are ready to start your Hadoop cluster in one of the three supported modes:
Local (Standalone) Mode: not used
Pseudo-Distributed Mode: for learning and testing, one machine
Fully-Distributed Mode: cluster mode, for production
[root@ruozedata001 java]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.3 ruozedata001
[root@ruozedata001 java]#
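The hosts file matters because fs.defaultFS uses the hostname: if ruozedata001 mapped to a loopback address, the NameNode would typically be reachable only from the machine itself. The check below is a rough sketch; host_ip is a hypothetical helper and the demo reads a scratch copy of the file shown above.

```shell
# host_ip FILE NAME: print the first address mapped to NAME in a hosts-format file.
host_ip() {
  awk -v name="$2" '
    $1 !~ /^#/ {
      for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit }
    }
  ' "$1"
}

# Scratch copy of the hosts file from these notes.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.3 ruozedata001
EOF

ip=$(host_ip "$hosts" ruozedata001)
case "$ip" in
  127.*|::1|"") echo "ruozedata001 maps to '$ip': check the hosts file" ;;
  *)            echo "ruozedata001 -> $ip" ;;
esac
```

On the real node, host_ip /etc/hosts ruozedata001 or getent hosts ruozedata001 gives the same information.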
3.6 Configuration files
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ruozedata001:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
<!-- xxxx
yyy
-->
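After editing, it helps to read a value back and confirm it is what you intended. With Hadoop on the PATH, hdfs getconf -confKey fs.defaultFS does this properly. The sketch below is a line-based stand-in for machines without Hadoop: conf_value is a hypothetical helper, it is not an XML parser, and it assumes each <value> sits on one line at or after its <name>, as in the files above.

```shell
# conf_value FILE NAME: print the <value> that follows <name>NAME</name>.
conf_value() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { hit = 1 }
    hit && match($0, /<value>.*<\/value>/) {
      # Strip the 7-character <value> and the 8-character </value>.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Scratch copy of core-site.xml from these notes.
core_site=$(mktemp)
cat > "$core_site" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ruozedata001:9000</value>
</property>
</configuration>
EOF

conf_value "$core_site" fs.defaultFS
```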
3.7 Passwordless ssh trust
ssh ruozedata001 date
user ruoze: authorized_keys needs permission 600
user root: works without setting 600
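One common way to set up the trust for the ruoze user is sketched below, following the usual ssh-keygen / authorized_keys / chmod sequence. It assumes an RSA key at the default path ~/.ssh/id_rsa and an empty passphrase; adjust both if your site has other rules. sshd usually rejects the key when authorized_keys is group-writable, which is why the chmod matters for a normal user.

```shell
ssh_dir="$HOME/.ssh"
mkdir -p "$ssh_dir"
chmod 700 "$ssh_dir"

# Generate a key pair with an empty passphrase unless one already exists.
[ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa"

# Trust our own public key, without adding it twice on a re-run.
pub=$(cat "$ssh_dir/id_rsa.pub")
grep -qxF "$pub" "$ssh_dir/authorized_keys" 2>/dev/null \
  || echo "$pub" >> "$ssh_dir/authorized_keys"

# Tighten the permissions on the trust file.
chmod 600 "$ssh_dir/authorized_keys"
```

Afterwards ssh ruozedata001 date should print the date without asking for a password.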
3.8 Hadoop environment variables
[ruoze@ruozedata001 ~]$
[ruoze@ruozedata001 ~]$ vi .bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=
export HADOOP_HOME=/home/ruoze/app/hadoop
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
[ruoze@ruozedata001 ~]$ source .bashrc
[ruoze@ruozedata001 ~]$ which hadoop
~/app/hadoop/bin/hadoop
[ruoze@ruozedata001 ~]$
3.9 Format the NameNode
[ruoze@ruozedata001 ~]$ hdfs namenode -format
has been successfully formatted.
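Formatting is a one-time step: running it again on a node that already holds data wipes the NameNode metadata. A small guard such as the following can help. It is a sketch: is_formatted is a hypothetical helper, and name_dir assumes the default location /tmp/hadoop-<user>/dfs/name, which applies only while hadoop.tmp.dir and dfs.namenode.name.dir are left unset, as in the configuration in these notes.

```shell
# is_formatted DIR: succeed if DIR already holds NameNode metadata.
is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Assumed default metadata directory for the current user.
name_dir="/tmp/hadoop-$(id -un)/dfs/name"

if is_formatted "$name_dir"; then
  echo "already formatted: $name_dir (not formatting again)"
elif command -v hdfs >/dev/null 2>&1; then
  hdfs namenode -format
else
  echo "hdfs is not on PATH; nothing done"
fi
```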
3.10 First start
[ruoze@ruozedata001 ~]$
[ruoze@ruozedata001 ~]$ start-dfs.sh
19/11/27 22:18:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [ruozedata001]
ruozedata001: starting namenode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-namenode-ruozedata001.out
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:OLqoaMxlGFbCq4sC9pYgF+FdbcXHbEbtSrnMiGGFbVw.
ECDSA key fingerprint is MD5:d3:5b:4a:ef:8e:00:41:a0:5e:80:ef:75:76:8a:a3:49.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: starting datanode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-datanode-ruozedata001.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:OLqoaMxlGFbCq4sC9pYgF+FdbcXHbEbtSrnMiGGFbVw.
ECDSA key fingerprint is MD5:d3:5b:4a:ef:8e:00:41:a0:5e:80:ef:75:76:8a:a3:49.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-secondarynamenode-ruozedata001.out
19/11/27 22:19:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ruoze@ruozedata001 ~]$ jps
151203 DataNode
151530 Jps
150875 NameNode
151406 SecondaryNameNode
[ruoze@ruozedata001 ~]$
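Reading the jps list by eye works for one node; a script can do the same check. The helper below is a hypothetical sketch that reads jps output on standard input and prints every expected HDFS daemon that is missing. It compares the whole second column, so NameNode is not confused with SecondaryNameNode.

```shell
# missing_daemons: read `jps` output on stdin, print each expected daemon that is absent.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode; do
    printf '%s\n' "$out" \
      | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
      || echo "$d"
  done
}

# Demo with the jps output from the transcript above: nothing is missing.
sample='151203 DataNode
151530 Jps
150875 NameNode
151406 SecondaryNameNode'
printf '%s\n' "$sample" | missing_daemons

# Live check, only where a JDK with jps is installed.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```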
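Pitfall: the first start stops to ask for host-key confirmation for localhost and 0.0.0.0; answering yes records them in known_hosts: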
[ruoze@ruozedata001 .ssh]$ cat known_hosts
ruozedata001,192.168.0.3 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBG9N5IGRTqwqGGHZcyNJ2i7lG54isK19GMq+Zw3VDIr64dS2sqoZ79n+8Ibz8ZJsU1aNiaJJTzYUvuxZv5W4iHQ=
localhost ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBG9N5IGRTqwqGGHZcyNJ2i7lG54isK19GMq+Zw3VDIr64dS2sqoZ79n+8Ibz8ZJsU1aNiaJJTzYUvuxZv5W4iHQ=
0.0.0.0 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBG9N5IGRTqwqGGHZcyNJ2i7lG54isK19GMq+Zw3VDIr64dS2sqoZ79n+8Ibz8ZJsU1aNiaJJTzYUvuxZv5W4iHQ=
[ruoze@ruozedata001 .ssh]$
ssh ruozedata001 date
3.11 Make the DN and SNN start as ruozedata001 too
[ruoze@ruozedata001 ~]$ stop-dfs.sh
[ruoze@ruozedata001 ~]$ start-dfs.sh
19/11/27 22:21:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [ruozedata001]
ruozedata001: starting namenode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-namenode-ruozedata001.out
localhost: starting datanode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-datanode-ruozedata001.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-secondarynamenode-ruozedata001.out
19/11/27 22:22:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ruoze@ruozedata001 ~]$ jps
152354 NameNode
152484 DataNode
152660 SecondaryNameNode
153194 Jps
[ruoze@ruozedata001 ~]$
NN: ruozedata001, controlled by fs.defaultFS
DN: controlled by the slaves file
SNN: controlled by these properties in hdfs-site.xml:
dfs.namenode.secondary.http-address
ruozedata001:50090
dfs.namenode.secondary.https-address
ruozedata001:50091
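Written out as files, the DN and SNN settings above might look like the sketch below. It writes into a scratch directory so it can be run safely; on the real node the directory would be /home/ruoze/app/hadoop/etc/hadoop. The file name snn-properties.xml is invented for the demo: the two property elements belong inside <configuration> in hdfs-site.xml.

```shell
# Scratch directory standing in for etc/hadoop.
conf_dir=$(mktemp -d)

# DN: one hostname per line in the slaves file.
echo "ruozedata001" > "$conf_dir/slaves"

# SNN: properties to paste inside <configuration> in hdfs-site.xml.
cat > "$conf_dir/snn-properties.xml" <<'EOF'
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>ruozedata001:50090</value>
</property>
<property>
    <name>dfs.namenode.secondary.https-address</name>
    <value>ruozedata001:50091</value>
</property>
EOF

cat "$conf_dir/slaves" "$conf_dir/snn-properties.xml"
```

HDFS has to be restarted (stop-dfs.sh, then start-dfs.sh) for the new hostnames to show up in the startup log.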
[ruoze@ruozedata001 hadoop]$ start-dfs.sh
19/11/27 22:40:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [ruozedata001]
ruozedata001: starting namenode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-namenode-ruozedata001.out
ruozedata001: starting datanode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-datanode-ruozedata001.out
Starting secondary namenodes [ruozedata001]
ruozedata001: starting secondarynamenode, logging to /home/ruoze/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-ruoze-secondarynamenode-ruozedata001.out
19/11/27 22:40:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ruoze@ruozedata001 hadoop]$
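3.12 HDFS roles
namenode: the name node, the "boss" and master node; read and write requests go through it first
datanode: the data node, the "worker" and slave node; stores and retrieves data
secondarynamenode: the secondary name node, the "second in command"; checkpoints about one hour behind (h+1)
Most big data components use a master/slave architecture, e.g. HDFS and HBase (in HBase, read and write requests do not go through the master process)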
http://ruozedata001:50070
Safemode is off.
3.13 Basic hadoop commands
fs.defaultFS
hdfs://ruozedata001:9000
hadoop fs -mkdir /
hadoop fs -put
hadoop fs -get
hadoop fs -cat
hadoop fs -rm
hadoop fs -ls
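A short round trip through the commands listed above, as a sketch. The HDFS path /demo and the file names are made up for the example, and the Hadoop part is skipped when the hadoop command is not on the PATH. Paths without a scheme resolve against fs.defaultFS, so /demo here means hdfs://ruozedata001:9000/demo.

```shell
# Local scratch file to upload.
work=$(mktemp -d)
echo "hello hdfs" > "$work/hello.txt"

if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir -p /demo                         # create a directory
  hadoop fs -put "$work/hello.txt" /demo/           # local -> HDFS
  hadoop fs -ls /demo                               # list it
  hadoop fs -cat /demo/hello.txt                    # print the file
  hadoop fs -get /demo/hello.txt "$work/copy.txt"   # HDFS -> local
  hadoop fs -rm -r /demo                            # clean up
else
  echo "hadoop is not on PATH; skipping the HDFS part"
fi
```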