HDFS in Depth (Part 1)
1. Three daemons
NameNode (NN): the name node -> the first component a client talks to
DataNode (DN): the data node -> stores the actual data
Secondary NameNode (SNN): the secondary name node
2. Blocks
Size:
64 MB (default in Hadoop 1.x)
128 MB (default in Hadoop 2.x)
Parameter: dfs.blocksize
3. Replication factor
dfs.replication (config parameter): 3
Each block is stored as 3 replicas.
4. Example
e.g.:
A 130 MB file: two blocks, 128 MB + 2 MB
Actual storage: 130 MB * 3 (three replicas are kept)
Number of blocks: 6
Teaser (interview question):
The trailing 2 MB occupies a block of its own - is that a problem?
Answer: each block's metadata is held in the NN's memory, so too many blocks can cause an OOM (out of memory).
What if the files are all small, say 3 MB or 5 MB each?
Merge the small files, or design the job so each output file ends up around 120-128 MB.
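The arithmetic above can be sketched in Python. This is a toy illustration, not HDFS code; the 128 MB block size and replication factor 3 are the defaults discussed above:

```python
# Toy illustration of HDFS block splitting (not actual HDFS code).
BLOCK_SIZE_MB = 128   # dfs.blocksize default in Hadoop 2.x
REPLICATION = 3       # dfs.replication default

def split_into_blocks(file_size_mb):
    """Return the sizes of the blocks a file is split into."""
    full, tail = divmod(file_size_mb, BLOCK_SIZE_MB)
    return [BLOCK_SIZE_MB] * full + ([tail] if tail else [])

blocks = split_into_blocks(130)          # a 130 MB file
print(blocks)                            # [128, 2]
print(len(blocks) * REPLICATION)         # 6 blocks stored cluster-wide
print(sum(blocks) * REPLICATION)         # 390 MB of raw disk used
```

Note that a 2 MB tail block only occupies 2 MB on disk; the cost is the extra metadata entry in NN memory, which is why many small files are the real danger.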
5. Architecture
NameNode: the file system namespace (interview question)
What exactly does the NameNode store about HDFS?
1. File names
2. The directory structure
3. File attributes (permissions, creation time, replication factor)
4. Which blocks each file maps to -> and which DataNode nodes hold those blocks
This mapping is not persisted. While the cluster starts up and runs, DataNodes periodically send blockReports to the NN,
and the NN maintains the mapping dynamically in [memory].
Storage: the NN maintains the file system tree and every file and directory in it; this information is persisted on local disk as two kinds of files:
the namespace image file (fsimage) + the edit log (editlog)
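A minimal sketch of how the NN can rebuild the block-to-DataNode mapping purely from blockReports. The data structures and names here are hypothetical, not Hadoop source:

```python
from collections import defaultdict

# Hypothetical in-memory structure: block ID -> set of DataNodes holding it.
# It is never written to disk; it is rebuilt from blockReports at runtime.
block_locations = defaultdict(set)

def handle_block_report(datanode, block_ids):
    """Replace this DataNode's entries with what it just reported."""
    # Drop stale entries left over from this DN's previous report.
    for holders in block_locations.values():
        holders.discard(datanode)
    for blk in block_ids:
        block_locations[blk].add(datanode)

# Two DNs report in; the mapping now exists only in memory.
handle_block_report("dn1", ["blk_1", "blk_2"])
handle_block_report("dn2", ["blk_1"])
print(sorted(block_locations["blk_1"]))   # ['dn1', 'dn2']
```

This is why a freshly restarted NN stays in safe mode until enough blockReports arrive: the mapping has to be reconstructed before writes are safe.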
DataNode:
Stores: data blocks + block checksums
Communication with the NN:
1. Sends a heartbeat every 3 seconds (interval is configurable)
2. Sends a blockReport every 10 heartbeats (every 30 s)
Secondary NameNode:
Stores: the namespace image file (fsimage) + the edit log (editlog)
Role: periodically merges fsimage + editlog into a new fsimage and pushes it to the NN; each merge is called a checkpoint
Parameter: dfs.namenode.checkpoint.period: 3600 seconds
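Conceptually, a checkpoint replays the edit log on top of the last fsimage to produce a new fsimage. A toy model of that merge, with a dict standing in for the image and a list of operations for the log (nothing here matches Hadoop's actual on-disk formats):

```python
# Toy checkpoint: new fsimage = old fsimage + replayed editlog.
fsimage = {"/text.log": {"replication": 1}}            # last saved image
editlog = [("create", "/xk.log"), ("delete", "/text.log")]

def checkpoint(image, log):
    """Replay the edit log onto a copy of the image; return the new image."""
    new_image = dict(image)
    for op, path in log:
        if op == "create":
            new_image[path] = {"replication": 1}
        elif op == "delete":
            new_image.pop(path, None)
    return new_image                                   # pushed back to the NN

print(checkpoint(fsimage, editlog))   # {'/xk.log': {'replication': 1}}
```

After the merge the editlog can be truncated, which is the point of checkpointing: it keeps NN restart time bounded, since a restart only has to load the latest fsimage plus a short editlog.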
6. Rack
A cabinet of machines; each rack typically has its own IP range.
7. File read flow -> FSDataInputStream (interview question)
e.g.:
[hadoop@xkhadoop ~]$ hdfs hfs -ls /
Error: Could not find or load main class hfs
[hadoop@xkhadoop ~]$ hdfs dfs -ls /
Found 3 items
-rw-r--r-- 1 hadoop supergroup 4 2018-11-28 22:16 /text.log
drwx------ - hadoop supergroup 0 2018-11-25 20:10 /tmp
drwxr-xr-x - hadoop supergroup 0 2018-11-25 20:10 /user
[hadoop@xkhadoop ~]$ hdfs dfs -cat /text.log
123
7.1 The client calls FileSystem.open(filePath), which makes an RPC call to the NN; the NN returns the locations of some or all of the file's blocks,
and the client receives an FSDataInputStream object;
7.2 The client calls read() on that stream, which connects to the nearest DataNode holding the first block and streams its data back;
7.3 When a block ends, the stream closes that DataNode connection and opens one to the best DataNode for the next block, transparently to the client;
7.4 When the whole read is done, the client calls close() on the stream.
8. File write flow -> FSDataOutputStream (interview question)
[hadoop@xkhadoop ~]$ echo "43564" >xk.log
[hadoop@xkhadoop ~]$ hdfs dfs -put xk.log /
[hadoop@xkhadoop ~]$ hdfs dfs -ls /
Found 4 items
-rw-r--r-- 1 hadoop supergroup 4 2018-11-28 22:16 /text.log
drwx------ - hadoop supergroup 0 2018-11-25 20:10 /tmp
drwxr-xr-x - hadoop supergroup 0 2018-11-25 20:10 /user
-rw-r--r-- 1 hadoop supergroup 6 2018-12-03 00:39 /xk.log
[hadoop@xkhadoop002 hadoop]$ bin/hdfs dfs -put xk.log /xxx    : uploads the file to HDFS and stores it in the HDFS root directory under the name xxx
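During a write, the client streams packets to the first DataNode, which forwards them down a pipeline of replica DataNodes, and acks flow back up the pipeline to the client. A toy simulation of that idea, with in-memory lists standing in for DataNodes (hypothetical names, not Hadoop's actual protocol code):

```python
# Toy write pipeline: client -> dn1 -> dn2 -> dn3, ack flows back to client.
pipeline = {"dn1": [], "dn2": [], "dn3": []}   # each DN's stored packets

def write_packet(packet, dns=("dn1", "dn2", "dn3")):
    """Forward the packet down the pipeline; return True once all DNs ack."""
    for dn in dns:                  # each DN stores, then forwards downstream
        pipeline[dn].append(packet)
    return all(packet in pipeline[dn] for dn in dns)   # ack back to client

for pkt in ["pkt0", "pkt1"]:
    assert write_packet(pkt)        # client proceeds only after the ack

print(pipeline["dn3"])              # ['pkt0', 'pkt1'] - 3 full replicas
```

The pipeline is why dfs.replication = 3 does not triple the client's upload traffic: the client sends each packet once, and the DataNodes replicate it among themselves.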
9. The jps command: inspecting Java processes
9.1 Normal case
e.g.:
[hadoop@xkhadoop ~]$ jps
4752 ResourceManager
5365 NameNode
5579 SecondaryNameNode
5182 DataNode
5038 NodeManager
8127 Jps
[hadoop@xkhadoop ~]$ ps -ef |grep 4752
hadoop 4752 1 0 Dec02 ? 00:02:31 /usr/java/jdk1.8.0_45/bin/java -Dproc_resourcemanager -Xmx1000m -Dhadoop.log.dir=/opt/software/hadoop/logs -Dyarn.log.dir=/opt/software/hadoop/logs -Dhadoop.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.home.dir= -Dyarn.id.str=hadoop -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/software/hadoop/lib/native -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/opt/software/hadoop/logs -Dyarn.log.dir=/opt/software/hadoop/logs -Dhadoop.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.home.dir=/opt/software/hadoop -Dhadoop.home.dir=/opt/software/hadoop -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/software/hadoop/lib/native -classpath /opt/software/hadoop-2.8.1/etc/hadoop:/opt/software/hadoop-2.8.1/etc/hadoop:/opt/software/hadoop-2.8.1/etc/hadoop:/opt/software/hadoop/share/hadoop/common/lib/*:/opt/software/hadoop/share/hadoop/common/*:/opt/software/hadoop/share/hadoop/hdfs:/opt/software/hadoop/share/hadoop/hdfs/lib/*:/opt/software/hadoop/share/hadoop/hdfs/*:/opt/software/hadoop/share/hadoop/yarn/lib/*:/opt/software/hadoop/share/hadoop/yarn/*:/opt/software/hadoop/share/hadoop/mapreduce/lib/*:/opt/software/hadoop/share/hadoop/mapreduce/*:/opt/software/hadoop/contrib/capacity-scheduler/*.jar:/opt/software/hadoop/contrib/capacity-scheduler/*.jar:/opt/software/hadoop/contrib/capacity-scheduler/*.jar:/opt/software/hadoop/share/hadoop/yarn/*:/opt/software/hadoop/share/hadoop/yarn/lib/*:/opt/software/hadoop-2.8.1/etc/hadoop/rm-config/log4j.properties org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
hadoop 8141 7813 0 00:42 pts/1 00:00:00 grep 4752
9.2 Abnormal case
e.g.:
[root@xkhadoop ~]# jps
4752 -- process information unavailable
5365 -- process information unavailable
8170 Jps
5579 -- process information unavailable
5182 -- process information unavailable
5038 -- process information unavailable
[root@xkhadoop ~]# ps -ef |grep 4752
hadoop 4752 1 0 Dec02 ? 00:02:31 /usr/java/jdk1.8.0_45/bin/java -Dproc_resourcemanager -Xmx1000m -Dhadoop.log.dir=/opt/software/hadoop/logs -Dyarn.log.dir=/opt/software/hadoop/logs -Dhadoop.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.home.dir= -Dyarn.id.str=hadoop -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/software/hadoop/lib/native -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/opt/software/hadoop/logs -Dyarn.log.dir=/opt/software/hadoop/logs -Dhadoop.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.log.file=yarn-hadoop-resourcemanager-xkhadoop.log -Dyarn.home.dir=/opt/software/hadoop -Dhadoop.home.dir=/opt/software/hadoop -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/software/hadoop/lib/native -classpath /opt/software/hadoop-2.8.1/etc/hadoop:/opt/software/hadoop-2.8.1/etc/hadoop:/opt/software/hadoop-2.8.1/etc/hadoop:/opt/software/hadoop/share/hadoop/common/lib/*:/opt/software/hadoop/share/hadoop/common/*:/opt/software/hadoop/share/hadoop/hdfs:/opt/software/hadoop/share/hadoop/hdfs/lib/*:/opt/software/hadoop/share/hadoop/hdfs/*:/opt/software/hadoop/share/hadoop/yarn/lib/*:/opt/software/hadoop/share/hadoop/yarn/*:/opt/software/hadoop/share/hadoop/mapreduce/lib/*:/opt/software/hadoop/share/hadoop/mapreduce/*:/opt/software/hadoop/contrib/capacity-scheduler/*.jar:/opt/software/hadoop/contrib/capacity-scheduler/*.jar:/opt/software/hadoop/contrib/capacity-scheduler/*.jar:/opt/software/hadoop/share/hadoop/yarn/*:/opt/software/hadoop/share/hadoop/yarn/lib/*:/opt/software/hadoop-2.8.1/etc/hadoop/rm-config/log4j.properties org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
root 8181 8151 0 00:44 pts/2 00:00:00 grep 4752
The process turns out to have been started by the hadoop user.
Log in as the hadoop user and check whether the process is still needed.
If it is not needed:
first kill the process with kill -9,
then delete the matching per-process file from the directory below.
The directory is named hsperfdata_<username>:
[root@xkhadoop tmp]# cd /tmp/hsperfdata_hadoop/
[root@xkhadoop hsperfdata_hadoop]# ll
total 160
-rw-------. 1 hadoop hadoop 32768 Dec 3 00:47 4752
-rw-------. 1 hadoop hadoop 32768 Dec 3 00:47 5038
-rw-------. 1 hadoop hadoop 32768 Dec 3 00:46 5182
-rw-------. 1 hadoop hadoop 32768 Dec 3 00:47 5365
-rw-------. 1 hadoop hadoop 32768 Dec 3 00:47 5579
10. hadoop and hdfs file system commands
hadoop fs
is equivalent to
hdfs dfs
e.g.:
[hadoop@xkhadoop hadoop]$ bin/hdfs dfs -ls /
[hadoop@xkhadoop hadoop]$ bin/hdfs dfs -mkdir -p /xk/001
[hadoop@xkhadoop hadoop]$ bin/hdfs dfs -cat /test.log
[hadoop@xkhadoop hadoop]$ bin/hdfs dfs -put xk.log1 /xk/001
[hadoop@xkhadoop hadoop]$ bin/hdfs dfs -get /xk/001/xk.log1 /tmp/
[hadoop@xkhadoop hadoop]$ bin/hdfs dfs -get /xk/001/xk.log1 /tmp/xk.log123    (download and rename)
[-moveFromLocal <localsrc> ... <dst>]    like put, but the local source file is deleted afterwards
[-moveToLocal <src> <localdst>]          like get, but removing the HDFS source (not implemented in many Hadoop releases)
Delete:
1. Configure the trash
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
core-site.xml
fs.trash.interval : 10080    (in minutes; 10080 minutes = 7 days)
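A minimal core-site.xml fragment for the setting above (the retention value is the one used in this note; pick whatever fits your cluster):

```xml
<!-- core-site.xml: keep deleted files in the trash before final removal -->
<property>
    <name>fs.trash.interval</name>
    <value>10080</value>   <!-- minutes: 10080 = 7 days; 0 disables the trash -->
</property>
```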
2. Test
bin/hdfs dfs -rm -r -f /xxxx                -> goes into the trash and can be restored
bin/hdfs dfs -rm -r -f -skipTrash /xxxx     -> bypasses the trash and cannot be restored
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
11. HDFS start-up order (interview question)
start-all.sh
NN -> DN -> SNN
RM -> NM
12.hdfs dfsadmin
[hadoop@xkhadoop ~]$ hdfs dfsadmin -report
Manually put HDFS into safe mode. Safe mode is HDFS's self-protection mechanism: files can be read but not written.
[hadoop@xkhadoop ~]$ hdfs dfsadmin -safemode enter
Safe mode is ON
Leave safe mode (the current state can be checked with hdfs dfsadmin -safemode get):
[hadoop@xkhadoop ~]$ hdfs dfsadmin -safemode leave
Safe mode is OFF
13. Checking the cluster for missing replicas
[hadoop@xkhadoop ~]$ hdfs fsck /
Q:
1. Storage is distributed unevenly across multiple machines?
Solution: 1.1 No new machines are added; disks across the existing machines are unbalanced:
[hadoop@xkhadoop002 ~]$ hdfs dfsadmin -setBalancerBandwidth 52428800
Balancer bandwidth is set to 52428800
(52428800 bytes/s = 50 MB/s of balancing bandwidth per DataNode)
[hadoop@xkhadoop ~]$
[hadoop@xkhadoop sbin]$ ./start-balancer.sh
is equivalent to
[hadoop@xkhadoop sbin]$ hdfs balancer
On an Apache Hadoop cluster: schedule it with a shell script every night during the business low-traffic window
On a CDH cluster: this step can be skipped (CDH manages balancing)
1.2 New machines are added: the old machines have, say, 450 GB used out of 500 GB, while the new machines have 5 TB disks.
During the low-traffic window, first add the new machines to HDFS as DataNodes;
then take one old DN offline and wait for HDFS to heal itself, restoring every block to 3 replicas (this is the most network- and IO-intensive step, and the riskiest).
2. Disks within a single machine are unevenly used?
2.1 Whether or not disks were added, when a machine's disks are unbalanced:
Docs: https://hadoop.apache.org/docs/r3.0.0-alpha2/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html
This is a Hadoop 3.x feature, also available in CDH 5.12 and later
hdfs diskbalancer -plan node1.mycluster.com
hdfs diskbalancer -execute /system/diskbalancer/nodename.plan.json
Print the Hadoop classpath:
[hadoop@xkhadoop sbin]$ hadoop classpath