HDFS Disk Storage Policies and Reserved Space Configuration

1. HDFS Disk Storage Policies
1.1 Assign storage types to local directories
The data directory is tagged DISK, which the HOT policy uses; the data1 directory is tagged ARCHIVE, which the COLD policy uses. In hdfs-site.xml:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]/opt/beh/data/namenode/dfs/data,[ARCHIVE]/opt/beh/data/namenode/dfs/data1</value>
</property>

Restart HDFS:
$ stop-dfs.sh
$ start-dfs.sh
1.2 Assign storage policies to HDFS directories
List the storage policies HDFS supports:
$ hdfs storagepolicies -listPolicies
Block Storage Policies:
BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], replicationFallbacks=[]}
BlockStoragePolicy{WARM:5, storageTypes=[DISK, ARCHIVE], creationFallbacks=[DISK, ARCHIVE], replicationFallbacks=[DISK, ARCHIVE]}
BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
BlockStoragePolicy{ONE_SSD:10, storageTypes=[SSD, DISK], creationFallbacks=[SSD, DISK], replicationFallbacks=[SSD, DISK]}
BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}
BlockStoragePolicy{LAZY_PERSIST:15, storageTypes=[RAM_DISK, DISK], creationFallbacks=[DISK], replicationFallbacks=[DISK]}
Create two HDFS directories:
$ hadoop fs -mkdir /Cold_data
$ hadoop fs -mkdir /Hot_data
Assign a storage policy to each directory:
$ hdfs storagepolicies -setStoragePolicy -path hdfs://breath:9000/Cold_data -policy COLD
Set storage policy COLD on hdfs://breath:9000/Cold_data
$ hdfs storagepolicies -setStoragePolicy -path hdfs://breath:9000/Hot_data -policy HOT
Set storage policy HOT on hdfs://breath:9000/Hot_data
Verify that both directories have the expected policy:
$ hdfs storagepolicies -getStoragePolicy -path /Cold_data
The storage policy of /Cold_data:
BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], creationFallbacks=[], replicationFallbacks=[]}
$ hdfs storagepolicies -getStoragePolicy -path /Hot_data
The storage policy of /Hot_data:
BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
1.3 Storage test
Check the size of the local storage directories before uploading anything:
$ cd /opt/beh/data/namenode/dfs
$ du -sh *
38M data
16K data1
30M name
14M namesecondary
Generate a 1000 MB file:
$ dd if=/dev/zero of=test.txt bs=1000M count=1

1+0 records in
1+0 records out
1048576000 bytes (1.0 GB) copied, 3.11214 s, 337 MB/s
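As a sanity check, the byte count dd reports can be derived directly: `bs=1000M` means 1000 MiB per block, and `count=1` writes one block. A quick sketch:

```shell
# dd's bs=1000M means 1000 MiB per block; with count=1 the file size
# should match the 1048576000 bytes reported above.
mib=$((1024 * 1024))
file_bytes=$((1000 * mib))
echo "$file_bytes bytes"   # 1048576000 bytes
```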
Upload the generated file to /Cold_data:
$ hadoop fs -put test.txt /Cold_data
Check the storage directory sizes again:
$ du -sh *
38M data
1008M data1
30M name
14M namesecondary
1.4 Test result
The uploaded file was stored entirely under the data1 directory.

The HDFS directory /Cold_data uses the COLD policy, which maps to the ARCHIVE storage type assigned to data1 in hdfs-site.xml, so the blocks landed exactly where expected.

2. HDFS Reserved Space Configuration
2.1 Modify the parameters
Edit hdfs-site.xml and add the reserved-space parameter:

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>32212254720</value>
</property>

dfs.datanode.du.reserved is given in bytes; 32212254720 reserves 30 GiB per volume for non-HDFS use.

Also change dfs.datanode.data.dir to keep only a single local storage directory:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>[ARCHIVE]/opt/beh/data/namenode/dfs/data</value>
</property>
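The reserved value is easiest to verify by converting gibibytes to bytes, since dfs.datanode.du.reserved takes a raw byte count:

```shell
# dfs.datanode.du.reserved is expressed in bytes;
# 30 GiB = 30 * 1024^3 bytes.
reserved_bytes=$((30 * 1024 * 1024 * 1024))
echo "$reserved_bytes"   # 32212254720
```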

Restart HDFS:

$ stop-dfs.sh
$ start-dfs.sh
2.2 Upload files
Check the disk space:
$ df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root 46G 14G 32G 31% /
devtmpfs 7.8G 0 7.8G 0% /dev
tmpfs 7.8G 0 7.8G 0% /dev/shm
tmpfs 7.8G 8.5M 7.8G 1% /run
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/vda1 497M 125M 373M 25% /boot
tmpfs 1.6G 0 1.6G 0% /run/user/0
tmpfs 1.6G 0 1.6G 0% /run/user/1000
Upload files to HDFS, one 2 GB file at a time:
$ hadoop fs -put test1.txt /Cold_data/test1.txt
$ hadoop fs -put test1.txt /Cold_data/test2.txt
...
$ hadoop fs -put test1.txt /Cold_data/test7.txt
$ hadoop fs -put test1.txt /Cold_data/test8.txt
16/11/12 16:30:54 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2239)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1451)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
16/11/12 16:30:54 INFO hdfs.DFSClient: Abandoning BP-456596110-192.168.134.129-1450512233024:blk_1073744076_3254
16/11/12 16:30:54 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[10.10.1.31:50010,DS-01c3c362-44f4-46eb-a8d8-57d2c2d5f196,ARCHIVE]
16/11/12 16:30:54 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /Cold_data/test8.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1541)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3289)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:668)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)

    at org.apache.hadoop.ipc.Client.call(Client.java:1468)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)

put: File /Cold_data/test8.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
Analysis
At this point the data directory /opt/beh/data/namenode/dfs looks like this:

$ cd /opt/beh/data/namenode/dfs
$ du -sh *
15G data
12K data1
34M name
19M namesecondary
Check the disk space at this point:
$ df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root 46G 27G 19G 59% /
devtmpfs 7.8G 0 7.8G 0% /dev
tmpfs 7.8G 0 7.8G 0% /dev/shm
tmpfs 7.8G 8.5M 7.8G 1% /run
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/vda1 497M 125M 373M 25% /boot
tmpfs 1.6G 0 1.6G 0% /run/user/0
tmpfs 1.6G 0 1.6G 0% /run/user/1000
2.3 Summary
The error shows that the reserved-space setting took effect, but df also shows that the free space remaining on the local filesystem is not equal to the configured reserved space.

HDFS computes the capacity of a data directory from the total size of the disk the directory sits on (here, 46 GB for /), not from the directory's own free space. The space HDFS believes it has left is therefore:

total disk capacity (46 GB) - space already used by HDFS (15 GB) = 31 GB

With 30 GB reserved, only about 1 GB remains usable by HDFS, so uploading another 2 GB file produces the error above.

Because this test wrote to the / filesystem, which other non-HDFS data also occupies, the numbers are approximate. When each HDFS data directory maps one-to-one to a dedicated disk, HDFS stops writing to a disk once its remaining free space falls to roughly the configured reserved value.
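The arithmetic above can be sketched directly; the GB figures are the rounded values taken from the df and du output in this test:

```shell
# How the DataNode sees this volume (values in GB, rounded from df/du above):
capacity=46      # total size of the disk backing the data directory
dfs_used=15      # space already consumed by HDFS block data
reserved=30      # dfs.datanode.du.reserved, converted to GB
available=$((capacity - dfs_used - reserved))
echo "available: ${available} GB"   # 1 GB -- too little for a 2 GB file
```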
