MapReduce job fails with CannotObtainBlockLengthException: Cannot obtain block length for LocatedBlock

The error log is as follows:

Error: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:271)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:144)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:205)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:191)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:257)
        ... 11 more
Caused by: org.apache.hadoop.hdfs.CannotObtainBlockLengthException: Cannot obtain block length for LocatedBlock{BP-438308737--1615993069368:blk_1073893685_152906; getBlockSize()=949; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[:9866,DS-3c8778e1-fc6a-46f8-b774-a270aa727cce,DISK], DatanodeInfoWithStorage[:9866,DS-1cbdb041-f016-4b4a-8054-2f10171f2856,DISK], DatanodeInfoWithStorage[:9866,DS-843c0421-6683-4070-a6a1-1b761ac0ad28,DISK]]}
        at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:364)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:271)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:202)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:186)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1012)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:317)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:313)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:325)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:898)
        at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
        at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
        ... 16 more

Cause analysis:

I hit this error while running a Hive query. It is thrown when MapReduce reads its input data and indicates a problem with one of the file's blocks: the HDFS client cannot determine the length of the last block.

In my case the root cause is a known pitfall when Flume writes to HDFS. I had shut down HDFS for a while, so the files Flume was writing were never closed properly and their leases were never released, leaving the last block in an under-construction state. We need to find these files and repair them.

Solution

1. Run `hdfs fsck /user/data -openforwrite` to find files that are stuck open for write. Here my target directory is /user/data, and you can see that many of my files are in the open state. Note that some of these are files currently being written (today's .tmp files): those are normal and must not be touched. Only repair the stale files left over from the previous day — for example, the 949-byte file below matches the getBlockSize()=949 in the stack trace.

hdfs fsck /user/data -openforwrite 
Connecting to namenode via http://namenode:9870/fsck?ugi=root&openforwrite=1&path=%2Fuser%2Fdata                  
FSCK started by root (auth:SIMPLE) from / for path /user/data at Tue May 18 09:31:48 CST 2021          
/user/data/corax/odl/odl_corax_simulation_dinc_xzh/FlumeData.1621243169597 77 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/corax/odl/odl_corax_simulation_dinc_xzh/FlumeData.1621300542099.tmp 39 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/dove/odl_dove_driver_dinc_liquan/dt=20210517/FlumeData.1621242283471 5984 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/dove/odl_dove_driver_dinc_liquan/dt=20210518/FlumeData.1621299630705.tmp 3180 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/owl/odl_owl_result_dinc_liquan/dt=20210517/FlumeData.1621242953065 949 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/owl/odl_owl_result_dinc_liquan/dt=20210518/FlumeData.1621299637978.tmp 963 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/owl/odl_owl_x_dinc_liquan/dt=20210517/FlumeData.1621242703709 725 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/owl/odl_owl_x_dinc_liquan/dt=20210518/FlumeData.1621299655555.tmp 729 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_bestdispatch_dinc_liquan/dt=20210517/FlumeData.1621242397308 752 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_bestdispatch_dinc_liquan/dt=20210518/FlumeData.1621299894142.tmp 661 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_dispatch_dinc_liquan/dt=20210517/FlumeData.1621242394169 2736 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_dispatch_dinc_liquan/dt=20210518/FlumeData.1621299850021.tmp 1331 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_driver_dinc_liquan/dt=20210517/FlumeData.1621242258264 5356 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_driver_dinc_liquan/dt=20210518/FlumeData.1621299640681.tmp 1479 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_order_dinc_liquan/dt=20210517/FlumeData.1621242387643 279 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
/user/data/wildgoose/odl_wildgoose_order_dinc_liquan/dt=20210518/FlumeData.1621299766144.tmp 279 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:
Status: HEALTHY
 Number of data-nodes:  3
 Number of racks:               1
 Total dirs:                    644
 Total symlinks:                0

Replicated Blocks:
 Total size:    13585860581 B
 Total files:   9459
 Total blocks (validated):      9255 (avg. block size 1467948 B)
 Minimally replicated blocks:   9239 (99.82712 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.9948137
 Missing blocks:                0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
 Blocks queued for replication: 0
FSCK ended at Tue May 18 09:31:48 CST 2021 in 112 milliseconds
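The fsck report above can be turned into a clean list of candidate files with a little shell filtering. A minimal sketch follows (assumptions: the report format matches the output above, and a hypothetical sample line stands in for the real `hdfs fsck` call; in production replace the `echo` with `hdfs fsck /user/data -openforwrite`). Flume's in-progress `.tmp` files are excluded, since they are legitimately still open:

```shell
# Sample line mimicking the OPENFORWRITE entries in the fsck report
# (hypothetical paths for illustration only).
sample='/user/data/a/dt=20210517/FlumeData.111 949 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE: /user/data/a/dt=20210518/FlumeData.222.tmp 963 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE:'

# Extract each path, then drop the in-progress .tmp files Flume is still writing.
echo "$sample" \
  | grep -oE '/user/data[^ ]+' \
  | grep -v '\.tmp$'
# → /user/data/a/dt=20210517/FlumeData.111
```

This leaves only the stale, finished files that actually need lease recovery.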

2. Repair each affected file with the following command:

hdfs debug recoverLease -path /user/data/wildgoose/odl_wildgoose_order_dinc_liquan/dt=20210517/FlumeData.1621242387643  -retries 3

On success, the output looks like this:

recoverLease returned false.
Retrying in 5000 ms...
Retry #1
recoverLease SUCCEEDED on /user/data/wildgoose/odl_wildgoose_order_dinc_liquan/dt=20210517/FlumeData.1621242387643
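When many files need repairing, the `recoverLease` calls can be generated in a loop rather than typed one by one. A hedged sketch (the file list is hard-coded here for illustration; in practice, feed it the non-.tmp paths extracted from the fsck report). It only prints the commands so they can be reviewed before piping the output to `bash`:

```shell
# List of stale open files to repair, one path per line
# (single entry here for illustration).
files='/user/data/wildgoose/odl_wildgoose_order_dinc_liquan/dt=20210517/FlumeData.1621242387643'

# Emit one recoverLease command per file; review, then pipe to bash to execute.
for f in $files; do
  echo "hdfs debug recoverLease -path $f -retries 3"
done
```

As the log above shows, `recoverLease` may return false on the first attempt; `-retries 3` makes the client wait and try again before giving up.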

Reference: https://blog.csdn.net/jiandequn/article/details/103292966
