Data Middle Platform: Handling an Index-Out-of-Bounds Error When DataX Writes Files

DataX reports an IndexOutOfBoundsException when writing to HDFS

The full error log is as follows:

2023-02-03 09:45:35.355 [0-0-0-writer] ERROR HdfsWriter$Job - 写文件文件[hdfs://TEST-BIGDATA-01:8020/user/hive/warehouse/ods.db/ods_flow_order__1e53e37b_8204_49c3_b768_6d86dba6e18d/flow_order_index__01f7dbfc_5b76_45e6_9705_ac9021c4e234]时发生IO异常,请检查您的网络是否正常!
2023-02-03 09:45:35.355 [0-0-0-writer] INFO  HdfsWriter$Job - start delete tmp dir [hdfs://TEST-BIGDATA-01:8020/user/hive/warehouse/ods.db/ods_flow_order__1e53e37b_8204_49c3_b768_6d86dba6e18d] .
2023-02-03 09:45:35.358 [0-0-0-writer] INFO  HdfsWriter$Job - finish delete tmp dir [hdfs://TEST-BIGDATA-01:8020/user/hive/warehouse/ods.db/ods_flow_order__1e53e37b_8204_49c3_b768_6d86dba6e18d] .
2023-02-03 09:45:35.361 [0-0-0-writer] ERROR WriterRunner - Writer Runner Received Exceptions:
com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-04], Description:[您配置的文件在写入时出现IO异常.]. - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)
 - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)

        at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:48) ~[datax-common-0.0.1-SNAPSHOT.jar:na]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:402) ~[hdfswriter-0.0.1-SNAPSHOT.jar:na]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416) ~[hdfswriter-0.0.1-SNAPSHOT.jar:na]
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56) ~[datax-core-0.0.1-SNAPSHOT.jar:na]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]
Caused by: java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[na:1.8.0_212]
        at java.util.ArrayList.get(ArrayList.java:433) ~[na:1.8.0_212]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503) ~[hdfswriter-0.0.1-SNAPSHOT.jar:na]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391) ~[hdfswriter-0.0.1-SNAPSHOT.jar:na]
        ... 3 common frames omitted
Exception in thread "taskGroup-0" com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-04], Description:[您配置的文件在写入时出现IO异常.]. - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)
 - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)

        at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:48)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:402)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        ... 3 more
2023-02-03 09:45:45.220 [job-0] INFO  StandAloneJobContainerCommunicator - Total 544 records, 139021 bytes | Speed 13.58KB/s, 54 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.001s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
2023-02-03 09:45:45.221 [job-0] ERROR JobContainer - 运行scheduler 模式[standalone]出错.
2023-02-03 09:45:45.221 [job-0] ERROR JobContainer - Exception when job run
com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-04], Description:[您配置的文件在写入时出现IO异常.]. - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)
 - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)

        at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:48) ~[datax-common-0.0.1-SNAPSHOT.jar:na]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:402) ~[na:na]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416) ~[na:na]
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56) ~[datax-core-0.0.1-SNAPSHOT.jar:na]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_212]
Caused by: java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[na:1.8.0_212]
        at java.util.ArrayList.get(ArrayList.java:433) ~[na:1.8.0_212]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503) ~[na:na]
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391) ~[na:na]
        ... 3 common frames omitted
2023-02-03 09:45:45.222 [job-0] INFO  StandAloneJobContainerCommunicator - Total 544 records, 139021 bytes | Speed 135.76KB/s, 544 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.001s |  All Task WaitReaderTime 0.000s | Percentage 0.00%
2023-02-03 09:45:45.324 [job-0] ERROR Engine -

经DataX智能分析,该任务最可能的错误原因是:
com.alibaba.datax.common.exception.DataXException: Code:[HdfsWriter-04], Description:[您配置的文件在写入时出现IO异常.]. - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)
 - java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)

        at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:48)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:402)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsWriter$Task.startWrite(HdfsWriter.java:416)
        at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: Index: 54, Size: 54
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.transportOneRecord(HdfsHelper.java:503)
        at com.alibaba.datax.plugin.writer.hdfswriter.HdfsHelper.orcFileStartWrite(HdfsHelper.java:391)
        ... 3 more

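Root cause: when the DataX job was configured, the fields returned by the reader's select xx, xx did not match the Hive table columns configured in the writer section below it. The "Index: 54, Size: 54" in the trace is consistent with this: the writer appears to have 54 columns configured, while each record carries at least 55 fields.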

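For example, I queried the source with select * from table xx, and the table structure was later changed in MySQL. As a result, select * returned several more fields than before, but the column list in the DataX writer configuration was not extended with the corresponding columns, which caused the error.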
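The mismatch can be illustrated with a small Python sketch. It is not the HdfsWriter source. It only assumes, based on the stack trace (ArrayList.get called from transportOneRecord), that the writer walks over the fields of each record and looks up the configured column at the same position. The function name, column names, and sample values are made up for illustration.

```python
def transport_one_record(record, writer_columns):
    """Pair each source field with the writer column at the same position."""
    row = []
    for i, value in enumerate(record):
        # Assumed behaviour: the lookup is driven by the record length,
        # so it fails as soon as i reaches len(writer_columns).
        column = writer_columns[i]
        row.append((column["name"], column["type"], value))
    return row


# Hypothetical setup: 54 columns configured on the writer side.
writer_columns = [{"name": f"col_{i}", "type": "string"} for i in range(54)]

# A record that matches the configuration, and one produced after the
# MySQL table gained an extra field that select * now returns.
record_ok = [f"v{i}" for i in range(54)]
record_wide = record_ok + ["value_of_new_field"]

print(len(transport_one_record(record_ok, writer_columns)))

try:
    transport_one_record(record_wide, writer_columns)
except IndexError as exc:
    print("IndexError:", exc)
```

The 54-field record is written normally, while the 55-field record fails on position 54, which is the Python counterpart of the Java IndexOutOfBoundsException in the log.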

Suggested fix: change every select * to an explicit field list.
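To find the jobs that need this change, a check along the following lines could be run over the job JSON files before they are scheduled. This is a minimal sketch that assumes the usual DataX job layout (job.content[].reader.parameter with either a column list or connection[].querySql). The function name find_select_star and the table and field names in the sample job are hypothetical, and the regular expression only catches a bare select *, not forms such as select t.*.

```python
import re

# Matches "select *" and "select*", case-insensitively.
SELECT_STAR = re.compile(r"select\s*\*", re.IGNORECASE)


def find_select_star(job):
    """Return a list of messages for reader settings that rely on select *."""
    problems = []
    for idx, content in enumerate(job["job"]["content"]):
        params = content["reader"]["parameter"]
        # Case 1: the reader column list is the wildcard.
        if "*" in params.get("column", []):
            problems.append(f"content[{idx}]: reader column is '*'")
        # Case 2: a hand-written querySql uses select *.
        for conn in params.get("connection", []):
            for sql in conn.get("querySql", []):
                if SELECT_STAR.search(sql):
                    problems.append(f"content[{idx}]: querySql uses select *")
    return problems


# Hypothetical job fragments: one with select *, one with explicit fields.
job_star = {"job": {"content": [{
    "reader": {"name": "mysqlreader", "parameter": {
        "connection": [{"querySql": ["select * from flow_order"]}]}},
    "writer": {"name": "hdfswriter", "parameter": {
        "column": [{"name": "id", "type": "bigint"}]}},
}]}}

job_explicit = {"job": {"content": [{
    "reader": {"name": "mysqlreader", "parameter": {
        "connection": [{"querySql": ["select id from flow_order"]}]}},
    "writer": {"name": "hdfswriter", "parameter": {
        "column": [{"name": "id", "type": "bigint"}]}},
}]}}

print(find_select_star(job_star))
print(find_select_star(job_explicit))
```

With an explicit field list, a later change to the MySQL table no longer changes the number of fields the reader returns, so the reader and writer column counts stay aligned until the job itself is edited.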
