Notes on a DataX-to-Doris sync error: [DATA_QUALITY_ERROR]Encountered unqualified data

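While syncing data from MySQL to Doris with DataX, the job failed with DATA_QUALITY_ERROR, reporting that unqualified data had been encountered. The error message did not include a link to detailed error information. Investigation showed that a field exceeded its length limit; testing the fields one by one pinpointed the problem and resolved it.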

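I was using DataX to sync data from MySQL to a Doris database and hit a failure with no detailed error. Previously, when a field length or type conversion error occurred, the output included a link that showed the cause. This time there was no such link, and I searched for a long time without finding the cause.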
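As a rough illustration of the missing link: Doris Stream Load responses normally carry an `ErrorURL` field when rows are rejected for data quality reasons. The sketch below parses a trimmed copy of the response from this post's error output and reports whether that field is present. The helper name `summarize_stream_load` is hypothetical and is not part of DataX or Doris.

```python
import json


def summarize_stream_load(response_text):
    """Return (status, message, error_url) from a Stream Load JSON response."""
    result = json.loads(response_text)
    return (
        result.get("Status"),
        result.get("Message"),
        result.get("ErrorURL"),  # None when the response carries no link
    )


# Trimmed copy of the response from this post's error output.
response = (
    '{"Status":"Fail","Comment":"",'
    '"Message":"[CANCELLED][DATA_QUALITY_ERROR]Encountered unqualified data, stop processing",'
    '"NumberTotalRows":0,"NumberLoadedRows":0,"NumberFilteredRows":0}'
)

status, message, error_url = summarize_stream_load(response)
if error_url is None:
    print("No ErrorURL in the response; check column lengths and types manually.")
else:
    print("Row-level errors:", error_url)
```

When the field is absent, as it was here, the response alone does not identify the offending column, so the columns have to be checked by other means.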

Error message:

[2024-04-10 14:51:38.057][ERROR] >>>> 运行scheduler 模式[standalone]出错.

com.alibaba.datax.common.exception.DataXException: Code:[DBUtilErrorCode-05], Description:[往您配置的写入表中写入数据时失败.]. - java.io.IOException: Failed to flush data to doris.

{"Status":"Fail","Comment":"","BeginTxnTimeMs":1,"Message":"[CANCELLED][DATA_QUALITY_ERROR]Encountered unqualified data, stop processing","NumberUnselectedRows":0,"CommitAndPublishTimeMs":0,"Label":"datax_doris_writer_580a53cc-7fe0-45cc-9142-acac312cdcbc_0","LoadBytes":2455041,"StreamLoadPutTimeMs":6,"NumberTotalRows":0,"WriteDataTimeMs":310,"TxnId":2038723,"LoadTimeMs":319,"TwoPhaseCommit":"false","ReadDataTimeMs":10,"NumberLoadedRows":0,"NumberFilteredRows":0}

at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterEmitter.doStreamLoad(DorisWriterEmitter.java:114)

at com.alibaba.datax.plugin.writer.doriswriter.DorisWriter$Task.flush(DorisWriter.java:123)

at com.alibaba.datax.plugin.writer.doriswriter.DorisWriter$Task.startWrite(DorisWriter.java:113)

at com.alibaba.datax.core.taskgroup.runner.WriterRunner.run(WriterRunner.java:56)

at java.base/java.lang.Thread.run(Thread.java:834)

- java.io.IOException: Failed to flush data to doris.

{"Status":"Fail","Comment":"","BeginTxnTimeMs":1,"Message":"[CANCELLED][DATA_QUALITY_ERROR]Encountered unqualified data, stop processing","NumberUnselectedRows":0,"CommitAndPublishTimeMs":0,"Label":"datax_doris_writer_580a53cc-7fe0-45cc-9142-acac312cdcbc_0","LoadBytes":2455041,"StreamLoadPutTimeMs":6,"NumberTotalRows":0,"WriteDataTimeMs":310,"TxnId":2038723,"LoadTimeMs":319,"TwoPhaseCommit":"false","ReadDataTimeMs":10,"NumberLoadedRows":0,"NumberFilteredRows":0}

at com.alibaba.datax.plugin.writer.doriswriter.DorisWriterEmitter.doStreamLoad(DorisWriterEmitter.java:114)

at com.alibaba.datax.plugin.writer.doriswriter.DorisWriter$Task.flush(DorisWriter.java:123)

at com.alibaba.datax.plugin.writer.doriswriter.DorisWriter$Task.s
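The cause turned out to be a field that exceeded its length limit. The post does not show how the fields were tested, so the following is only a minimal sketch of one way to narrow it down: compare the UTF-8 byte length of each value against the byte length declared for the Doris column. It assumes the Doris `VARCHAR` length is counted in bytes while MySQL `VARCHAR(n)` counts characters, so multi-byte text can overflow a column declared with the same number. The function name, sample rows, and limits are all hypothetical.

```python
def find_oversized_values(rows, byte_limits):
    """Yield (row_index, column, byte_length, limit) for each value too long for its column.

    rows: iterable of dicts mapping column name -> value (e.g. rows read from MySQL).
    byte_limits: dict mapping column name -> target column length in bytes.
    """
    for index, row in enumerate(rows):
        for column, limit in byte_limits.items():
            value = row.get(column)
            if value is None:
                continue
            size = len(str(value).encode("utf-8"))
            if size > limit:
                yield index, column, size, limit


# Hypothetical sample data and limits, for illustration only.
sample_rows = [
    {"id": 1, "name": "abc"},
    {"id": 2, "name": "数据同步"},  # 4 characters, 12 bytes in UTF-8
]
limits = {"name": 10}

problems = list(find_oversized_values(sample_rows, limits))
for index, column, size, limit in problems:
    print(f"row {index}: {column} is {size} bytes, limit {limit}")
```

Once a column is flagged, widening it on the Doris side (or trimming the source data) and rerunning the job should confirm whether it was the cause.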
