DataX Operations

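1. Introduction to DataX

DataX is an offline synchronization tool for heterogeneous data sources. It aims to provide stable, efficient data synchronization between a wide range of heterogeneous sources, including relational databases (MySQL, Oracle, etc.), HDFS, Hive, ODPS, HBase, and FTP.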

2. DataX framework design
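
The job files in section 3 all share one shape: each entry under job.content pairs one reader plugin with one writer plugin, and job.setting.speed.channel carries the requested channel count. As a rough sketch of that shape only (the helper make_job is a hypothetical name, not part of DataX), the same document can be assembled in Python and dumped to a job file:

```python
import json


def make_job(reader: dict, writer: dict, channel: int = 1) -> dict:
    """Pair one reader plugin with one writer plugin in a DataX-style job document.

    make_job is a hypothetical helper for illustration; DataX itself only
    consumes the resulting JSON.
    """
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {"speed": {"channel": channel}},
        }
    }


# Constant rows produced by streamreader and printed by streamwriter,
# using the same parameters as the job echoed in the log of section 3.1.
stream_job = make_job(
    reader={
        "name": "streamreader",
        "parameter": {
            "column": [
                {"type": "long", "value": "10"},
                {"type": "string", "value": "hello,Datax,are you ok"},
            ],
            "sliceRecordCount": "10",
        },
    },
    writer={
        "name": "streamwriter",
        "parameter": {"encoding": "UTF-8", "print": True},
    },
)

print(json.dumps(stream_job, indent=4))
```

The printed JSON can be saved under the job directory and passed to datax.py in the same way as the hand-written files in section 3.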

3. Operations

3.1 streamreader, streamwriter

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "column": [
                            {
                                "type": "long",
                                "value": "10"
                            },
                            {
                                "type": "string",
                                "value": "hello,Datax,are you ok"
                            }
                        ],
                        "sliceRecordCount": "10"
                    }
                }, 
                "writer": {
                    "name": "streamwriter", 
                    "parameter": {
                        "encoding": "UTF-8", 
                        "print": true
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": "1"
            }
        }
    }
}
D:\bigdata\datax\bin>python datax.py ../job/job3.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.


2023-11-13 00:13:31.280 [main] INFO  MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2023-11-13 00:13:31.281 [main] INFO  MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2023-11-13 00:13:31.290 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-13 00:13:31.293 [main] INFO  Engine - the machine info  =>

        osInfo: Windows 10 amd64 10.0
        jvmInfo:        Oracle Corporation 1.8 25.311-b11
        cpu num:        16

        totalPhysicalMemory:    -0.00G
        freePhysicalMemory:     -0.00G
        maxFileDescriptorCount: -1
        currentOpenFileDescriptorCount: -1

        GC Names        [PS MarkSweep, PS Scavenge]

        MEMORY_NAME                    | allocation_size                | init_size
        PS Eden Space                  | 256.00MB                       | 256.00MB
        Code Cache                     | 240.00MB                       | 2.44MB
        Compressed Class Space         | 1,024.00MB                     | 0.00MB
        PS Survivor Space              | 42.50MB                        | 42.50MB
        PS Old Gen                     | 683.00MB                       | 683.00MB
        Metaspace                      | -0.00MB                        | 0.00MB


2023-11-13 00:13:31.310 [main] INFO  Engine -
{
        "content":[
                {
                        "reader":{
                                "name":"streamreader",
                                "parameter":{
                                        "column":[
                                                {
                                                        "type":"long",
                                                        "value":"10"
                                                },
                                                {
                                                        "type":"string",
                                                        "value":"hello,Datax,are you ok"
                                                }
                                        ],
                                        "sliceRecordCount":"10"
                                }
                        },
                        "writer":{
                                "name":"streamwriter",
                                "parameter":{
                                        "encoding":"UTF-8",
                                        "print":true
                                }
                        }
                }
        ],
        "setting":{
                "speed":{
                        "channel":"1"
                }
        }
}

2023-11-13 00:13:31.326 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false
2023-11-13 00:13:31.326 [main] INFO  JobContainer - DataX jobContainer starts job.
2023-11-13 00:13:31.327 [main] INFO  JobContainer - Set jobId = 0
2023-11-13 00:13:31.335 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2023-11-13 00:13:31.335 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do prepare work .
2023-11-13 00:13:31.336 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do prepare work .
2023-11-13 00:13:31.349 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2023-11-13 00:13:31.349 [job-0] INFO  JobContainer - Job set Channel-Number to 1 channels.
2023-11-13 00:13:31.362 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.
2023-11-13 00:13:31.363 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.
2023-11-13 00:13:31.377 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2023-11-13 00:13:31.379 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2023-11-13 00:13:31.389 [job-0] INFO  JobContainer - Running by standalone Mode.
2023-11-13 00:13:31.407 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-13 00:13:31.411 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-13 00:13:31.412 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-13 00:13:31.434 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
10      hello,Datax,are you ok
2023-11-13 00:13:31.537 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[104]ms
2023-11-13 00:13:31.539 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-13 00:13:41.401 [job-0] INFO  StandAloneJobContainerCommunicator - Total 10 records, 240 bytes | Speed 24B/s, 1 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-13 00:13:41.401 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2023-11-13 00:13:41.414 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do post work.
2023-11-13 00:13:41.422 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do post work.
2023-11-13 00:13:41.423 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2023-11-13 00:13:41.424 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: D:\bigdata\datax\hook
2023-11-13 00:13:41.425 [job-0] INFO  JobContainer -
         [total cpu info] =>
                averageCpu                     | maxDeltaCpu                    | minDeltaCpu
                -1.00%                         | -1.00%                         | -1.00%


         [total gc info] =>
                 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime
                 PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s
                 PS Scavenge          | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s

2023-11-13 00:13:41.426 [job-0] INFO  JobContainer - PerfTrace not enable!
2023-11-13 00:13:41.426 [job-0] INFO  StandAloneJobContainerCommunicator - Total 10 records, 240 bytes | Speed 24B/s, 1 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.000s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-13 00:13:41.427 [job-0] INFO  JobContainer -
任务启动时刻                    : 2023-11-13 00:13:31
任务结束时刻                    : 2023-11-13 00:13:41
任务总计耗时                    :                 10s
任务平均流量                    :               24B/s
记录写入速度                    :              1rec/s
读出记录总数                    :                  10
读写失败总数                    :                   0
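
The closing summary is printed with Chinese labels. For scripting against the log, a small parser can map those labels to English keys. This is only a sketch under assumptions: the English key names are my own glosses, and the label set is limited to what appears in the output above.

```python
# English glosses for the labels in the final summary block.
# The key names are illustrative translations, not DataX identifiers.
LABELS = {
    "任务启动时刻": "job_start_time",
    "任务结束时刻": "job_end_time",
    "任务总计耗时": "total_elapsed",
    "任务平均流量": "average_throughput",
    "记录写入速度": "record_write_speed",
    "读出记录总数": "total_records_read",
    "读写失败总数": "total_failures",
}


def parse_summary(log_text: str) -> dict:
    """Collect 'label : value' lines whose label is one of the known summary labels."""
    result = {}
    for line in log_text.splitlines():
        # Split on the first colon only, so timestamps in the value stay intact.
        label, sep, value = line.partition(":")
        key = LABELS.get(label.strip())
        if sep and key:
            result[key] = value.strip()
    return result


sample = """\
任务启动时刻                    : 2023-11-13 00:13:31
任务结束时刻                    : 2023-11-13 00:13:41
任务总计耗时                    :                 10s
任务平均流量                    :               24B/s
记录写入速度                    :              1rec/s
读出记录总数                    :                  10
读写失败总数                    :                   0
"""

summary = parse_summary(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Ordinary timestamped log lines are skipped because the text before their first colon is not one of the known labels.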

3.2 mysqlreader, mysqlwriter

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "where": "data_time >= create_time and data_time  < end_time",
                        "column": [
                           "id","data_time","name","age","insert_time","create_time","end_time"
                        ],
                        "connection": [
                            {
                                "table": [
                                    "user"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "writeMode": "update",
                        "username": "root",
                        "password": "123456",
                        "column": [
                            "id","data_time","name","age","insert_time","create_time","end_time"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false",
                                "table": [
                                    "user1"
                                ]
                            }
                        ]
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": 6
            }
        }
    }
}
D:\bigdata\datax\bin>python datax.py ../job/job1.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.


2023-11-13 00:14:57.892 [main] INFO  MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2023-11-13 00:14:57.894 [main] INFO  MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2023-11-13 00:14:57.900 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-13 00:14:57.903 [main] INFO  Engine - the machine info  =>

        osInfo: Windows 10 amd64 10.0
        jvmInfo:        Oracle Corporation 1.8 25.311-b11
        cpu num:        16

        totalPhysicalMemory:    -0.00G
        freePhysicalMemory:     -0.00G
        maxFileDescriptorCount: -1
        currentOpenFileDescriptorCount: -1

        GC Names        [PS MarkSweep, PS Scavenge]

        MEMORY_NAME                    | allocation_size                | init_size
        PS Eden Space                  | 256.00MB                       | 256.00MB
        Code Cache                     | 240.00MB                       | 2.44MB
        Compressed Class Space         | 1,024.00MB                     | 0.00MB
        PS Survivor Space              | 42.50MB                        | 42.50MB
        PS Old Gen                     | 683.00MB                       | 683.00MB
        Metaspace                      | -0.00MB                        | 0.00MB


2023-11-13 00:14:57.915 [main] INFO  Engine -
{
        "content":[
                {
                        "reader":{
                                "name":"mysqlreader",
                                "parameter":{
                                        "username":"root",
                                        "password":"******",
                                        "where":"data_time >= create_time and data_time  < end_time",
                                        "column":[
                                                "id",
                                                "data_time",
                                                "name",
                                                "age",
                                                "insert_time",
                                                "create_time",
                                                "end_time"
                                        ],
                                        "connection":[
                                                {
                                                        "table":[
                                                                "user"
                                                        ],
                                                        "jdbcUrl":[
                                                                "jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false"
                                                        ]
                                                }
                                        ]
                                }
                        },
                        "writer":{
                                "name":"mysqlwriter",
                                "parameter":{
                                        "writeMode":"update",
                                        "username":"root",
                                        "password":"******",
                                        "column":[
                                                "id",
                                                "data_time",
                                                "name",
                                                "age",
                                                "insert_time",
                                                "create_time",
                                                "end_time"
                                        ],
                                        "connection":[
                                                {
                                                        "jdbcUrl":"jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false",
                                                        "table":[
                                                                "user1"
                                                        ]
                                                }
                                        ]
                                }
                        }
                }
        ],
        "setting":{
                "speed":{
                        "channel":6
                }
        }
}

2023-11-13 00:14:57.925 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false
2023-11-13 00:14:57.934 [main] INFO  JobContainer - DataX jobContainer starts job.
2023-11-13 00:14:57.935 [main] INFO  JobContainer - Set jobId = 0
2023-11-13 00:14:58.124 [job-0] INFO  OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true.
2023-11-13 00:14:58.131 [job-0] INFO  OriginalConfPretreatmentUtil - table:[user] has columns:[id,data_time,name,age,insert_time,create_time,end_time].
2023-11-13 00:14:58.255 [job-0] INFO  OriginalConfPretreatmentUtil - table:[user1] all columns:[
id,data_time,name,age,insert_time,create_time,end_time
].
2023-11-13 00:14:58.259 [job-0] INFO  OriginalConfPretreatmentUtil - Write data [
INSERT INTO %s (id,data_time,name,age,insert_time,create_time,end_time) VALUES(?,?,?,?,?,?,?) ON DUPLICATE KEY UPDATE id=VALUES(id),data_time=VALUES(data_time),name=VALUES(name),age=VALUES(age),insert_time=VALUES(insert_time),create_time=VALUES(create_time),end_time=VALUES(end_time)
], which jdbcUrl like:[jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&rewriteBatchedStatements=true&tinyInt1isBit=false]
2023-11-13 00:14:58.274 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2023-11-13 00:14:58.292 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do prepare work .
2023-11-13 00:14:58.292 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2023-11-13 00:14:58.294 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2023-11-13 00:14:58.294 [job-0] INFO  JobContainer - Job set Channel-Number to 6 channels.
2023-11-13 00:14:58.297 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] splits to [1] tasks.
2023-11-13 00:14:58.298 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2023-11-13 00:14:58.312 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2023-11-13 00:14:58.313 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2023-11-13 00:14:58.318 [job-0] INFO  JobContainer - Running by standalone Mode.
2023-11-13 00:14:58.337 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-13 00:14:58.341 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-13 00:14:58.341 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-13 00:14:58.363 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-13 00:14:58.366 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Begin to read record by Sql: [select id,data_time,name,age,insert_time,create_time,end_time from user where (data_time >= create_time and data_time  < end_time)
] jdbcUrl:[jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2023-11-13 00:15:03.590 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Finished read record by Sql: [select id,data_time,name,age,insert_time,create_time,end_time from user where (data_time >= create_time and data_time  < end_time)
] jdbcUrl:[jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2023-11-13 00:15:03.643 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[5281]ms
2023-11-13 00:15:03.644 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-13 00:15:08.331 [job-0] INFO  StandAloneJobContainerCommunicator - Total 393216 records, 17340155 bytes | Speed 1.65MB/s, 39321 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 4.545s |  All Task WaitReaderTime 0.500s | Percentage 100.00%
2023-11-13 00:15:08.332 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2023-11-13 00:15:08.333 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2023-11-13 00:15:08.333 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do post work.
2023-11-13 00:15:08.333 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2023-11-13 00:15:08.334 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: D:\bigdata\datax\hook
2023-11-13 00:15:08.335 [job-0] INFO  JobContainer -
         [total cpu info] =>
                averageCpu                     | maxDeltaCpu                    | minDeltaCpu
                -1.00%                         | -1.00%                         | -1.00%


         [total gc info] =>
                 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime
                 PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s
                 PS Scavenge          | 11                 | 11                 | 11                 | 0.040s             | 0.040s             | 0.040s

2023-11-13 00:15:08.335 [job-0] INFO  JobContainer - PerfTrace not enable!
2023-11-13 00:15:08.336 [job-0] INFO  StandAloneJobContainerCommunicator - Total 393216 records, 17340155 bytes | Speed 1.65MB/s, 39321 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 4.545s |  All Task WaitReaderTime 0.500s | Percentage 100.00%
2023-11-13 00:15:08.337 [job-0] INFO  JobContainer -
任务启动时刻                    : 2023-11-13 00:14:57
任务结束时刻                    : 2023-11-13 00:15:08
任务总计耗时                    :                 10s
任务平均流量                    :            1.65MB/s
记录写入速度                    :          39321rec/s
读出记录总数                    :              393216
读写失败总数                    :                   0
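
The "Write data" entry in the log above shows what writeMode "update" became for this job: an INSERT ... ON DUPLICATE KEY UPDATE template, so a row whose primary or unique key already exists in user1 is overwritten rather than rejected. The sketch below rebuilds that template from the column list purely to make the pattern explicit. The function upsert_template is a hypothetical helper written for this note, not DataX code.

```python
from typing import Sequence


def upsert_template(columns: Sequence[str]) -> str:
    """Rebuild the statement template seen in the log for writeMode "update".

    The table name stays as a %s placeholder and the values as JDBC-style
    "?" markers, exactly as they appear in the logged template.
    """
    cols = ",".join(columns)
    placeholders = ",".join("?" for _ in columns)
    updates = ",".join(f"{c}=VALUES({c})" for c in columns)
    return (
        f"INSERT INTO %s ({cols}) VALUES({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


columns = ["id", "data_time", "name", "age", "insert_time", "create_time", "end_time"]
template = upsert_template(columns)
print(template)
```

With the seven columns from the job file, the result should match the logged statement character for character.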

3.3 mysqlreader, txtfilewriter

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "where": "data_time >= create_time and data_time  < end_time",
                        "column": [
                           "id","data_time","name","age","insert_time","create_time","end_time"
                        ],
                        "connection": [
                            {
                                "table": [
                                    "user"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "txtfilewriter",
                    "parameter": {
                        "path": "D:/test/",
                        "fileName": "abc.txt",
                        "writeMode": "truncate",
                        "dateFormat": "yyyy-MM-dd hh:mm:ss"
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": 6
            }
        }
    }
}
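
One detail in the writer block is worth checking. DataX plugins are written in Java, and if dateFormat is interpreted with Java SimpleDateFormat pattern letters (an assumption on my part, not verified against the plugin source), then hh means the 12-hour clock and HH the 24-hour clock. With hh, an afternoon timestamp would be written the same as a morning one. Python's strftime has the same distinction with %I and %H, which makes the effect easy to see:

```python
from datetime import datetime

# A made-up afternoon timestamp, used only to show the difference.
afternoon = datetime(2023, 11, 12, 14, 58, 54)

# %I is the 12-hour clock, analogous to "hh" in yyyy-MM-dd hh:mm:ss.
twelve_hour = afternoon.strftime("%Y-%m-%d %I:%M:%S")

# %H is the 24-hour clock, analogous to "HH" in yyyy-MM-dd HH:mm:ss.
twenty_four_hour = afternoon.strftime("%Y-%m-%d %H:%M:%S")

print(twelve_hour)       # 2023-11-12 02:58:54
print(twenty_four_hour)  # 2023-11-12 14:58:54
```

If the assumption holds, switching the pattern to yyyy-MM-dd HH:mm:ss would keep afternoon values unambiguous in the output file.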

D:\bigdata\datax\bin>python datax.py ../job/job2.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.


2023-11-13 00:16:49.060 [main] INFO  MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2023-11-13 00:16:49.061 [main] INFO  MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2023-11-13 00:16:49.068 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-13 00:16:49.071 [main] INFO  Engine - the machine info  =>

        osInfo: Windows 10 amd64 10.0
        jvmInfo:        Oracle Corporation 1.8 25.311-b11
        cpu num:        16

        totalPhysicalMemory:    -0.00G
        freePhysicalMemory:     -0.00G
        maxFileDescriptorCount: -1
        currentOpenFileDescriptorCount: -1

        GC Names        [PS MarkSweep, PS Scavenge]

        MEMORY_NAME                    | allocation_size                | init_size
        PS Eden Space                  | 256.00MB                       | 256.00MB
        Code Cache                     | 240.00MB                       | 2.44MB
        Compressed Class Space         | 1,024.00MB                     | 0.00MB
        PS Survivor Space              | 42.50MB                        | 42.50MB
        PS Old Gen                     | 683.00MB                       | 683.00MB
        Metaspace                      | -0.00MB                        | 0.00MB


2023-11-13 00:16:49.082 [main] INFO  Engine -
{
        "content":[
                {
                        "reader":{
                                "name":"mysqlreader",
                                "parameter":{
                                        "username":"root",
                                        "password":"******",
                                        "where":"data_time >= create_time and data_time  < end_time",
                                        "column":[
                                                "id",
                                                "data_time",
                                                "name",
                                                "age",
                                                "insert_time",
                                                "create_time",
                                                "end_time"
                                        ],
                                        "connection":[
                                                {
                                                        "table":[
                                                                "user"
                                                        ],
                                                        "jdbcUrl":[
                                                                "jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false"
                                                        ]
                                                }
                                        ]
                                }
                        },
                        "writer":{
                                "name":"txtfilewriter",
                                "parameter":{
                                        "path":"D:/test/",
                                        "fileName":"abc.txt",
                                        "writeMode":"truncate",
                                        "dateFormat":"yyyy-MM-dd hh:mm:ss"
                                }
                        }
                }
        ],
        "setting":{
                "speed":{
                        "channel":6
                }
        }
}

2023-11-13 00:16:49.093 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false
2023-11-13 00:16:49.099 [main] INFO  JobContainer - DataX jobContainer starts job.
2023-11-13 00:16:49.100 [main] INFO  JobContainer - Set jobId = 0
2023-11-13 00:16:49.292 [job-0] INFO  OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true.
2023-11-13 00:16:49.297 [job-0] INFO  OriginalConfPretreatmentUtil - table:[user] has columns:[id,data_time,name,age,insert_time,create_time,end_time].
2023-11-13 00:16:49.324 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2023-11-13 00:16:49.325 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do prepare work .
2023-11-13 00:16:49.333 [job-0] INFO  JobContainer - DataX Writer.Job [txtfilewriter] do prepare work .
2023-11-13 00:16:49.333 [job-0] INFO  TxtFileWriter$Job - 由于您配置了writeMode truncate, 开始清理 [D:/test/] 下面以 [abc.txt] 开头的内容
2023-11-13 00:16:49.337 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2023-11-13 00:16:49.339 [job-0] INFO  JobContainer - Job set Channel-Number to 6 channels.
2023-11-13 00:16:49.342 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] splits to [1] tasks.
2023-11-13 00:16:49.352 [job-0] INFO  TxtFileWriter$Job - begin do split...
2023-11-13 00:16:54.357 [job-0] INFO  TxtFileWriter$Job - splited write file name:[abc.txt__9df0b63f_c506_430e_85fb_dc659aa280b6]
2023-11-13 00:16:54.358 [job-0] INFO  TxtFileWriter$Job - end do split.
2023-11-13 00:16:54.363 [job-0] INFO  JobContainer - DataX Writer.Job [txtfilewriter] splits to [1] tasks.
2023-11-13 00:16:54.377 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2023-11-13 00:16:54.382 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2023-11-13 00:16:54.383 [job-0] INFO  JobContainer - Running by standalone Mode.
2023-11-13 00:16:54.401 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-13 00:16:54.403 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-13 00:16:54.405 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-13 00:16:54.411 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-13 00:16:54.412 [0-0-0-writer] INFO  TxtFileWriter$Task - begin do write...
2023-11-13 00:16:54.413 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Begin to read record by Sql: [select id,data_time,name,age,insert_time,create_time,end_time from user where (data_time >= create_time and data_time  < end_time)
] jdbcUrl:[jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2023-11-13 00:16:54.424 [0-0-0-writer] INFO  TxtFileWriter$Task - write to file : [D:/test/\abc.txt__9df0b63f_c506_430e_85fb_dc659aa280b6]
2023-11-13 00:16:55.657 [0-0-0-reader] INFO  CommonRdbmsReader$Task - Finished read record by Sql: [select id,data_time,name,age,insert_time,create_time,end_time from user where (data_time >= create_time and data_time  < end_time)
] jdbcUrl:[jdbc:mysql://127.0.0.1:3306/dashboard?useUnicode=true&characterEncoding=utf8&useSSL=false&yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2023-11-13 00:16:55.660 [0-0-0-writer] INFO  TxtFileWriter$Task - end do write
2023-11-13 00:16:55.740 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[1330]ms
2023-11-13 00:16:55.741 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-13 00:17:04.419 [job-0] INFO  StandAloneJobContainerCommunicator - Total 393216 records, 17340155 bytes | Speed 1.65MB/s, 39321 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.515s |  All Task WaitReaderTime 0.016s | Percentage 100.00%
2023-11-13 00:17:04.420 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2023-11-13 00:17:04.425 [job-0] INFO  JobContainer - DataX Writer.Job [txtfilewriter] do post work.
2023-11-13 00:17:04.440 [job-0] INFO  JobContainer - DataX Reader.Job [mysqlreader] do post work.
2023-11-13 00:17:04.442 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2023-11-13 00:17:04.443 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: D:\bigdata\datax\hook
2023-11-13 00:17:04.461 [job-0] INFO  JobContainer -
         [total cpu info] =>
                averageCpu                     | maxDeltaCpu                    | minDeltaCpu
                -1.00%                         | -1.00%                         | -1.00%


         [total gc info] =>
                 NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime
                 PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s
                 PS Scavenge          | 10                 | 10                 | 10                 | 0.029s             | 0.029s             | 0.029s

2023-11-13 00:17:04.462 [job-0] INFO  JobContainer - PerfTrace not enable!
2023-11-13 00:17:04.463 [job-0] INFO  StandAloneJobContainerCommunicator - Total 393216 records, 17340155 bytes | Speed 1.65MB/s, 39321 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.515s |  All Task WaitReaderTime 0.016s | Percentage 100.00%
2023-11-13 00:17:04.481 [job-0] INFO  JobContainer -
任务启动时刻                    : 2023-11-13 00:16:49
任务结束时刻                    : 2023-11-13 00:17:04
任务总计耗时                    :                 15s
任务平均流量                    :            1.65MB/s
记录写入速度                    :          39321rec/s
读出记录总数                    :              393216
读写失败总数                    :                   0
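
The summary reports a total elapsed time of 15s, yet the average throughput (1.65MB/s) and record write speed (39321rec/s) do not equal the totals divided by 15. The figures are consistent with a transfer window of about 10 seconds, with the rest of the time presumably spent on job setup and post work. That 10-second window is an assumption inferred from the numbers, not a value printed in the log. A minimal sketch of the arithmetic:

```python
# Figures copied from the job summary above.
total_records = 393216
total_bytes = 17340155

# Assumption: the data transfer itself took about 10 seconds.
# This value is inferred from the reported speeds; the log does not print it.
transfer_seconds = 10

# Records per second, truncated to an integer as in the log line.
records_per_second = total_records // transfer_seconds

# Bytes per second converted to MB/s using 1 MB = 1024 * 1024 bytes.
megabytes_per_second = total_bytes / transfer_seconds / (1024 * 1024)

print(f"{records_per_second}rec/s")        # 39321rec/s
print(f"{megabytes_per_second:.2f}MB/s")   # 1.65MB/s
```

Dividing by the full 15 seconds would give roughly 26214 records per second, which does not match the log.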

The resulting txt file contains:

1,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
8,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
10,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
14,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
15,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
16,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
17,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
18,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
19,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
20,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
21,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
22,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
24,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
25,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
26,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
27,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
28,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
29,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
30,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
31,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
32,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
33,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
34,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
35,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
39,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
40,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
41,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
42,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
43,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
44,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
45,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
46,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
47,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
48,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
49,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
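
To sanity-check a file like this after a job, the rows can be parsed and validated with a few lines of Python. The sketch below is only illustrative: it uses three rows copied from the output above instead of opening the real file (whose path is not shown here), and the meaning assigned to each column (an integer id, a timestamp, a name, a number) is an assumption based on how the values look, not on the table schema.

```python
import csv
import io
from datetime import datetime

# Three rows copied from the output above. A real check would open the file
# written by txtfilewriter instead of using an inline sample.
sample = """\
1,2023-11-12 02:58:54,abc1,18,2023-11-12 02:58:54,2023-11-06 02:58:54,2023-11-20 02:58:54
8,2023-11-12 02:58:58,abc8,18,2023-11-12 02:58:58,2023-11-12 02:58:58,2023-11-13 02:58:58
10,2023-11-12 02:58:58,abc10,18,2023-11-12 02:58:58,2023-11-12 01:58:58,2023-11-20 02:58:58
"""

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
EXPECTED_FIELDS = 7  # every row shown above has seven comma-separated values


def parse_rows(text):
    """Parse comma-delimited rows into (id, timestamp, name, number) tuples.

    The column meanings are hypothetical; only the first four fields are
    converted, the remaining three timestamps are just counted.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        if len(fields) != EXPECTED_FIELDS:
            raise ValueError(f"expected {EXPECTED_FIELDS} fields, got {len(fields)}")
        rows.append((
            int(fields[0]),
            datetime.strptime(fields[1], TIMESTAMP_FORMAT),
            fields[2],
            int(fields[3]),
        ))
    return rows


rows = parse_rows(sample)
print(len(rows), rows[0][0], rows[0][2])  # 3 1 abc1
```

Run against the full file, the number of parsed rows could then be compared with the 393216 records reported in the job summary.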
