DataX download link: https://github.com/alibaba/DataX/blob/master/userGuid.md
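Prerequisites:
1. JDK (1.8 or later; 1.8 recommended)
2. Python (either 2 or 3 works)
3. Apache Maven 3.x (only needed to compile DataX; not required when installing from the prebuilt package)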
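The prerequisites above can be checked before installing. The following is only a minimal sketch and is not part of DataX: the helper names `find_tool` and `check_prerequisites` are made up for illustration, and the script only checks that `java` (and optionally `mvn`) can be found on PATH. It does not verify the JDK version.

```python
import shutil
import sys


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if missing."""
    return shutil.which(name)


def check_prerequisites(need_maven=False):
    """Collect a small report about the tools listed above.

    need_maven should be True only when compiling DataX from source.
    """
    report = {
        "python": "%d.%d" % (sys.version_info[0], sys.version_info[1]),
        "java": find_tool("java"),
    }
    if need_maven:
        report["mvn"] = find_tool("mvn")
    return report


if __name__ == "__main__":
    for tool, value in check_prerequisites().items():
        print(tool, "->", value if value else "NOT FOUND on PATH")
```

A `NOT FOUND on PATH` line for `java` usually means the JDK is either not installed or its bin directory has not been added to the PATH environment variable.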
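Unzip the package and note the extraction path (mine is E:\datax). Go into the E:\datax\bin directory, open datax.py in Notepad, and change the path at "DATA_HOME=..." to the DataX root directory, E:\datax.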
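Hard-coding the path works, but it has to be edited again if the folder is ever moved. As a possible alternative, the root directory can be derived from the location of the script itself, since the script sits in the bin folder directly under the root. The sketch below only illustrates that idea; `home_from_script` is a hypothetical helper, not a function taken from datax.py.

```python
from pathlib import PureWindowsPath


def home_from_script(script_path):
    """Return the directory two levels above the script, i.e. the parent of bin."""
    return str(PureWindowsPath(script_path).parent.parent)


# With the layout described above, the script lives in E:\datax\bin.
print(home_from_script(r"E:\datax\bin\datax.py"))  # prints E:\datax
```

`PureWindowsPath` is used so the example behaves the same on any operating system; inside a real script running on Windows, `os.path.abspath(__file__)` would supply the script path.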
Next, find the ".json" file in the E:\datax\conf directory (mine is core.json), open it in Notepad, and replace its contents with:
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 2,
            "column": [
              {
                "type": "long",
                "value": "10"
              },
              {
                "type": "string",
                "value": "hello,你好,世界-DataX"
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
      }
    }
  },
  "entry": {
    "jvm": "-Xms1G -Xmx1G",
    "environment": {}
  },
  "common": {
    "column": {
      "datetimeFormat": "yyyy-MM-dd HH:mm:ss",
      "timeFormat": "HH:mm:ss",
      "dateFormat": "yyyy-MM-dd",
      "extraFormats": ["yyyyMMdd"],
      "timeZone": "GMT+8",
      "encoding": "utf-8"
    }
  },
  "core": {
    "dataXServer": {
      "address": "http://localhost:7001/api",
      "timeout": 10000,
      "reportDataxLog": false,
      "reportPerfLog": false
    },
    "transport": {
      "channel": {
        "class": "com.alibaba.datax.core.transport.channel.memory.MemoryChannel",
        "speed": {
          "byte": -1,
          "record": -1
        },
        "flowControlInterval": 20,
        "capacity": 512,
        "byteCapacity": 67108864
      },
      "exchanger": {
        "class": "com.alibaba.datax.core.plugin.BufferedRecordExchanger",
        "bufferSize": 32
      }
    },
    "container": {
      "job": {
        "reportInterval": 10000
      },
      "taskGroup": {
        "channel": 5
      },
      "trace": {
        "enable": "false"
      }
    },
    "statistics": {
      "collector": {
        "plugin": {
          "taskClass": "com.alibaba.datax.core.statistics.plugin.task.StdoutPluginCollector",
          "maxDirtyNumber": 10
        }
      }
    }
  }
}
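A stray comma or missing brace introduced while editing in Notepad is easy to miss, so it can help to parse the file before running DataX. The sketch below is an illustration only: `summarize_job` is a hypothetical helper, and `SAMPLE` is a trimmed copy of the "job" section above rather than the full file.

```python
import json

# Trimmed copy of the "job" section above; the other sections are omitted here.
SAMPLE = """
{
  "job": {
    "content": [
      {
        "reader": {"name": "streamreader", "parameter": {"sliceRecordCount": 2}},
        "writer": {"name": "streamwriter", "parameter": {"print": true}}
      }
    ],
    "setting": {"speed": {"channel": 5}}
  }
}
"""


def summarize_job(text):
    """Parse the JSON text and pull out the reader, writer, and channel count."""
    config = json.loads(text)  # raises json.JSONDecodeError on a syntax error
    job = config["job"]
    first = job["content"][0]
    return {
        "reader": first["reader"]["name"],
        "writer": first["writer"]["name"],
        "channel": job["setting"]["speed"]["channel"],
    }


print(summarize_job(SAMPLE))
```

To check the real file, read it with `open(r"E:\datax\conf\core.json", encoding="utf-8")` and pass the text to `summarize_job`; the explicit encoding matters because the config contains non-ASCII characters.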
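Open cmd and change into the bin directory (the /d switch also changes the drive if cmd starts on a different one):
cd /d E:\datax\bin
Then run the self-check script:
python datax.py "E:\datax\conf\core.json"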
If the job runs to the end and prints its summary without errors, the configuration succeeded!
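The two commands above can also be wrapped in a small Python script, which is handy when the self-check has to be repeated. This is a rough sketch under stated assumptions: `build_command` and `run_self_check` are hypothetical helpers, and the pass/fail check assumes that datax.py exits with a non-zero status when the job fails.

```python
import subprocess
import sys
from pathlib import PureWindowsPath


def build_command(datax_home, job_file, python_exe=None):
    """Build the argument list equivalent to: python datax.py <job file>."""
    script = str(PureWindowsPath(datax_home) / "bin" / "datax.py")
    # Fall back to the interpreter running this script when none is given.
    return [python_exe or sys.executable, script, job_file]


def run_self_check(datax_home, job_file):
    """Run the job and return True when the process exits with code 0.

    Assumption: a failed job makes datax.py return a non-zero exit code.
    """
    result = subprocess.run(build_command(datax_home, job_file))
    return result.returncode == 0


# Show the command that would be executed for the paths used in this guide.
print(build_command(r"E:\datax", r"E:\datax\conf\core.json", "python"))
```

Calling `run_self_check(r"E:\datax", r"E:\datax\conf\core.json")` on the machine where DataX is installed runs the same job as the manual command.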