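DataX is an offline data synchronization and migration tool open-sourced by Alibaba. Its engine is written in Java; datax.py is the Python launcher script that starts it.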
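How it works:
datax.py reads a JSON job file and runs the migration task it describes.
Steps:
1. Download and unpack DataX.
2. Write a job file for the migration (for example, t_user.json).
3. Run the command: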
/data/datax/bin/datax.py /data/datax/job/crm/t_user.json
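When several job files have to be run from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of DataX itself: build_datax_command and run_job are hypothetical helper names, and it assumes the launcher sits at bin/datax.py under the install directory, as in the command above. The interpreter is left as a parameter because some DataX releases may ship a launcher written for Python 2.

```python
import subprocess
import sys
from pathlib import PurePosixPath


def build_datax_command(datax_home, job_file, python_exe=sys.executable):
    """Return the argument list that launches one DataX job.

    Hypothetical helper: assumes the launcher lives at <datax_home>/bin/datax.py.
    """
    launcher = PurePosixPath(datax_home) / "bin" / "datax.py"
    return [python_exe, str(launcher), str(job_file)]


def run_job(datax_home, job_file, python_exe=sys.executable):
    """Run one job and return the launcher's exit code.

    Assumption: a non-zero exit code is treated as a failed job.
    """
    command = build_datax_command(datax_home, job_file, python_exe)
    return subprocess.run(command).returncode


if __name__ == "__main__":
    # Only prints the command; call run_job(...) on a machine where DataX is installed.
    print(build_datax_command("/data/datax", "/data/datax/job/crm/t_user.json"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a job path contains spaces.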
t_user.json
Example JSON:
{
    "job": {
        "setting": {
            "speed": {
                "channel": 3,
                "byte": 1048576
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "oraclereader",
                    "parameter": {
                        "username": "admin",
                        "password": "1234",
                        "where": "",
                        "connection": [
                            {
                                "querySql": [
                                    "select USER_ACCOUNT,USERNAME,AGE from t_user where age < 50"
                                ],
                                "jdbcUrl": ["jdbc:oracle:thin:@192.168.0.1/crm"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "oraclewriter",
                    "parameter": {
                        "username": "admintwo",
                        "password": "1234",
                        "column": ["USER_ACCOUNT", "USERNAME", "AGE"],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:oracle:thin:@192.168.0.2/crm_2",
                                "table": ["t_user_new"]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
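A stray comma or a missing bracket in a job file is easy to overlook, so it can help to check the file before handing it to datax.py. The sketch below is only an illustration: check_job_config is a hypothetical helper, and the structural checks it performs (a non-empty job.content list whose entries each name a reader and a writer) are inferred from the sample above, not taken from DataX's own validation.

```python
import json


def check_job_config(text):
    """Return a list of problems found in a job file; an empty list means none found.

    Hypothetical pre-flight check based on the sample layout, not DataX's validator.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        # Syntax errors such as a doubled comma are reported with their position.
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    if not isinstance(config, dict) or not isinstance(config.get("job"), dict):
        return ["top-level object must contain a 'job' object"]

    content = config["job"].get("content")
    if not isinstance(content, list) or not content:
        return ["job.content must be a non-empty list"]

    problems = []
    for index, item in enumerate(content):
        for side in ("reader", "writer"):
            plugin = item.get(side) if isinstance(item, dict) else None
            if not isinstance(plugin, dict) or "name" not in plugin:
                problems.append(f"job.content[{index}].{side}.name is missing")
    return problems


if __name__ == "__main__":
    sample = '{"job": {"content": [{"reader": {"name": "oraclereader"}}]}}'
    for problem in check_job_config(sample):
        print(problem)
```

Reading a real file is a matter of passing the file's text, for example the contents of t_user.json, to check_job_config.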
Field reference:
speed.channel number of concurrent channels (degree of parallelism)
speed.byte transfer rate limit in bytes per second; 1048576 is 1 MiB/s (the default is usually fine)
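errorLimit.record maximum number of failed records tolerated; 0 means not a single bad record is allowed
errorLimit.percentage maximum fraction of failed records tolerated; 0.02 means 2%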
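To make the two errorLimit fields concrete, here is a small sketch of how such thresholds can be evaluated. It is a simplified model and not DataX's own checking code: exceeds_error_limit is a hypothetical function, and it assumes the job fails as soon as either configured limit is exceeded. How DataX combines the two limits when both are set is not covered by the description above.

```python
def exceeds_error_limit(total, failed, record_limit=None, percentage_limit=None):
    """Return True if the failed-record count breaks a configured limit.

    Simplified model for illustration only (assumption: either limit alone
    is enough to fail the job). A limit of None means "not configured".
    """
    if record_limit is not None and failed > record_limit:
        return True
    if percentage_limit is not None and total > 0:
        return failed / total > percentage_limit
    return False


if __name__ == "__main__":
    # record = 0: a single failed record is already over the limit.
    print(exceeds_error_limit(1000, 1, record_limit=0))
    # percentage = 0.02: 21 failures out of 1000 is 2.1%, above the 2% limit.
    print(exceeds_error_limit(1000, 21, percentage_limit=0.02))
```

Under this model the sample's record value of 0 is the stricter setting, since any failed record at all is enough to stop the job.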