Preparation
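- Required dependencies (I downloaded the DataX source and installed it into my local Maven repository; I am not sure whether any remote repository provides these two artifacts)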
<dependency>
<groupId>com.alibaba.datax</groupId>
<artifactId>datax-core</artifactId>
<version>0.0.1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>com.alibaba.datax</groupId>
<artifactId>datax-common</artifactId>
<version>0.0.1-SNAPSHOT</version>
<exclusions>
<exclusion>
<artifactId>slf4j-log4j12</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
</exclusions>
</dependency>
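Custom Transformer and how its parameters map to the configuration file
- Create a class that extends Transformer or ComplexTransformer and override the evaluate method
- The configuration in the job configuration file
- The data printed when the job runs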
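The relationship between the configured parameters and the arguments of evaluate can be sketched as follows. This is a minimal, runnable illustration rather than the real plugin: a plain `List<String>` stands in for the DataX Record so that it runs without the datax-core and datax-common jars, and the class name `HidingTransformerSketch` is hypothetical. It assumes that DataX passes `columnIndex` as the first argument and the `paras` strings after it, and that the transformer replaces the target column with the first `paras` value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in sketch: a List<String> plays the role of a DataX Record.
// A real plugin would extend com.alibaba.datax.transformer.Transformer instead.
class HidingTransformerSketch {

    // Mirrors the shape of evaluate(record, paras...).
    // Assumption: paras[0] is columnIndex, paras[1..] are the "paras" strings
    // from the job configuration.
    static List<String> evaluate(List<String> record, Object... paras) {
        if (paras.length < 2) {
            throw new IllegalArgumentException(
                    "expected columnIndex followed by at least one replacement value");
        }
        int columnIndex = (Integer) paras[0];
        String replacement = (String) paras[1];

        // Copy the record and overwrite the configured column.
        List<String> result = new ArrayList<>(record);
        result.set(columnIndex, replacement);
        return result;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("1", "secret", "2020-01-01");
        // Corresponds to "columnIndex": 1, "paras": ["param1", "param2"]
        System.out.println(evaluate(row, 1, "param1", "param2"));
    }
}
```

In an actual transformer the same logic would read the column from the Record, build a new column value, and set it back at `columnIndex` before returning the record.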
Adding configuration
- Add transformer.json under resources (required)
{
"name": "hiding_transformer",
"class": "com.alibaba.datax.transformer.HidingTransformer",
"description": "Hiding transformer setting any value as default",
"developer": "build2last@github.com"
}
- Add plugin_job_template.json under resources (optional); its content can be added to the transformer array of the job configuration JSON
{
"name": "hiding_transformer",
"parameter": {
"columnIndex": 1,
"paras": [
"param1",
"param2"
]
}
}
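- name: the transformer name; it must not start with dx_
- class: the fully qualified name of the subclass; DataX loads the jar, obtains the class object via reflection, and registers the transformer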
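The two rules above (the name check and reflective loading) can be illustrated with a small sketch. This is not DataX's actual loader code: `TransformerRegistrySketch` and its methods are hypothetical, the class is resolved from the current classpath rather than from a plugin jar, and `java.util.ArrayList` merely stands in for a transformer class so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative registry only; the real DataX loader is assumed to differ in detail.
class TransformerRegistrySketch {
    private final Map<String, Object> registry = new HashMap<>();

    // Validates the name, instantiates className via reflection, and registers it.
    void register(String name, String className) {
        if (name.startsWith("dx_")) {
            throw new IllegalArgumentException(
                    "custom transformer name must not start with dx_: " + name);
        }
        try {
            Class<?> clazz = Class.forName(className);
            Object instance = clazz.getDeclaredConstructor().newInstance();
            registry.put(name, instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load transformer class " + className, e);
        }
    }

    Object lookup(String name) {
        return registry.get(name);
    }

    public static void main(String[] args) {
        TransformerRegistrySketch registry = new TransformerRegistrySketch();
        // ArrayList is a placeholder for the class named in transformer.json.
        registry.register("hiding_transformer", "java.util.ArrayList");
        System.out.println(registry.lookup("hiding_transformer").getClass().getName());
    }
}
```

A practical consequence is that the class named in transformer.json needs to be instantiable by reflection, and the jar containing it has to sit in the plugin directory that the assembly descriptor writes to.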
Maven plugins
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<finalName>datax</finalName>
<descriptors>
<descriptor>package.xml</descriptor>
</descriptors>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
<encoding>${project-sourceEncoding}</encoding>
</configuration>
</plugin>
</plugins>
</build>
package.xml
<assembly
xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
<id></id>
<formats>
<format>dir</format>
</formats>
<includeBaseDirectory>false</includeBaseDirectory>
<fileSets>
<fileSet>
<directory>src/main/resources</directory>
<includes>
<include>transformer.json</include>
<include>plugin_job_template.json</include>
</includes>
<outputDirectory>plugin/transformer/hiding_transformer</outputDirectory>
</fileSet>
<fileSet>
<directory>target/</directory>
<includes>
<include>hidingtransformer-1.0-SNAPSHOT.jar</include>
</includes>
<outputDirectory>plugin/transformer/hiding_transformer</outputDirectory>
</fileSet>
</fileSets>
<dependencySets>
<dependencySet>
<useProjectArtifact>false</useProjectArtifact>
<outputDirectory>plugin/transformer/hiding_transformer/libs</outputDirectory>
<scope>runtime</scope>
</dependencySet>
</dependencySets>
</assembly>
Packaging
mvn clean package -DskipTests assembly:assembly
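Adding the local transformer to DataX
Common issues
- On Windows 10, run datax.py with Python 2
- If the cmd output is garbled, switch cmd to UTF-8 before running by executing chcp 65001
- To use the mongodbreader plugin over an external network, download its source, change the connection method, rebuild it, and place it under datax/plugin/reader
- To write to elasticsearch, download the source and compile the plugin yourself, then place it under datax/plugin/writer
- Reading from elasticsearch is not supported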