一、前言
地址:Kettle中文网
Kettle是一款国外开源的ETL工具,纯java编写,可以在Window、Linux、Unix上运行,绿色无需安装,数据抽取高效稳定。
Kettle 中文名称叫水壶,该项目的主程序员MATT 希望把各种数据放到一个壶里,然后以一种指定的格式流出。
Kettle这个ETL工具集,它允许你管理来自不同数据库的数据,通过提供一个图形化的用户环境来描述你想做什么,而不是你想怎么做。
Kettle中有两种脚本文件,transformation和job,transformation完成针对数据的基础转换,job则完成整个工作流的控制。
Kettle(现在已经更名为PDI,Pentaho Data Integration-Pentaho数据集成)。
二、环境
1.JDK1.8
2.Kettle8.3
三、项目结构
3.1 修改pom文件
说明:kettle用到的部分包依赖(可从kettle客户端lib包拷贝,可自行打到自己maven仓库,因是测试案例本人是本地引用)
<!-- Spring boot 父引用-->
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.6.RELEASE</version>
</parent>
<!-- Spring boot 核心web 整合了spring +springMVC + mybatis-->
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>com.jcraft</groupId>
<artifactId>jsch</artifactId>
<version>0.1.46</version>
</dependency>
<dependency>
<groupId>com.google.common</groupId>
<artifactId>guava</artifactId>
<version>17.0</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/guava-17.0.jar</systemPath>
</dependency>
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.6</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.10</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/commons-codec-1.10.jar</systemPath>
</dependency>
<dependency>
<groupId>org.pentaho.di</groupId>
<artifactId>kettle-core</artifactId>
<version>8.3.0.0-371</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/kettle-core-8.3.0.0-371.jar</systemPath>
</dependency>
<dependency>
<groupId>org.pentaho.di</groupId>
<artifactId>kettle-engine</artifactId>
<version>8.3.0.0-371</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/kettle-engine-8.3.0.0-371.jar</systemPath>
</dependency>
<dependency>
<groupId>org.pentaho.metastore</groupId>
<artifactId>metastore</artifactId>
<version>8.3.0.0-371</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/metastore-8.3.0.0-371.jar</systemPath>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-vfs2</artifactId>
<version>2.2</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/commons-vfs2-2.2.jar</systemPath>
</dependency>
<dependency>
<groupId>net.sourceforge</groupId>
<artifactId>jtds</artifactId>
<version>1.3.3</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/jcifs-1.3.3.jar</systemPath>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-logging</artifactId>
<version>1.1.3</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/commons-logging-1.1.3.jar</systemPath>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-io</artifactId>
<version>2.2</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/commons-io-2.2.jar</systemPath>
</dependency>
<dependency>
<groupId>oracle</groupId>
<artifactId>ojdbc8</artifactId>
<version>8.0</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/ojdbc8.jar</systemPath>
</dependency>
</dependencies>
3.2 Kettle工具类
public class KettleUtils {
/**
* 执行ktr文件
* @param filename
* @param params
* @return
*/
public static void runKtr(String filename, Map<String, String> params, String dirPath) {
try {
KettleEnvironment.init();
TransMeta tm = new TransMeta(dirPath + File.separator + filename);
Trans trans = new Trans(tm);
if (params != null) {
Iterator<Map.Entry<String, String>> entries = params.entrySet().iterator();
while (entries.hasNext()) {
Map.Entry<String, String> entry = entries.next();
trans.setParameterValue(entry.getKey(),entry.getValue());
}
}
trans.execute(null);
trans.waitUntilFinished();
} catch (Exception e) {
e.printStackTrace();
}
}
/**
* 执行kjb文件
* @param filename
* @param params
* @return
*/
public static void runKjb(String filename, Map<String, String> params, String dirPath){
try {
KettleEnvironment.init();
JobMeta jm = new JobMeta(dirPath + File.separator + filename, null);
Job job = new Job(null, jm);
if (params != null) {
Iterator<Map.Entry<String, String>> entries = params.entrySet().iterator();
while (entries.hasNext()) {
Map.Entry<String, String> entry = entries.next();
job.setVariable(entry.getKey(), entry.getValue());
}
}
job.start();
job.waitUntilFinished();
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] args) throws Exception {
ClassPathResource classPathResource = new ClassPathResource("kettle");
System.out.println("classPathResource:" + classPathResource.getFile().getPath());
runKjb("test_kettle.kjb", null, classPathResource.getFile().getPath());
}
四、Kettle脚本
说明:转换ktr脚本和作业kjb脚本由可视化kettle先编写好
五、测试
说明:工具类简单测试