[原创]KETTLE JAVA API 实战记录

    最近有个项目需要使用Kettle,但是,经过多方查找,却发现相关的资料和介绍极少,于是根据kettle的文档,费尽周折,终于完成了这篇《KETTLE JAVA API 实战记录》,为避免更多的人走弯路,现在发布出来,供大家参考。

KETTLE JAVA API 实战记录

陈海青(www.chq.name

2006.07.20

版权所有:陈海青(joson chen) www.chq.name
Copyright (C)  陈海青(joson chen) www.chq.name All Rights Reserved
转载或引用本网版权所有之内容须注明“转自(或引自)chq.name (Junit fans)”字样,

并标明本网站网址http://www.chq.name

 

前言:    最近有个项目需要使用Kettle,但是,经过多方查找,却发现相关的资料和介绍极少,于是根据kettle的文档,费尽周折,终于完成了这篇《KETTLE JAVA API 实战记录》,为避免更多的人走弯路,现在发布出来,供大家参考。

为什么要用KettleKETTLE JAVA API

Kettle是什么?kettle是一个开源ETL工具。kettle提供了基于java的图形化界面使用很方便,kettleETL工具集合也比较多,常用的ETL工具都包含了。

为什么使用KETTLE JAVA API:就像kettle文档所说:KETTLE JAVA API : Program your own Kettle transformationkettle提供了基于JAVA的脚步编写功能可以灵活地自定义ETL过程,使自行定制、批量处理等成为可能,这才是一个程序员需要做的工作,而不仅是象使用word一样操作kettle用户界面。

KETTLE JAVA API 实战操作记录:

一、         搭建环境 :http://www.kettle.be网站下载kettle的源码包,加压缩,例如解压缩到d:kettle目录

二、         打开eclipse,新建一个项目,要使用jdk1.5.0,因为kettle的要使用System.getenv(),只有在jdk1.5.0才被支持。提起getenv(),好像有一段几起几落的记录,曾一度被抛弃,现在又被jdk1.5支持了.

三、         建一个class : TransBuilder.java,可以把d:kettle extraTransBuilder.java的内容原样拷贝到你的TransBuilder.java里。

四、         根据需要编辑源码。并需要对原程序进行如下修改,在头部增加:

import org.eclipse.swt.dnd.Transfer;

//这个包被遗漏了,原始位置kettle根目录libswtwin32swt.jar

//add by chq(www.chq.name) on  2006.07.20

(后来发现,不必加这个引用,因为编译时不需要)

五、         编译准备,在eclipse中增加jar包,主要包括(主要依据extraTransBuilder.bat):

libkettle.jar
libextCacheDB.jar
libextSQLBaseJDBC.jar
libextactivation.jar
libextdb2jcc.jar
libextdb2jcc_license_c.jar
libextedtftpj-1.4.5.jar
libextfirebirdsql-full.jar
libextfirebirdsql.jar
libextgis-shape.jar
libexthsqldb.jar
libextifxjdbc.jar
libextjavadbf.jar
libextjconn2.jar
libextjs.jar
libextjt400.jar
libextjtds-1.1.jar
libextjxl.jar
libextktable.jar
libextlog4j-1.2.8.jar
libextmail.jar
libextmysql-connector-java-3.1.7-bin.jar
libextojdbc14.jar
libextorai18n.jar
libextpg74.215.jdbc3.jar
libextedbc.jar

(
注意 :下面这个包被遗漏了,要加上。原始位置kettle根目录libswtwin32swt.jar)
libswtwin32swt.jar

六、         编译成功后,准备运行

为使程序不必登陆就可以运行,需要设置环境署文件:kettle.properties,位置在用户目录里,一般在 Documents and Settings用户.kettle,主要内容如下:

KETTLE_REPOSITORY=kettle@m80

KETTLE_USER=admin

KETTLE_PASSWORD=admin

七、         好了,现在可以运行一下了,看看数据是不是已经拷贝到目标表了。

以下是运行时的控制台信息输出:

Image

下面是自动生成的Transformation :

Image

以下为修改后的程序源码:


package name.chq.test;

 

import java.io.DataOutputStream;

import java.io.File;

import java.io.FileOutputStream;

 

import be.ibridge.kettle.core.Const;

import be.ibridge.kettle.core.LogWriter;

import be.ibridge.kettle.core.NotePadMeta;

import be.ibridge.kettle.core.database.Database;

import be.ibridge.kettle.core.database.DatabaseMeta;

import be.ibridge.kettle.core.exception.KettleException;

import be.ibridge.kettle.core.util.EnvUtil;

import be.ibridge.kettle.trans.StepLoader;

import be.ibridge.kettle.trans.Trans;

import be.ibridge.kettle.trans.TransHopMeta;

import be.ibridge.kettle.trans.TransMeta;

import be.ibridge.kettle.trans.step.StepMeta;

import be.ibridge.kettle.trans.step.StepMetaInterface;

import be.ibridge.kettle.trans.step.selectvalues.SelectValuesMeta;

import be.ibridge.kettle.trans.step.tableinput.TableInputMeta;

import be.ibridge.kettle.trans.step.tableoutput.TableOutputMeta;

 

 

//这个包被遗漏了,原始位置kettle根目录libswtwin32swt.jar

//add by chq(www.chq.name) on  2006.07.20

//import org.eclipse.swt.dnd.Transfer;

 

/**

 * Class created to demonstrate the creation of transformations on-the-fly.

 *

 * @author Matt

 *

 */

public class TransBuilder

{

    public static final String[] databasesXML = {

        "<?xml version="1.0" encoding="UTF-8"?>" +

        "<connection>" +

          "<name>target</name>" +

          "<server>192.168.17.35</server>" +

          "<type>ORACLE</type>" +

                     "<access>Native</access>" +

                     "<database>test1</database>" +

                     "<port>1521</port>" +

                     "<username>testuser</username>" +

                     "<password>pwd</password>" +

                     "<servername/>" +

                     "<data_tablespace/>" +

                     "<index_tablespace/>" +

                     "<attributes>" +

                       "<attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>" +

                       "<attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>" +

                          "<attribute><code>PORT_NUMBER</code><attribute>1521</attribute></attribute>" +

                            "</attributes>" +

                       "</connection>" ,

        

        "<?xml version="1.0" encoding="UTF-8"?>" +

                         "<connection>" +

                                "<name>source</name>" +

                                "<server>192.168.16.12</server>" +

                                "<type>ORACLE</type>" +

                                "<access>Native</access>" +

                                "<database>test2</database>" +

                                "<port>1521</port>" +

                                "<username>testuser</username>" +

                                "<password>pwd2</password>" +

                                "<servername/>" +

                                "<data_tablespace/>" +

                                "<index_tablespace/>" +

                                "<attributes>" +

                                    "<attribute><code>EXTRA_OPTION_MYSQL.defaultFetchSize</code><attribute>500</attribute></attribute>" +

                                    "<attribute><code>EXTRA_OPTION_MYSQL.useCursorFetch</code><attribute>true</attribute></attribute>" +

                                       "<attribute><code>PORT_NUMBER</code><attribute>1521</attribute></attribute>" +

                                "</attributes>" +

                         "</connection>"

    };

 

    /**

     * Creates a new Transformation using input parameters such as the tablename to read from.

     * @param transformationName The name of the transformation

     * @param sourceDatabaseName The name of the database to read from

     * @param sourceTableName The name of the table to read from

     * @param sourceFields The field names we want to read from the source table

     * @param targetDatabaseName The name of the target database

     * @param targetTableName The name of the target table we want to write to

     * @param targetFields The names of the fields in the target table (same number of fields as sourceFields)

     * @return A new transformation

     * @throws KettleException In the rare case something goes wrong

     */

    public static final TransMeta buildCopyTable(

       String transformationName,String sourceDatabaseName, String sourceTableName,

       String[] sourceFields, String targetDatabaseName, String targetTableName,

       String[] targetFields)

      throws KettleException

    {

        LogWriter log = LogWriter.getInstance();

        EnvUtil.environmentInit();

        try

        {

            //

            // Create a new transformation...

            //

            TransMeta transMeta = new TransMeta();

            transMeta.setName(transformationName);

           

            // Add the database connections

 

            for (int i=0;i<databasesXML.length;i++)

            {

                DatabaseMeta databaseMeta = new DatabaseMeta(databasesXML[i]);

                transMeta.addDatabase(databaseMeta);

            }

           

            DatabaseMeta sourceDBInfo = transMeta.findDatabase(sourceDatabaseName);

            DatabaseMeta targetDBInfo = transMeta.findDatabase(targetDatabaseName);

 

           

            //

            // Add a note

            //

            String note = "Reads information from table [" + sourceTableName+ "] on database ["

                            + sourceDBInfo + "]" + Const.CR;

            note += "After that, it writes the information to table [" + targetTableName + "] on database ["

                            + targetDBInfo + "]";

            NotePadMeta ni = new NotePadMeta(note, 150, 10, -1, -1);

            transMeta.addNote(ni);

 

            //

            // create the source step...

            //

            String fromstepname = "read from [" + sourceTableName + "]";

            TableInputMeta tii = new TableInputMeta();

            tii.setDatabaseMeta(sourceDBInfo);

            String selectSQL = "SELECT "+Const.CR;

            for (int i=0;i<sourceFields.length;i++)

            {

            /* modi by chq(www.chq.name): use * to replace the fields,经分析,以下语句可以处理‘* */

                if (i>0)

                     selectSQL+=", ";

                else selectSQL+="  ";

              

                selectSQL+=sourceFields[i]+Const.CR;

            }

            selectSQL+="FROM "+sourceTableName;

            tii.setSQL(selectSQL);

 

            StepLoader steploader = StepLoader.getInstance();

 

            String fromstepid = steploader.getStepPluginID(tii);

            StepMeta fromstep = new StepMeta(log, fromstepid, fromstepname, (StepMetaInterface) tii);

            fromstep.setLocation(150, 100);

            fromstep.setDraw(true);

            fromstep.setDescription("Reads information from table [" + sourceTableName

                                     + "] on database [" + sourceDBInfo + "]");

            transMeta.addStep(fromstep);

 

            //

            // add logic to rename fields

            // Use metadata logic in SelectValues, use SelectValueInfo...

            //

            /* 不必改名或映射 add by chq(www.chq.name) on 2006.07.20

            SelectValuesMeta svi = new SelectValuesMeta();

            svi.allocate(0, 0, sourceFields.length);

            for (int i = 0; i < sourceFields.length; i++)

            {

                svi.getMetaName()[i] = sourceFields[i];

                svi.getMetaRename()[i] = targetFields[i];

            }

 

            String selstepname = "Rename field names";

            String selstepid = steploader.getStepPluginID(svi);

            StepMeta selstep = new StepMeta(log, selstepid, selstepname, (StepMetaInterface) svi);

            selstep.setLocation(350, 100);

            selstep.setDraw(true);

            selstep.setDescription("Rename field names");

            transMeta.addStep(selstep);

 

            TransHopMeta shi = new TransHopMeta(fromstep, selstep);

            transMeta.addTransHop(shi);

            fromstep = selstep; //设定了新的起点 by chq(www.chq.name) on 2006.07.20

            */

            //

            // Create the target step...

            //

            //

            // Add the TableOutputMeta step...

            //

            String tostepname = "write to [" + targetTableName + "]";

            TableOutputMeta toi = new TableOutputMeta();

            toi.setDatabase(targetDBInfo);

            toi.setTablename(targetTableName);

            toi.setCommitSize(200);

            toi.setTruncateTable(true);

 

            String tostepid = steploader.getStepPluginID(toi);

            StepMeta tostep = new StepMeta(log, tostepid, tostepname, (StepMetaInterface) toi);

            tostep.setLocation(550, 100);

            tostep.setDraw(true);

            tostep.setDescription("Write information to table [" + targetTableName + "] on database [" + targetDBInfo + "]");

            transMeta.addStep(tostep);

 

            //

            // Add a hop between the two steps...

            //

            TransHopMeta hi = new TransHopMeta(fromstep, tostep);

            transMeta.addTransHop(hi);

 

            // OK, if we're still here: overwrite the current transformation...

            return transMeta;

        }

        catch (Exception e)

        {

            throw new KettleException("An unexpected error occurred creating the new transformation", e);

        }

    }

 

    /**

     * 1) create a new transformation

     * 2) save the transformation as XML file

     * 3) generate the SQL for the target table

     * 4) Execute the transformation

     * 5) drop the target table to make this program repeatable

     *

     * @param args

     */

    public static void main(String[] args) throws Exception

    {

       EnvUtil.environmentInit();

        // Init the logging...

        LogWriter log = LogWriter.getInstance("TransBuilder.log", true, LogWriter.LOG_LEVEL_DETAILED);

       

        // Load the Kettle steps & plugins

        StepLoader stloader = StepLoader.getInstance();

        if (!stloader.read())

        {

            log.logError("TransBuilder",  "Error loading Kettle steps & plugins... stopping now!");

            return;

        }

       

        // The parameters we want, optionally this can be

        String fileName = "NewTrans.xml";

        String transformationName = "Test Transformation";

        String sourceDatabaseName = "source";

        String sourceTableName = "testuser.source_table";

        String sourceFields[] = {

               "*"

               };

 

        String targetDatabaseName = "target";

        String targetTableName = "testuser.target_table";

        String targetFields[] = {

               "*"

               };

 

       

        // Generate the transformation.

        TransMeta transMeta = TransBuilder.buildCopyTable(

                transformationName,

                sourceDatabaseName,

                sourceTableName,

                sourceFields,

                targetDatabaseName,

                targetTableName,

                targetFields

                );

       

        // Save it as a file:

        String xml = transMeta.getXML();

        DataOutputStream dos = new DataOutputStream(new FileOutputStream(new File(fileName)));

        dos.write(xml.getBytes("UTF-8"));

        dos.close();

        System.out.println("Saved transformation to file: "+fileName);

 

        // OK, What's the SQL we need to execute to generate the target table?

        String sql = transMeta.getSQLStatementsString();

       

        // Execute the SQL on the target table:

        Database targetDatabase = new Database(transMeta.findDatabase(targetDatabaseName));

        targetDatabase.connect();

        targetDatabase.execStatements(sql);

       

        // Now execute the transformation...

        Trans trans = new Trans(log, transMeta);

        trans.execute(null);

        trans.waitUntilFinished();

       

        // For testing/repeatability, we drop the target table again

        /* modi by chq(www.chq.name) on  2006.07.20 不必删表

        //targetDatabase.execStatement("drop table "+targetTableName);

        targetDatabase.disconnect();

    }

 

 

}

 

 

相关文档:

 

[翻译]KETTLE JAVA API :编程定制自己的Kettle转换(transformation)

 

 

HOW TO use KETTLE JAVA API (English)

Joson chen(http://www.chq.name

2006.07.20

step1 :down load ,and unzip the kettle’s src zip from http://www.kettle.be.

step2 :use eclipse ,new a project ,should use jdk1.5.0,as kettle use the System.getenv(),which support by jdk1.5.0 only.

Step 3 :import the TransBuilder.java into project,which in kettle extraTransBuilder.java.

Step 4 :add jar into project,as follow(begin with the kettle’s home) :

  • libkettle.jar
  • libextCacheDB.jar
  • libextSQLBaseJDBC.jar
  • libextactivation.jar
  • libextdb2jcc.jar
  • libextdb2jcc_license_c.jar
  • libextedtftpj-1.4.5.jar
  • libextfirebirdsql-full.jar
  • libextfirebirdsql.jar
  • libextgis-shape.jar
  • libexthsqldb.jar
  • libextifxjdbc.jar
  • libextjavadbf.jar
  • libextjconn2.jar
  • libextjs.jar
  • libextjt400.jar
  • libextjtds-1.1.jar
  • libextjxl.jar
  • libextktable.jar
  • libextlog4j-1.2.8.jar
  • libextmail.jar
  • libextmysql-connector-java-3.1.7-bin.jar
  • libextojdbc14.jar
  • libextorai18n.jar
  • libextpg74.215.jdbc3.jar
  • libextedbc.jar
  • libswtwin32swt.jar
    • (notice :shoule add the libswtwin32swt.jar,as above)

step 5,compile the project

step 6 :set the user ‘s env(on windows) :

edit a file named :kettle.properties,place to user’s home ,commonly Documents and Settingsuser.kettle,contents(for examples):

  • KETTLE_REPOSITORY=kettle@test
  • KETTLE_USER=admin
  • KETTLE_PASSWORD=admin

Step 7 :ok ,now all get ready,Run it , and check you target table now, follow is the output  . --

 

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值