Developing a Custom Greenplum (Source) Component for StreamSets

iT_B_OY

Last modified: 2022-02-26 13:49:03

First published: 2022-02-26 12:02:02

Article link: https://blog.csdn.net/iT_B_OY/article/details/123147685

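Greenplum is not among the relational databases that StreamSets supports.

This post adds Greenplum to StreamSets as both a data source (origin) and a target (destination).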

1. First, generate the custom stage template

mvn archetype:generate -DarchetypeGroupId=com.streamsets \
-DarchetypeArtifactId=streamsets-datacollector-stage-lib-tutorial \
-DarchetypeVersion=3.13.0 -DinteractiveMode=true

Prerequisite: you need a locally compiled copy of the StreamSets source, otherwise the command fails. I built from the 3.13.0 source.

2. Open the generated project; the code looks like this (I renamed the classes)

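Classes with a D in the name (such as DTarget) mainly define what is shown in the UI; the classes without the D contain the actual implementation.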

3. Source-side code: GpDSource (mainly the produce method)

  @Override
  public String produce(String lastSourceOffset, int maxBatchSize, BatchMaker batchMaker) throws StageException {
    String urlConfig = getConfig();
    String sql = getSqlQuery();
    String username = getUserName();
    String password = getPassWord();

    // The offset is the running count of records emitted so far.
    long nextSourceOffset = 0;
    if (lastSourceOffset != null) {
      nextSourceOffset = Long.parseLong(lastSourceOffset);
    }

    try {
      // Load the Greenplum JDBC driver; its jar must be on the stage library classpath.
      Class.forName("com.pivotal.jdbc.GreenplumDriver");
    } catch (ClassNotFoundException e) {
      // Errors.SAMPLE_00 is assumed to be the error code from the generated template; use your own.
      throw new StageException(Errors.SAMPLE_00, "com.pivotal.jdbc.GreenplumDriver not found: " + e);
    }

    // try-with-resources closes the connection, statement and result set even when a query fails.
    try (Connection con = DriverManager.getConnection(urlConfig, username, password);
         PreparedStatement stmt = con.prepareStatement(sql);
         ResultSet rset = stmt.executeQuery()) {
      ResultSetMetaData rsmd = rset.getMetaData();
      int columnCount = rsmd.getColumnCount();

      while (rset.next()) {
        Record record = getContext().createRecord("some-id::" + nextSourceOffset);
        Map<String, Field> map = new HashMap<>();
        for (int i = 1; i <= columnCount; i++) {
          // Every column is read as a string and keyed by its column label.
          map.put(rsmd.getColumnLabel(i), Field.create(rset.getString(i)));
        }
        record.set(Field.create(map));
        batchMaker.addRecord(record);
        ++nextSourceOffset;
        ++noMoreDataRecordCount;
      }
      // Signal "no more data" once the result set is exhausted, not once per row.
      if (!getIsIncrementalMode()) {
        generateNoMoreDataEvent();
      }
    } catch (SQLException e) {
      // Surface the failure to the pipeline instead of printing it and continuing with null objects.
      throw new StageException(Errors.SAMPLE_00, e.toString());
    }
    return String.valueOf(nextSourceOffset);
  }
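
The produce method above returns a running record count as its offset, but it re-runs the whole query on every call and does not use maxBatchSize. As a minimal sketch of how an offset and a batch size can cut a result into batches, the class below uses an in-memory list as a stand-in for the query result. OffsetSketch and its produce signature are hypothetical and are not part of the StreamSets API; a real origin would more likely push the offset into the SQL itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an origin: the "table" is just a list of rows.
class OffsetSketch {
  // Copies at most maxBatchSize rows, starting at lastOffset, into batch
  // and returns the offset to resume from on the next call.
  static String produce(String lastOffset, int maxBatchSize, List<String> rows, List<String> batch) {
    long next = (lastOffset == null) ? 0 : Long.parseLong(lastOffset);
    int emitted = 0;
    while (next < rows.size() && emitted < maxBatchSize) {
      batch.add(rows.get((int) next));
      ++next;
      ++emitted;
    }
    return String.valueOf(next);
  }

  public static void main(String[] args) {
    List<String> rows = Arrays.asList("a", "b", "c", "d", "e");
    String offset = null;
    // Three calls with a batch size of 2 walk through all five rows.
    for (int call = 0; call < 3; call++) {
      List<String> batch = new ArrayList<>();
      offset = produce(offset, 2, rows, batch);
      System.out.println(batch + " -> offset " + offset);
    }
  }
}
```

Because the returned offset is the only state carried between calls, a restarted pipeline can resume from the last committed offset without re-emitting earlier rows.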

4. Modify the destination DTarget class (only the write method is shown; it determines how records are written to the database)

  public void write(Batch batch) throws StageException {
    Iterator<Record> batchIterator = batch.getRecords();
    // Operation codes as used here: 1 = insert, 2 = delete, 3 = update.
    int opCode = getJDBCOperationType().getCode();

    if (isBatch()) {
      // Batch mode: build one statement per record, then execute them together.
      List<JdbcUtil.Sql> sqls = new ArrayList<>();
      while (batchIterator.hasNext()) {
        JdbcUtil.Sql sql = buildSql(opCode, batchIterator.next());
        if (sql != null) {
          sqls.add(sql);
        }
      }
      try {
        JdbcUtil.execute(sqls);
      } catch (SQLException e) {
        error(e, null);
      }
    } else {
      // Record-by-record mode: a failure is reported against the record that caused it.
      while (batchIterator.hasNext()) {
        Record record = batchIterator.next();
        JdbcUtil.Sql sql = buildSql(opCode, record);
        if (sql == null) {
          continue;
        }
        try {
          JdbcUtil.execute(sql);
        } catch (SQLException e) {
          error(e, record);
        }
      }
    }
  }

  // Picks the statement builder for the configured operation; returns null for any other code.
  private JdbcUtil.Sql buildSql(int opCode, Record record) throws StageException {
    switch (opCode) {
      case 1: return getInsertSql(record);
      case 2: return getDeleteSql(record);
      case 3: return getUpdateSql(record);
      default: return null;
    }
  }
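
The post does not show getInsertSql, so the following is only a sketch of what such a builder might do: turn a record's field map into a parameterized INSERT statement plus an ordered parameter list. InsertSqlSketch, buildInsert and the table name are hypothetical and are not the author's implementation; the sketch also assumes that column names come from a trusted source, because they are concatenated into the SQL text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds "INSERT INTO t (c1, c2) VALUES (?, ?)" from a field map.
class InsertSqlSketch {
  // Appends the field values to params in column order and returns the SQL text.
  static String buildInsert(String table, Map<String, String> fields, List<String> params) {
    StringJoiner columns = new StringJoiner(", ");
    StringJoiner marks = new StringJoiner(", ");
    for (Map.Entry<String, String> e : fields.entrySet()) {
      columns.add(e.getKey());
      marks.add("?");          // values are bound later, never concatenated
      params.add(e.getValue());
    }
    return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the columns in insertion order.
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("id", "1");
    fields.put("name", "gp");
    List<String> params = new ArrayList<>();
    System.out.println(buildInsert("demo", fields, params));
    System.out.println(params);
  }
}
```

Binding the values through placeholders with PreparedStatement, rather than building them into the string, avoids quoting problems in the data.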

5. Then build your code with mvn clean install -DskipTests

6. Extract the archive from the target directory (the archive, not the jar) into the user-libs directory under your StreamSets installation

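7. Because this is custom code, add the following to etc/sdc-security.policy under the StreamSets directory (make sure the file path on the first line matches your library's directory)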

grant codebase "file://${sdc.dist.dir}/user-libs/samplestage/-" {
  permission java.io.FilePermission "<<ALL FILES>>", "read,execute";
  permission java.lang.RuntimePermission "getenv.*";
};

8. Start StreamSets
