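StreamSets does not list Greenplum among the relational databases it supports.
This post adds Greenplum support as both a data source (origin) and a target (destination).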
1. First, generate the custom stage development template
mvn archetype:generate -DarchetypeGroupId=com.streamsets \
-DarchetypeArtifactId=streamsets-datacollector-stage-lib-tutorial \
-DarchetypeVersion=3.13.0 -DinteractiveMode=true
Prerequisite: the StreamSets source code must already be compiled locally, otherwise this command fails. I built from the 3.13.0 source.
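2. Once the template has been downloaded, open it to see the generated classes (I renamed the classes in my copy).
Classes with a D in the name, such as DTarget, mainly define what is shown in the UI; the classes without the D mainly hold the implementation.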
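The split between the two kinds of class can be sketched in plain Java. This is only an illustration under assumptions: the class names GpSourceSketch and GpDSourceSketch and the placeholder JDBC URL are hypothetical, and the StreamSets annotations that a real D class carries are left out so the sketch compiles on its own.

```java
// Implementation side: contains the logic and reads its settings through abstract getters.
abstract class GpSourceSketch {
    abstract String getConfig();
    abstract String getSqlQuery();

    // Stand-in for real work such as produce(); it only combines the settings.
    String describe() {
        return "query [" + getSqlQuery() + "] against " + getConfig();
    }
}

// UI side: holds the configurable fields and exposes them through the getters.
// In a real stage these fields would be annotated so they appear in the pipeline editor.
final class GpDSourceSketch extends GpSourceSketch {
    String config = "jdbc:example://host/db"; // placeholder URL, not a real driver format
    String sqlQuery = "SELECT 1";

    @Override
    String getConfig() {
        return config;
    }

    @Override
    String getSqlQuery() {
        return sqlQuery;
    }
}
```

With this layout the implementation class never touches UI concerns, which is why calls such as getConfig() and getSqlQuery() appear in the source code in the next step.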
3. Source side code: GpDSource (mainly the produce method)
@Override
public String produce(String lastSourceOffset, int maxBatchSize, BatchMaker batchMaker) throws StageException {
    System.out.println("come into produce");
    // Values entered in the stage configuration.
    String urlConfig = getConfig();
    String sql = getSqlQuery();
    String username = getUserName();
    String password = getPassWord();

    // Continue counting from the offset returned by the previous call, if any.
    long nextSourceOffset = 0;
    if (lastSourceOffset != null) {
        nextSourceOffset = Long.parseLong(lastSourceOffset);
    }
    int numRecords = 0;

    try {
        // Load the Greenplum driver, connect, and run the configured query.
        Class.forName("com.pivotal.jdbc.GreenplumDriver");
        con1 = DriverManager.getConnection(urlConfig, username, password);
        stmt = con1.prepareStatement(sql);
        System.out.println(sql);
        rset = stmt.executeQuery();

        ResultSetMetaData rsmd = rset.getMetaData();
        int columnCount = rsmd.getColumnCount();

        while (rset.next()) {
            Record record = getContext().createRecord("some-id::" + nextSourceOffset);
            // Every column is read as a string and stored under its column label.
            Map<String, Field> map = new HashMap<>();
            for (int i = 1; i <= columnCount; i++) {
                String columnName = rsmd.getColumnLabel(i);
                String colValue = rset.getString(i);
                System.out.println(columnName + "---" + colValue);
                map.put(columnName, Field.create(colValue));
            }
            record.set(Field.create(map));
            batchMaker.addRecord(record);
            ++nextSourceOffset;
            ++numRecords;
            ++noMoreDataRecordCount;
            if (!getIsIncrementalMode()) {
                generateNoMoreDataEvent();
            }
        }
    } catch (ClassNotFoundException e) {
        System.out.println("com.pivotal.jdbc.GreenplumDriver not found");
        e.printStackTrace();
    } catch (Exception e) {
        // Errors are only printed here; a production stage would probably report them instead.
        e.printStackTrace();
    } finally {
        // Close in reverse order; the null checks cover a failed connection or query.
        try {
            if (rset != null) {
                rset.close();
            }
            if (stmt != null) {
                stmt.close();
            }
            if (con1 != null) {
                con1.close();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
    return String.valueOf(nextSourceOffset);
}
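The offset bookkeeping in produce is easy to overlook among the JDBC calls, so here it is on its own as a minimal sketch. OffsetSketch and its two methods are hypothetical helpers introduced for illustration, not part of the StreamSets API; they only mirror the rule used above, where a null offset means the first call and the returned string is the running record count.

```java
// Hypothetical helper that mirrors the offset handling in produce().
final class OffsetSketch {
    private OffsetSketch() {
    }

    // A null offset means the pipeline is starting from scratch.
    static long parse(String lastSourceOffset) {
        return lastSourceOffset == null ? 0L : Long.parseLong(lastSourceOffset);
    }

    // Returns the offset string to hand back after emitting recordsRead records.
    static String advance(String lastSourceOffset, int recordsRead) {
        return String.valueOf(parse(lastSourceOffset) + recordsRead);
    }
}
```

For example, OffsetSketch.advance(null, 3) yields "3", and feeding that back in with two more records yields "5".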
4. Modify the target class DTarget (only the write method is shown; this method determines how records are written to the database)
public void write(Batch batch) throws StageException {
    Iterator<Record> batchIterator = batch.getRecords();
    // Operation codes used below: 1 = insert, 3 = update, 2 = delete.
    if (isBatch()) {
        // Batch mode: build one statement per record, then execute them together.
        List<JdbcUtil.Sql> sqls = new ArrayList<>();
        System.out.println("code:" + getJDBCOperationType().getCode());
        if (getJDBCOperationType().getCode() == 1) {
            while (batchIterator.hasNext()) {
                sqls.add(getInsertSql(batchIterator.next()));
            }
        }
        if (getJDBCOperationType().getCode() == 3) {
            while (batchIterator.hasNext()) {
                sqls.add(getUpdateSql(batchIterator.next()));
            }
        }
        if (getJDBCOperationType().getCode() == 2) {
            while (batchIterator.hasNext()) {
                sqls.add(getDeleteSql(batchIterator.next()));
            }
        }
        try {
            JdbcUtil.execute(sqls);
        } catch (SQLException e) {
            // The whole batch failed, so there is no single record to attach.
            error(e, null);
        }
    } else {
        // Record mode: execute one statement per record and report failures per record.
        if (getJDBCOperationType().getCode() == 1) {
            while (batchIterator.hasNext()) {
                Record record = batchIterator.next();
                try {
                    JdbcUtil.execute(getInsertSql(record));
                } catch (SQLException e) {
                    error(e, record);
                }
            }
        }
        if (getJDBCOperationType().getCode() == 3) {
            while (batchIterator.hasNext()) {
                Record record = batchIterator.next();
                try {
                    JdbcUtil.execute(getUpdateSql(record));
                } catch (SQLException e) {
                    error(e, record);
                }
            }
        }
        if (getJDBCOperationType().getCode() == 2) {
            while (batchIterator.hasNext()) {
                Record record = batchIterator.next();
                try {
                    JdbcUtil.execute(getDeleteSql(record));
                } catch (SQLException e) {
                    error(e, record);
                }
            }
        }
    }
}
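The write method dispatches on an operation code and delegates to getInsertSql, getUpdateSql and getDeleteSql, which are not shown. The following is a rough sketch of what such a dispatch could look like, assuming parameterized statements and a single key column. SqlSketch, its build method, and the table and column arguments are all hypothetical stand-ins; the real helpers work on Record objects and may be written quite differently.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for the getInsertSql / getUpdateSql / getDeleteSql helpers.
final class SqlSketch {
    // Same codes as the checks in write(): 1 = insert, 2 = delete, 3 = update.
    static final int INSERT = 1;
    static final int DELETE = 2;
    static final int UPDATE = 3;

    private SqlSketch() {
    }

    // Builds a parameterized statement for the given operation code.
    static String build(int code, String table, String keyColumn, Map<String, String> fields) {
        switch (code) {
            case INSERT:
                return insert(table, fields);
            case DELETE:
                return "DELETE FROM " + table + " WHERE " + keyColumn + " = ?";
            case UPDATE:
                return update(table, keyColumn, fields);
            default:
                throw new IllegalArgumentException("Unsupported operation code: " + code);
        }
    }

    // One placeholder per column, in the map's iteration order.
    private static String insert(String table, Map<String, String> fields) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner marks = new StringJoiner(", ");
        for (String column : fields.keySet()) {
            columns.add(column);
            marks.add("?");
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }

    // Sets every non-key column and matches the row on the key column.
    private static String update(String table, String keyColumn, Map<String, String> fields) {
        StringJoiner assignments = new StringJoiner(", ");
        for (String column : fields.keySet()) {
            if (!column.equals(keyColumn)) {
                assignments.add(column + " = ?");
            }
        }
        return "UPDATE " + table + " SET " + assignments + " WHERE " + keyColumn + " = ?";
    }
}
```

A single switch makes it explicit that exactly one branch runs per batch and that an unknown code fails loudly rather than silently writing nothing. Passing an ordered map such as LinkedHashMap keeps the column order stable.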
5. Now build your code with mvn clean install -DskipTests
6. Extract the archive from the target directory (the compressed archive, not the jar) into the user-libs directory under your StreamSets installation.
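7. Because this is custom code, you also need to add the following to etc/sdc-security.policy under the StreamSets directory (make sure the file path on the first line matches your library's directory)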
grant codebase "file://${sdc.dist.dir}/user-libs/samplestage/-" {
permission java.io.FilePermission "<<ALL FILES>>", "read,execute";
permission java.lang.RuntimePermission "getenv.*";
};
8. Start StreamSets and the new stages are ready to use.