Business background:
Every financial institution has lending scenarios. Expected repayments, actual repayments, and the currently available lending balance are normally obtained through offline synchronization and batch aggregation; when users want to see these figures in real time, the usual answer is a real-time dashboard. This is a simple, common class of scenario.
Loan/repayment transaction table
The customer master table lives in Oracle and can be synchronized into Kafka via Flink CDC. The corresponding fields in Kafka are:
CREATE TABLE KafkaTable (
`jiaoyije` BIGINT, -- transaction amount oracle:NUMBER(17) example: 210250
`jiaoyils` STRING, -- transaction serial number oracle:VARCHAR2(32) example: PL19122100000718
`jiaoyima` STRING, -- transaction code oracle:VARCHAR2(10) example: ln05; '3020' = disbursement, anything else = repayment
`yeziduan` STRING, -- balance field oracle:VARCHAR2(8) example: YSYJLIXI
`jiaoyirq` STRING, -- transaction date oracle:VARCHAR2(8) example: 20191221
`SHIJCHUO` BIGINT, -- timestamp oracle:NUMBER(19) example: 1576939296809
`ts` AS TO_TIMESTAMP_LTZ(`SHIJCHUO`, 3), -- the watermark column must be TIMESTAMP(3)/TIMESTAMP_LTZ(3), so convert the epoch milliseconds first
WATERMARK FOR `ts` AS `ts` - INTERVAL '30' MINUTES
) WITH (
'connector' = 'kafka',
'topic' = 'klnl_dkzhmx',
'properties.bootstrap.servers' = '10.125.221.118:9092',
'properties.group.id' = '20221019',
'scan.startup.mode' = 'earliest-offset',
'format' = 'canal-json'
)
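SHIJCHUO arrives as epoch milliseconds (Oracle NUMBER(19)). To serve as an event-time attribute it has to be converted to a timestamp, which in Flink SQL is TO_TIMESTAMP_LTZ(SHIJCHUO, 3); the conversion itself is plain epoch arithmetic. A Flink-free sketch (UTC assumed here; a real deployment would use the session time zone):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class EpochMillisDemo {
    // Convert an epoch-millisecond value (as stored in SHIJCHUO) to a
    // wall-clock timestamp, the same conversion TO_TIMESTAMP_LTZ(x, 3) performs.
    static LocalDateTime toTimestamp(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(toTimestamp(1576939296809L)); // 2019-12-21T14:41:36.809 (UTC)
    }
}
```

Note that the example value 1576939296809 lands on 2019-12-21, matching the jiaoyirq example above.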
Loan/repayment balance table
CREATE TABLE mysqlTable (
ACTUALDAY BIGINT, -- actual date, e.g. 20221021
LENDINGVALUE DOUBLE, -- amount disbursed
LOANVALUE DOUBLE, -- amount repaid
BALANCEVALUE DOUBLE, -- balance
PRIMARY KEY (ACTUALDAY) NOT ENFORCED
) WITH (
'connector' = 'jdbc',
'url' = 'jdbc:mysql://localhost:3306/mydatabase',
'table-name' = 'dws_calendar_now_out'
);
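Because ACTUALDAY is declared as the primary key, the JDBC connector writes in upsert mode: each new cumulative result for a day replaces that day's row rather than appending. The effect in miniature (plain Java keyed the same way; the figures are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertDemo {
    // Simulate an upsert sink keyed on ACTUALDAY: each new cumulative
    // result for the same day replaces the previous row instead of appending.
    static Map<Long, double[]> sinkRows() {
        Map<Long, double[]> sink = new HashMap<>();
        sink.put(20221021L, new double[]{10.0, 20.0, 9990.0}); // first window firing
        sink.put(20221021L, new double[]{30.0, 50.0, 9980.0}); // later firing, same key
        return sink;
    }

    public static void main(String[] args) {
        Map<Long, double[]> sink = sinkRows();
        System.out.println(sink.size());            // 1: still a single row for the day
        System.out.println(sink.get(20221021L)[2]); // 9980.0: the latest balance wins
    }
}
```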
Implementation path for this business scenario
A real-time synchronization component (typically LogMiner-based) replicates the Oracle business data into Kafka.
A Flink streaming job consumes the Kafka data, joins against the loan/repayment balance table as a lookup dimension table, adjusts the balance up and down according to the live loan/repayment records, and aggregates the day's disbursements and repayments per day.
Demo code
Since no real business data is available, and to make it easy to build a job that mimics the customer scenario, the demo uses a custom source and sink to produce the real-time dashboard effect.
public static void main(String[] args) {
EnvironmentSettings settings = EnvironmentSettings.inStreamingMode();
TableEnvironment tEnv = TableEnvironment.create(settings);
// run the demo with parallelism 1 so the printed results stay ordered
tEnv.getConfig().getConfiguration().setString("parallelism.default", "1");
tEnv.executeSql("CREATE TABLE source (\n" +
" jiaoyije BIGINT, \n" +
" jiaoyils STRING, \n" +
" jiaoyima STRING, \n" +
" yeziduan STRING, \n" +
" jiaoyirq STRING, \n" +
" SHIJCHUO TIMESTAMP(3)," +
" procTime AS PROCTIME()," +
" WATERMARK FOR SHIJCHUO AS SHIJCHUO - INTERVAL '0' MINUTES\n" +
") WITH (\n" +
" 'connector' = 'bank_credit'" +
")");
tEnv.executeSql("CREATE TABLE dim (\n" +
" ACTUALDAY BIGINT, \n" +
" LENDINGVALUE DOUBLE,\n" +
" LOANVALUE DOUBLE,\n" +
" BALANCEVALUE DOUBLE\n" +
") WITH (\n" +
" 'connector' = 'screen'" +
")");
tEnv.executeSql("CREATE TABLE print (\n" +
" window_start TIMESTAMP(3),\n" +
" window_end TIMESTAMP(3),\n" +
" ACTUALDAY_print BIGINT, \n" +
" LENDINGVALUE_print DOUBLE, \n" +
" LOANVALUE_print DOUBLE, \n" +
" BALANCEVALUE_print DOUBLE \n" +
") WITH (\n" +
" 'connector' = 'print'" +
")");
tEnv.executeSql("" +
"create view temp " +
" as " +
" select " +
" jiaoyije, \n" +
" jiaoyils, \n" +
" jiaoyima, \n" +
" yeziduan, \n" +
" jiaoyirq, \n" +
" SHIJCHUO,\n" +
" BALANCEVALUE\n" +
" from source left join dim FOR SYSTEM_TIME AS OF source.procTime" +
" on cast(DATE_FORMAT(TO_TIMESTAMP_LTZ((UNIX_TIMESTAMP(jiaoyirq,'yyyyMMdd') - 86400),0),'yyyyMMdd') as INT)=dim.ACTUALDAY");
/**
 * UNIX_TIMESTAMP(jiaoyirq,'yyyyMMdd') converts a date in the given format into seconds since 1970-01-01 00:00:00
 * UNIX_TIMESTAMP(jiaoyirq,'yyyyMMdd') - 86400 steps back one day; the result type is BIGINT
 */
tEnv.executeSql(
"insert into print select \n" +
" window_start," +
" window_end," +
" cast(jiaoyirq as INT) as ACTUALDAY_print,\n" +
" sum(case when jiaoyima='3020' then jiaoyije else 0 end) as LENDINGVALUE_print,\n" +
" sum(case when jiaoyima<>'3020' then jiaoyije else 0 end) as LOANVALUE_print,\n" +
" (BALANCEVALUE + sum(case when jiaoyima='3020' then jiaoyije else 0 end)" +
" - sum(case when jiaoyima<>'3020' then jiaoyije else 0 end)) as BALANCEVALUE_print\n" +
" FROM TABLE(CUMULATE(" +
" TABLE temp,\n" +
" DESCRIPTOR(SHIJCHUO),\n" +
" INTERVAL '10' SECOND,\n" +
" INTERVAL '1' DAY\n" +
" )) " +
" where yeziduan='ZHCHBJIN' \n" +
" and jiaoyirq = DATE_FORMAT(CURRENT_TIMESTAMP,'yyyyMMdd') " +
" group by jiaoyirq,window_start,window_end,BALANCEVALUE "
);
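The join key expression in the view (parse yyyyMMdd, go to epoch seconds, subtract 86400, format again) can be sanity-checked with plain java.time, no Flink required:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class PreviousDayDemo {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Mirrors the SQL join key:
    // DATE_FORMAT(TO_TIMESTAMP_LTZ(UNIX_TIMESTAMP(jiaoyirq,'yyyyMMdd') - 86400, 0), 'yyyyMMdd')
    // parse the day, convert to epoch seconds, subtract one day's worth, format again.
    static int previousDay(String yyyymmdd) {
        long epochSec = LocalDate.parse(yyyymmdd, FMT).atStartOfDay(ZoneOffset.UTC).toEpochSecond();
        return Integer.parseInt(Instant.ofEpochSecond(epochSec - 86400)
                .atZone(ZoneOffset.UTC).toLocalDate().format(FMT));
    }

    public static void main(String[] args) {
        System.out.println(previousDay("20191221")); // 20191220
        System.out.println(previousDay("20200301")); // 20200229: month and leap-day boundaries work out
    }
}
```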
Result:
The demo presets yesterday's balance at 10,000 yuan; every loan/repayment record generated today is for 10 yuan, and the data is designed so that repayments outnumber disbursements.
Output columns: window_start, window_end, current date, total disbursed today, total repaid today, live balance
live balance = 10000 + total disbursed today - total repaid today
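The expected numbers follow directly from the query's CASE WHEN sums. A tiny sketch of the same arithmetic (the sequence of transaction codes below is made up for illustration; only '3020' versus anything else matters):

```java
public class BalanceDemo {
    // Replicates the aggregation in the insert-into-print query:
    // jiaoyima = '3020' is a disbursement, everything else is a repayment,
    // and the live balance is yesterday's balance + disbursed - repaid.
    static double liveBalance(double yesterdayBalance, String[] jiaoyima, long amountPerRecord) {
        long disbursed = 0, repaid = 0;
        for (String code : jiaoyima) {
            if ("3020".equals(code)) disbursed += amountPerRecord;
            else repaid += amountPerRecord;
        }
        return yesterdayBalance + disbursed - repaid;
    }

    public static void main(String[] args) {
        // Three disbursements and five repayments of 10 yuan each:
        String[] codes = {"3020", "ln05", "ln05", "3020", "ln05", "ln05", "3020", "ln05"};
        System.out.println(liveBalance(10000.0, codes, 10)); // 10000 + 30 - 50 = 9980.0
    }
}
```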
Key technical points
1. Results must be aggregated per day and emitted at a regular interval.
This calls for the cumulative window (CUMULATE): it emits results periodically while the aggregate keeps accumulating within the window.
2. The live balance must be computed on top of the previous day's balance using the loan/repayment records.
This requires a lookup join to fetch the previous day's balance as the base for the calculation; the view above implements exactly that.
3. The SQL distinguishes loans from repayments with CASE WHEN, and all the date-handling functions used are built-ins.
In Flink SQL, the CUMULATE window function is a special window that computes a cumulative value from a fixed starting point up to the current point in time. It is very useful when you need real-time cumulative metrics, such as running PV and UV counts.
The CUMULATE syntax is:
CUMULATE(TABLE table_name, DESCRIPTOR(time_column), INTERVAL 'step', INTERVAL 'max_window_size')
table_name is the input table; time_column is the time attribute that defines the window; the first INTERVAL is the step by which the window end advances on each firing; the second INTERVAL is the maximum window size.
For example, to compute the PV and UV accumulated so far within each day, you can use a CUMULATE window like this:
SELECT
date_format(window_start, 'yyyy-MM-dd') cal_day,
behavior,
count(user_id) pv,
count(distinct user_id) uv
FROM TABLE(
CUMULATE(TABLE user_log, DESCRIPTOR(proc_time), INTERVAL '10' SECOND, INTERVAL '1' DAY)
) GROUP BY window_start, window_end, behavior
Here the CUMULATE window fires every 10 seconds based on the proc_time column and emits the PV and UV accumulated so far, until the day ends.
A key advantage of CUMULATE windows is that they emit an append stream: every output row is a new cumulative result, with none of the withdrawals a RETRACT stream produces. That makes them a good fit for real-time dashboards and monitoring systems, which need continuously updated figures.
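The firing pattern just described (window start pinned in place, window end advancing one step per firing, cumulative aggregate at each firing) can be illustrated without Flink at all. A minimal sketch, with a hypothetical 10-unit step and 30-unit maximum size:

```java
import java.util.ArrayList;
import java.util.List;

public class CumulateDemo {
    // Simulates CUMULATE(step, maxSize) over (timestamp, value) events:
    // the window start stays pinned at the period start, the window end grows
    // by one step per firing, and each firing emits the running sum so far.
    static List<String> cumulate(long[][] events, long step, long maxSize) {
        List<String> out = new ArrayList<>();
        for (long end = step; end <= maxSize; end += step) {
            long sum = 0;
            for (long[] e : events) {
                if (e[0] < end) sum += e[1]; // everything from the fixed start up to the moving end
            }
            out.add("[0," + end + ") sum=" + sum);
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] events = {{2, 5}, {12, 7}, {25, 3}};
        cumulate(events, 10, 30).forEach(System.out::println);
        // [0,10) sum=5
        // [0,20) sum=12
        // [0,30) sum=15
    }
}
```

Flink's implementation keeps incremental per-slice state instead of rescanning all events, but the emitted results follow the same pattern.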
Appendix
source table
Simulates the live loan/repayment records.
public class BankCreditDataGenFactory implements DynamicTableSourceFactory {
@Override
public DynamicTableSource createDynamicTableSource(DynamicTableFactory.Context context) {
return new BankCreditDataTableSource();
}
@Override
public String factoryIdentifier() {
return "bank_credit";
}
@Override
public Set<ConfigOption<?>> requiredOptions() {
return new HashSet<>();
}
@Override
public Set<ConfigOption<?>> optionalOptions() {
return new HashSet<>();
}
}
public class BankCreditDataTableSource implements ScanTableSource {
@Override
public ChangelogMode getChangelogMode() {
return ChangelogMode.insertOnly();
}
@Override
public ScanRuntimeProvider getScanRuntimeProvider(ScanContext runtimeProviderContext) {
return SourceFunctionProvider.of(new BankCreditDataSourceFunction(),true);
}
@Override
public DynamicTableSource copy() {
return new BankCreditDataTableSource();
}
@Override
public String asSummaryString() {
return this.getClass().getSimpleName();
}
}
public class BankCreditDataSourceFunction implements SourceFunction<RowData> {
private volatile boolean running = true;
private final Long[] jiaoyijine = new Long[]{
10L,10L,10L,10L,10L,10L,10L,10L,10L,10L
};
//TODO: handle duplicate consumption
private final String[] jiaoyiliushui = new String[]{
"PL19122100000718", "PL19122100000718", "PL19122100000718", "PL19122100000718", "PL19122100000718"
};
//'3020' means disbursement, anything else means repayment
private final String[] jiaoyima = new String[]{
"ln05", "3020", "3020", "3020", "ln05"
};
//balance field: ZHCHBJIN marks loan/repayment records, other values do not
private final String[] yueziduan = new String[]{
"ZHCHBJIN", "ZHCHBJIN", "ZHCHBJIN", "ZHCHBJIN", "AAAAAAAAA"
};
@Override
public void run(SourceContext<RowData> ctx) throws Exception {
// emits one row per second matching the source schema:
// jiaoyije BIGINT, jiaoyils STRING, jiaoyima STRING,
// yeziduan STRING, jiaoyirq STRING, SHIJCHUO TIMESTAMP(3)
Random random = new Random();
while (running) {
GenericRowData row = new GenericRowData(6);
row.setField(0, jiaoyijine[random.nextInt(jiaoyijine.length)]);
row.setField(1, StringData.fromString(jiaoyiliushui[random.nextInt(jiaoyiliushui.length)]));
row.setField(2, StringData.fromString(jiaoyima[random.nextInt(jiaoyima.length)]));
row.setField(3, StringData.fromString(yueziduan[random.nextInt(yueziduan.length)]));
row.setField(4, StringData.fromString(LocalDateTime.now().format(DateTimeFormatter.ofPattern("yyyyMMdd"))));
row.setField(5, TimestampData.fromLocalDateTime(LocalDateTime.now()));
ctx.collect(row);
Thread.sleep(1000); // sleep between records instead of busy-waiting on the clock
}
}
@Override
public void cancel() {
running = false;
}
}
dim table
Simulates the loan/repayment balance table data.
public class ScreenDataGenFactory implements DynamicTableSourceFactory {
@Override
public DynamicTableSource createDynamicTableSource(DynamicTableFactory.Context context) {
return new ScreenDataTableSource();
}
@Override
public String factoryIdentifier() {
return "screen";
}
@Override
public Set<ConfigOption<?>> requiredOptions() {
return new HashSet<>();
}
@Override
public Set<ConfigOption<?>> optionalOptions() {
return new HashSet<>();
}
}
public class ScreenDataTableSource implements LookupTableSource,ScanTableSource {
@Override
public DynamicTableSource copy() {
return new ScreenDataTableSource();
}
@Override
public String asSummaryString() {
return this.getClass().getSimpleName();
}
@Override
public LookupRuntimeProvider getLookupRuntimeProvider(LookupContext context) {
return TableFunctionProvider.of(new ScreenDataSourceFunction());
}
@Override
public ChangelogMode getChangelogMode() {
return ChangelogMode.insertOnly();
}
@Override
public ScanRuntimeProvider getScanRuntimeProvider(ScanContext runtimeProviderContext) {
// this table is only ever used as a lookup (dimension) table in the demo, so no scan provider is supplied
return null;
}
}
public class ScreenDataSourceFunction extends TableFunction<RowData> {
private transient Map<Long,GenericRowData> cache;
@Override
public void open(FunctionContext context) throws Exception {
super.open(context);
/**
 * ACTUALDAY BIGINT, -- actual date
 * LENDINGVALUE DOUBLE, -- disbursed amount
 * LOANVALUE DOUBLE, -- repaid amount
 * BALANCEVALUE DOUBLE, -- balance
 */
cache = new HashMap<>();
Long yyyyMMddYesterday = Long.valueOf(LocalDateTime.now().minusDays(1L).format(DateTimeFormatter.ofPattern("yyyyMMdd")));
Long yyyyMMddToday = Long.valueOf(LocalDateTime.now().format(DateTimeFormatter.ofPattern("yyyyMMdd")));
GenericRowData yesterday = new GenericRowData(4);
yesterday.setField(0, yyyyMMddYesterday);
yesterday.setField(1, 0.0d);
yesterday.setField(2, 0.0d);
yesterday.setField(3, 10000.0d);
GenericRowData today = new GenericRowData(4);
today.setField(0, yyyyMMddToday);
today.setField(1, 0.0d);
today.setField(2, 0.0d);
today.setField(3, 0.0d);
cache.put(yyyyMMddYesterday,yesterday);
cache.put(yyyyMMddToday,today);
}
/**
* This is a lookup method which is called by Flink framework in runtime.
*
* @param keys lookup keys
*/
public void eval(Object... keys) {
for (Object key : keys) {
if (null != key) {
GenericRowData row = cache.get(Long.valueOf(String.valueOf(key)));
if (null != row) { // no dimension row for this key: emit nothing instead of collecting null
collect(row);
}
}
}
}
}
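At runtime the planner calls eval(...) once per probe-side row with the join key, and each collect(...) emits a matching dimension row. The same keyed-lookup pattern as a hypothetical standalone sketch (the dates and balances mirror the demo's preset values):

```java
import java.util.HashMap;
import java.util.Map;

public class LookupSketch {
    private final Map<Long, Double> balanceByDay = new HashMap<>();

    public LookupSketch() {
        // Same shape as the cache built in ScreenDataSourceFunction.open():
        // yesterday carries the closing balance, today starts at zero.
        balanceByDay.put(20221020L, 10000.0);
        balanceByDay.put(20221021L, 0.0);
    }

    // Stand-in for eval(Object... keys): look up the dimension row by join key;
    // null means no match, and the lookup join then emits nothing for that row.
    public Double lookup(long actualDay) {
        return balanceByDay.get(actualDay);
    }

    public static void main(String[] args) {
        LookupSketch dim = new LookupSketch();
        System.out.println(dim.lookup(20221020L)); // 10000.0
        System.out.println(dim.lookup(20221019L)); // null: no dimension row for that day
    }
}
```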
print table
This uses the print connector that ships with Flink, so it is not covered here.
Maven dependencies
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>FlinkTableApiTest</artifactId>
<groupId>org.example</groupId>
<version>1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>cumulate</artifactId>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-statebackend-rocksdb</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-runtime</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-runtime-web</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-api-java</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-base</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner_2.12</artifactId>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-state-processor-api</artifactId>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-core</artifactId>
</dependency>
</dependencies>
</project>