Flink CDC to Kafka, Compute, then Write Out to HBase: a Working Example

for your wish

Last modified 2022-08-04 18:42:25


Article tags: flink

First published 2022-03-25 18:15:28

Article link: https://blog.csdn.net/someinneed/article/details/123742362


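Design approach:

Fact tables go through Kafka, and their records drive the data flow. Dimension tables change slowly and stay in HBase. Joining the two sides produces the result.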
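The join described above can be sketched in plain Java, without the Flink API, to show its shape: each fact record triggers one point lookup by key against the dimension store. This is only an illustration under stated assumptions. The `Order` and `EnrichedOrder` types, the `productId` key, and the `Map` standing in for an HBase dimension table are all hypothetical and do not come from the original job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class LookupJoinSketch {
    /** Hypothetical fact record, standing in for one message consumed from Kafka. */
    static final class Order {
        final String orderId;
        final String productId;
        final double amount;

        Order(String orderId, String productId, double amount) {
            this.orderId = orderId;
            this.productId = productId;
            this.amount = amount;
        }
    }

    /** Hypothetical joined output: the fact plus one attribute from the dimension table. */
    static final class EnrichedOrder {
        final String orderId;
        final String productName;
        final double amount;

        EnrichedOrder(String orderId, String productName, double amount) {
            this.orderId = orderId;
            this.productName = productName;
            this.amount = amount;
        }
    }

    /**
     * Joins facts against a dimension store. The Map stands in for an HBase
     * table keyed by productId (an assumption made for this sketch).
     */
    static List<EnrichedOrder> join(List<Order> facts, Map<String, String> productDim) {
        List<EnrichedOrder> out = new ArrayList<>();
        for (Order o : facts) {
            String name = productDim.get(o.productId); // point lookup by key
            if (name != null) {                        // inner-join semantics: misses are dropped
                out.add(new EnrichedOrder(o.orderId, name, o.amount));
            }
        }
        return out;
    }
}
```

Only the fact side produces output, so a change to the dimension store on its own emits nothing. That matches the idea that the Kafka side is what triggers the flow.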

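Known problems:

If several fact tables all go through Kafka, and Kafka retains data for only seven days, records that arrive after the retention window can no longer be joined. If instead one fact table is in Kafka and the other is in HBase, the HBase data is still written by a stream, so late arrival remains a problem: a Kafka record may not find its matching fact row in HBase, and without a compensation mechanism this does not work. The only option is to compute a not-quite-accurate result in real time and, once the T+1 batch job has finished, overwrite that inaccurate data with the offline result.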
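A minimal plain-Java sketch of the "provisional now, corrected at T+1" pattern follows. It assumes the real-time result and the offline result share the same key, so the batch output can overwrite the provisional row. The class name, the `UNKNOWN` placeholder, and the `|` separator are hypothetical choices for this illustration, and the `Map` objects stand in for HBase tables.

```java
import java.util.HashMap;
import java.util.Map;

class BackfillSketch {
    // Result table keyed by a unique row key; a HashMap stands in for HBase here.
    private final Map<String, String> resultTable = new HashMap<>();

    /**
     * Real-time path: join the incoming fact against whatever the other fact
     * table holds right now. If the other side has not arrived yet, a
     * provisional row is written instead of waiting.
     */
    void onFact(String key, String payload, Map<String, String> otherFactTable) {
        String other = otherFactTable.get(key); // null when the other side is late
        resultTable.put(key, payload + "|" + (other == null ? "UNKNOWN" : other));
    }

    /** T+1 path: the batch job has both sides complete and overwrites rows by the same key. */
    void backfill(Map<String, String> batchResults) {
        resultTable.putAll(batchResults);
    }

    String get(String key) {
        return resultTable.get(key);
    }
}
```

Until the backfill runs, readers of the result table see the provisional value, which is the "not-quite-accurate" window the paragraph above describes.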

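Thoughts:

It may be better for the Flink job to join a single fact table against slowly changing HBase dimension tables and insert the result into an HBase fact table of this kind, where the rowkey guarantees uniqueness. A join between two high-frequency fact tables is probably a poor fit. Instead, each intermediate result of "one fact table joined with its dimension tables" could be landed in the data lake, and several such intermediate tables could be assembled inside the lake in near real time.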
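The role of the rowkey can be illustrated with a small plain-Java sketch: when the key is derived deterministically from the business key, writing the same record twice leaves one row, so replays after a restart do not create duplicates. The `orderId_lineNo` key layout is a hypothetical example, not the author's schema, and the `TreeMap` only stands in for an HBase table (HBase also keeps rows sorted by rowkey).

```java
import java.util.Map;
import java.util.TreeMap;

class RowKeyUpsertSketch {
    // Sorted map standing in for an HBase table; keys play the role of rowkeys.
    private final Map<String, String> table = new TreeMap<>();

    /** Builds a deterministic row key from the business key (hypothetical layout). */
    static String rowKey(String orderId, int lineNo) {
        return orderId + "_" + lineNo;
    }

    /** A write with an existing row key replaces the row instead of adding a second one. */
    void upsert(String orderId, int lineNo, String value) {
        table.put(rowKey(orderId, lineNo), value);
    }

    String get(String key) {
        return table.get(key);
    }

    int size() {
        return table.size();
    }
}
```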

POM file

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not u
