Our project uses Flink SQL for real-time statistics and needs ROW_NUMBER(). I checked the official docs: supported. Great. In practice, however, it did not work out of the box.
First, the job setup: the job uses processing time (TimeCharacteristic.ProcessingTime) and a tumbling window, aggregating each day's data from 00:00:00 to 23:59:59.
Table setup
// Prepare the environment
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.getConfig().setGlobalJobParameters(param);
env.enableCheckpointing(5 * 60 * 1000);
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(2000);
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
env.getCheckpointConfig().setCheckpointTimeout(60000);
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);
// Tolerate this many checkpoint failures before failing the job
env.getCheckpointConfig().setTolerableCheckpointFailureNumber(5);
// Configure the state backend (RocksDB with incremental checkpoints)
RocksDBStateBackend dbStateBackend = new RocksDBStateBackend("file:///zxj/flink_checkpoint/state", true);
env.setStateBackend(dbStateBackend);
// Create the TableEnvironment
StreamTableEnvironment bsTableEnv = StreamTableEnvironment.create(env);
bsTableEnv.createTemporaryView("test_crm", testDataStream
,$("companyId")
,$("positionId")
,$("orgId")
,$("orgPath")
,$("isLogin")
,$("isVisit")
,$("isLoginVisit")
,$("isPrivate")
,$("isFamily")
,$("rowTime")
,$("dateKey")
,$("currentDate")
,$("proctime").proctime()
);
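For reference, here is a minimal sketch of the event type that testDataStream could carry behind this view. The field names come from the view definition above; the types are my assumptions, not confirmed by the original post:

```java
// Hypothetical event POJO backing testDataStream.
// Field names match the view above; all types are assumptions.
public class CrmEvent {
    public long companyId;
    public long positionId;
    public long orgId;
    public String orgPath;
    public int isLogin;
    public int isVisit;
    public int isLoginVisit;
    public int isPrivate;
    public int isFamily;
    public long rowTime;       // event timestamp, assumed epoch millis
    public String dateKey;     // yyyyMMdd key of the event, e.g. "20200428"
    public String currentDate; // yyyyMMdd of the current day (assumption)
}
```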
// Prepare the SQL (the statement is listed below)
String bdSql = "";
Table bdResult = bsTableEnv.sqlQuery(bdSql);
DataStream<Tuple2<Boolean, Row>> bdResultStream = bsTableEnv.toRetractStream(bdResult, Row.class);
bdResultStream.print();
The Flink SQL:
select
positionId
, count(distinct companyId) filter(where (isFamily = 0 and isLogin = 1)) as login_company_num
from (
select
*
,ROW_NUMBER() OVER(PARTITION BY companyId order by proctime desc) as rownum
from test_crm
) a
where rownum = 1
group by positionId, tumble(proctime, interval '1' day)
The error:
org.apache.flink.table.api.TableException: GroupWindowAggregate doesn't support consuming update and delete changes which is produced by node Rank(strategy=[UndefinedStrategy], rankType=[ROW_NUMBER], rankRange=[rankStart=1, rankEnd=1], partitionBy=[companyId], orderBy=[rowTime DESC], select=[companyId, positionId, isLogin, isFamily, rowTime])
at org.apache.flink.table.planner.plan.optimize.program.FlinkChangelogModeInferenceProgram$SatisfyModifyKindSetTraitVisitor.createNewNode(FlinkChangelogModeInferenceProgram.scala:355)
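The root cause: a top-1 ROW_NUMBER ordered by time descending produces an updating changelog. Every new event for an existing companyId retracts the previous top-1 row and inserts a new one, and GroupWindowAggregate cannot consume those retractions. A plain-Java simulation of that changelog (just an illustration, not Flink code):

```java
import java.util.*;

public class RankUpdates {
    // Simulates why a "latest row per companyId" Rank node emits updates:
    // each new event for an existing key retracts the old top-1 row
    // and inserts the new one. Events are {companyId, value} pairs.
    public static List<String> changelog(List<long[]> events) {
        Map<Long, long[]> top1 = new HashMap<>();
        List<String> log = new ArrayList<>();
        for (long[] e : events) {
            long[] old = top1.put(e[0], e);
            if (old != null) log.add("-D " + Arrays.toString(old)); // retract previous top-1
            log.add("+I " + Arrays.toString(e));                    // insert new top-1
        }
        return log;
    }

    public static void main(String[] args) {
        // Two events for companyId 1: the second one retracts the first.
        changelog(Arrays.asList(new long[]{1, 10}, new long[]{1, 20}))
                .forEach(System.out::println);
        // prints: +I [1, 10] / -D [1, 10] / +I [1, 20]
    }
}
```

A downstream operator that only understands inserts (like the group window aggregate here) chokes on the -D rows, which is exactly what the exception says.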
Then I started searching.
The clearest explanation I found was this post: https://blog.csdn.net/wangpei1949/article/details/105472473/
The only difference between my job and that author's is the time characteristic, which should not affect ROW_NUMBER. At that point I still believed I was simply holding ROW_NUMBER wrong, until I gave up on ROW_NUMBER, rewrote the SQL another way, and hit the exact same error.
Then it became clear: the subquery uses an OVER window and the outer GROUP BY uses a group window. Dropping the group window made the job run normally.
Changes:
- Stop using the group window: remove , tumble(rowTime, interval '1' day) from the GROUP BY.
- Limit the time range with dateKey instead: each record's c_t is converted to a yyyyMMdd value such as 20200428, and the query simply filters on dateKey = currentDate.
- Set a maximum idle state retention time, otherwise state keeps growing and the job will eventually die. (Whether these retention values are reasonable still needs verification; background reading: https://zhuanlan.zhihu.com/p/98788196)
bsTableEnv.getConfig().getConfiguration().addAll(tConfig);
bsTableEnv.getConfig().setIdleStateRetentionTime(Time.hours(24), Time.hours(25));
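For the dateKey filter to work, each event needs a yyyyMMdd key derived from its timestamp. A minimal sketch of that conversion, assuming the timestamp is epoch milliseconds and the job runs in the Asia/Shanghai time zone (both assumptions; the original post does not specify them):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class DateKey {
    // Convert an epoch-millis timestamp (the hypothetical c_t field)
    // to a yyyyMMdd key like "20200428". Time zone is an assumption.
    static String toDateKey(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneId.of("Asia/Shanghai"))
                .toLocalDate()
                .format(DateTimeFormatter.ofPattern("yyyyMMdd"));
    }

    public static void main(String[] args) {
        // 2020-04-28 14:00 in Asia/Shanghai
        System.out.println(toDateKey(1588053600000L)); // prints 20200428
    }
}
```

Computing currentDate the same way for "now" lets the SQL keep only today's records with a plain dateKey = currentDate predicate instead of a window.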
The modified SQL:
select
positionId
, count(distinct companyId) filter(where (isFamily = 0 and isLogin = 1)) as login_company_num
from (
select
*
,ROW_NUMBER() OVER(PARTITION BY companyId order by proctime desc) as rownum
from test_crm
where dateKey = currentDate
) a
where rownum = 1
group by positionId
Local testing passes; the test and production environments are still to be verified.
If anything here is off, corrections from more experienced readers are welcome. Thanks!