State Programming in Flink

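This article walks through state programming in Flink. It covers the keyed state types (value state, list state, map state, reducing state, and aggregating state), giving a definition and a usage example for each. It also discusses broadcast state and how it is applied in real-time data warehouses.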


1. Keyed State

1.1 Value State (ValueState)

1.1.1 Definition

A value state stores exactly one value per key. ValueState itself is an interface, defined in the Flink source as follows:

public interface ValueState<T> extends State {
T value() throws IOException;
void update(T value) throws IOException;
}
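
To make this contract concrete, here is a minimal plain-Java sketch of how a keyed value state behaves: `value()` returns null until the first `update()`, and every key sees its own value. The class `InMemoryKeyedValueState`, its `setCurrentKey` method, and the helper `countPv` are hypothetical names made up for illustration and are not part of Flink. In a real job Flink selects the current key for you, and the state handle comes from `getRuntimeContext().getState(...)`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a keyed ValueState<T>; NOT Flink's implementation.
class InMemoryKeyedValueState<T> {
    private final Map<String, T> store = new HashMap<>();
    private String currentKey;

    // Flink sets the current key implicitly before each element is processed;
    // this sketch has to do it by hand.
    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    // Returns null when nothing has been stored for the current key yet.
    T value() {
        return store.get(currentKey);
    }

    // Overwrites the single value held for the current key.
    void update(T value) {
        store.put(currentKey, value);
    }

    // Removes the value for the current key only (clear() is inherited from State in Flink).
    void clear() {
        store.remove(currentKey);
    }
}

class ValueStateSketch {
    // Counts one page view for the given user, using a null check for the first visit.
    static long countPv(InMemoryKeyedValueState<Long> state, String user) {
        state.setCurrentKey(user);
        Long count = state.value();
        long next = (count == null) ? 1L : count + 1;
        state.update(next);
        return next;
    }

    public static void main(String[] args) {
        InMemoryKeyedValueState<Long> pv = new InMemoryKeyedValueState<>();
        countPv(pv, "Alice");
        countPv(pv, "Bob");
        System.out.println("Alice pv = " + countPv(pv, "Alice")); // Alice pv = 2
        System.out.println("Bob pv = " + countPv(pv, "Bob"));     // Bob pv = 2
    }
}
```

The null check on the first access is the part that carries over directly to real ValueState code: a key that has never been updated has no stored value.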

1.1.2 Usage Example

Use ValueState together with an event-time timer to emit each user's page view (PV) count roughly every 10 seconds.

package com.hpsk.flink.state;

import com.hpsk.flink.beans.Event;
import com.hpsk.flink.source.EventWithWatermarkSource;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import java.sql.Timestamp;

public class ValueStateDS {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        // Event (with fields user and timestamp) and EventWithWatermarkSource
        // are the author's own classes and are not shown in this article.
        DataStreamSource<Event> stream = env.addSource(new EventWithWatermarkSource());

        SingleOutputStreamOperator<String> result = stream.keyBy(t -> t.user)
                .process(new KeyedProcessFunction<String, Event, String>() {
                    // Two states per key: the current PV count and the timestamp of the pending timer
                    private ValueState<Long> valueState;
                    private ValueState<Long> timerTsState;

                    @Override
                    public void open(Configuration parameters) throws Exception {
                        valueState = getRuntimeContext().getState(new ValueStateDescriptor<Long>("value-state", Long.class));
                        timerTsState = getRuntimeContext().getState(new ValueStateDescriptor<Long>("timerTs", Long.class));
                    }

                    @Override
                    public void processElement(Event value, Context ctx, Collector<String> collector) throws Exception {
                        // The state is null the first time a key is seen
                        Long count = valueState.value();
                        if (count == null) {
                            valueState.update(1L);
                        } else {
                            valueState.update(count + 1);
                        }

                        // Register a timer 10 seconds ahead, but only if none is pending for this key
                        if (timerTsState.value() == null) {
                            ctx.timerService().registerEventTimeTimer(value.timestamp + 10 * 1000L);
                            timerTsState.update(value.timestamp + 10 * 1000L);
                        }
                    }

                    @Override
                    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
                        out.collect("Time: " + new Timestamp(timestamp) + " -> user: " + ctx.getCurrentKey() + ", pv: " + valueState.value());
                        // Clear the timer state so that the next element registers a new timer
                        timerTsState.clear();
                    }
                });
        result.print(">>>>");

        env.execute();
    }
}

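1.2 List State (ListState)

1.2.1 Definition

A list state organizes the data to be stored as a list. The ListState interface also takes a type parameter T, which is the type of the elements in the list. ListState provides a set of methods for operating on the state, and using it feels much like using an ordinary List.

1.2.2 Usage Example

Use ListState to implement a SQL-style join between two streams.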
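
As a rough illustration of those methods, the plain-Java sketch below mimics the ListState operations `get()`, `add(T)`, `addAll(List<T>)`, `update(List<T>)`, and `clear()`, each scoped to the current key. `InMemoryKeyedListState`, `setCurrentKey`, and `snapshot` are hypothetical names invented for this sketch and are not Flink classes. Returning an empty list for an unseen key is a choice made here for simplicity, not a statement about how Flink's state backends behave.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a keyed ListState<T>; NOT Flink's implementation.
class InMemoryKeyedListState<T> {
    private final Map<String, List<T>> store = new HashMap<>();
    private String currentKey;

    // Flink selects the current key implicitly; this sketch does it by hand.
    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    // Iterates over the elements stored for the current key
    // (this sketch returns an empty list for a key it has not seen).
    Iterable<T> get() {
        List<T> list = store.get(currentKey);
        return list == null ? Collections.<T>emptyList() : list;
    }

    // Appends one element to the current key's list.
    void add(T value) {
        store.computeIfAbsent(currentKey, k -> new ArrayList<>()).add(value);
    }

    // Appends several elements at once.
    void addAll(List<T> values) {
        store.computeIfAbsent(currentKey, k -> new ArrayList<>()).addAll(values);
    }

    // Replaces the whole list for the current key.
    void update(List<T> values) {
        store.put(currentKey, new ArrayList<>(values));
    }

    // Drops everything stored for the current key.
    void clear() {
        store.remove(currentKey);
    }
}

class ListStateSketch {
    // Copies the elements of one key into a plain List for easy inspection.
    static <T> List<T> snapshot(InMemoryKeyedListState<T> state, String key) {
        state.setCurrentKey(key);
        List<T> out = new ArrayList<>();
        for (T t : state.get()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        InMemoryKeyedListState<String> state = new InMemoryKeyedListState<>();
        state.setCurrentKey("a");
        state.add("e1");
        state.addAll(Arrays.asList("e2", "e3"));
        state.setCurrentKey("b");
        state.add("x1");

        System.out.println(snapshot(state, "a")); // [e1, e2, e3]
        System.out.println(snapshot(state, "b")); // [x1]

        state.setCurrentKey("a");
        state.update(Arrays.asList("only"));
        System.out.println(snapshot(state, "a")); // [only]
    }
}
```

Unlike a plain List, the state has no index-based access in this sketch: elements are read by iterating over `get()`, and the list is changed by appending or by replacing it wholesale.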

package com.hpsk.flink.state;

import org.apache.flink.api.common.eventtime.SerializableTimestampAssigner;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

public class ListStateDS {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        SingleOutputStreamOperator<Tuple3<String, String, Long>> stream1 = env.fromElements(
                Tuple3.of("a", "stream-1", 1000L),
                Tuple3.of("b", "stream-1", 2000L)
        ).assignTimestampsAndWatermarks(WatermarkStrategy.<Tuple3<String, String, Long>>forMonotonousTimestamps().withTimestampAssigner(