The Kafka source can act either as a bounded data source or as an unbounded one.
Example code
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaBatchDemo {

    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        // Run in batch mode
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);
        // Bound the consumption range by a starting and an ending timestamp
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("input-topic")
                .setGroupId("my-group")
                .setStartingOffsets(OffsetsInitializer.timestamp(1657038028000L))
                .setBounded(OffsetsInitializer.timestamp(1657120828000L))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source").print();
        try {
            env.execute("batch-kafka-test");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
How it works
Converting the timestamp into an offset
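Kafka itself performs this conversion: KafkaConsumer#offsetsForTimes returns, for each partition, the earliest offset whose record timestamp is at or after the requested timestamp, or null when no such record exists. Below is a minimal pure-Java sketch of that lookup semantics; the in-memory "partition log" map and the helper method are illustrative, not a Kafka API.

```java
import java.util.Map;
import java.util.TreeMap;

public class TimestampToOffset {

    // Mirror the semantics of KafkaConsumer#offsetsForTimes for one partition:
    // given the partition "log" (offset -> record timestamp, offsets ascending),
    // return the earliest offset whose timestamp is >= target, or null if none.
    static Long offsetForTimestamp(TreeMap<Long, Long> log, long targetTs) {
        for (Map.Entry<Long, Long> e : log.entrySet()) {
            if (e.getValue() >= targetTs) {
                return e.getKey();
            }
        }
        return null; // no record at or after targetTs
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> log = new TreeMap<>();
        log.put(0L, 1000L); // offset 0 was produced at timestamp 1000
        log.put(1L, 2000L);
        log.put(2L, 3000L);

        System.out.println(offsetForTimestamp(log, 1500L)); // 1
        System.out.println(offsetForTimestamp(log, 9999L)); // null
    }
}
```

The timestamp-based OffsetsInitializer builds on this lookup for both ends of the range.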
The stopping offsets are assigned in the KafkaPartitionSplitReader class:
    private void acquireAndSetStoppingOffsets(
            List<TopicPartition> partitionsStoppingAtLatest,
            Set<TopicPartition> partitionsStoppingAtCommitted) {
        // Resolve the end offsets for partitions that stop at the latest offset
        Map<TopicPartition, Long> endOffset = consumer.endOffsets(partitionsStoppingAtLatest);
        stoppingOffsets.putAll(endOffset);
        if (!partitionsStoppingAtCommitted.isEmpty()) {
            consumer.committed(partitionsStoppingAtCommitted)
                    .forEach(
                            (tp, offsetAndMetadata) -> {
                                Preconditions.checkNotNull(
                                        offsetAndMetadata,
                                        String.format(
                                                "Partition %s should stop at committed offset. "
                                                        + "But there is no committed offset of this partition for group %s",
                                                tp, groupId));
                                stoppingOffsets.put(tp, offsetAndMetadata.offset());
                            });
        }
    }
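Once stoppingOffsets contains an entry for a partition, the reader only needs to compare each fetched record's offset against it and mark the split as finished when the position reaches the stopping offset. A toy simulation of that stopping check, using plain Java in place of the Flink and Kafka classes:

```java
import java.util.List;

public class StoppingOffsetDemo {

    // Consume records (represented by their offsets) until the position reaches
    // the stopping offset; records at or beyond it are not emitted, and the
    // split is then considered finished.
    static int consumeUntil(List<Long> offsets, long stoppingOffset) {
        int emitted = 0;
        for (long offset : offsets) {
            if (offset >= stoppingOffset) {
                break; // split finished: stopping offset reached
            }
            emitted++;
        }
        return emitted;
    }

    public static void main(String[] args) {
        // stopping offset 3 -> offsets 0, 1, 2 are emitted, then the split ends
        System.out.println(consumeUntil(List.of(0L, 1L, 2L, 3L, 4L), 3L)); // 3
    }
}
```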
Summary
If the job never finishes, one likely cause is that the requested time range contains no data: no end offset is produced, and the source effectively turns into an unbounded one.
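This observation can be sketched as a small simulation: when no record in the partition has a timestamp at or after the requested end timestamp, the lookup yields no stopping offset, and the reader keeps waiting for new data instead of finishing. The helper below is illustrative only, not the connector's actual code:

```java
import java.util.Map;
import java.util.TreeMap;

public class UnboundedPitfallDemo {

    // Resolve a stopping offset for an end timestamp: the earliest offset whose
    // record timestamp is >= endTs, or null when the range contains no such data.
    static Long resolveStoppingOffset(TreeMap<Long, Long> log, long endTs) {
        for (Map.Entry<Long, Long> e : log.entrySet()) {
            if (e.getValue() >= endTs) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> log = new TreeMap<>();
        log.put(0L, 1000L);
        log.put(1L, 2000L);
        log.put(2L, 3000L); // latest record was produced at timestamp 3000

        Long stop = resolveStoppingOffset(log, 9999L); // end ts is past all data
        if (stop == null) {
            // no stopping offset -> the source behaves as if it were unbounded
            System.out.println("unbounded");
        } else {
            System.out.println("bounded at " + stop);
        }
    }
}
```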