Getting the Kafka key, value, timestamp, and headers in Flink

This post shows how to read the Kafka message timestamp when consuming data with Flink, and how to compare it against the arrival time to measure Flink's processing latency. The example code implements a custom KafkaDeserializationSchema that parses the full ConsumerRecord and extracts the timestamp along with the other fields.

When consuming Kafka data in Flink, we habitually add a FlinkKafkaConsumer and never think about it again. But what do we do when the requirements change?

DataStreamSource<String> stream = env.addSource(new FlinkKafkaConsumer<String>(
        "clicks",                  // topic
        new SimpleStringSchema(),  // deserializes only the message value
        properties
));
stream.print("Kafka");

Note that the DataStreamSource here is typed as String, and that String is only the Kafka value.

Now there is a new requirement: we want to read the Kafka record's timestamp and compare it against the time the record enters Flink, to see how much latency Flink adds when processing the data (a sketch of that comparison follows the full example below).

So how do we get this timestamp?

The simplest approach: search for it. I found this post, written in Scala:

flink读取kafka中的数据的所有信息_第一片心意的博客-CSDN博客_flink读取kafka的数据

So I wrote a Java version.


import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.Headers;

import java.text.SimpleDateFormat;
import java.util.*;

public class SourceKafkaConsumerRecordTest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "xxxxxx1:9092");
        properties.setProperty("group.id", "consumer-group");
        properties.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        properties.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        properties.setProperty("auto.offset.reset", "latest");
        // Plain consumer: SimpleStringSchema only surfaces the message value
        FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(
                "ia-label",
                new SimpleStringSchema(),
                properties
        );
        // Full consumer: a custom KafkaDeserializationSchema receives the whole
        // ConsumerRecord, including key, timestamp, and headers
        FlinkKafkaConsumer<MyConsumerRecord> consumer2 = new FlinkKafkaConsumer<>(
                "ia-label",
                new KafkaDeserializationSchema<MyConsumerRecord>() {
                    @Override
                    public boolean isEndOfStream(MyConsumerRecord s) {
                        return false;
                    }
                    @Override
                    public MyConsumerRecord deserialize(ConsumerRecord<byte[], byte[]> consumerRecord) throws Exception {
                        // Copy the record headers into a plain map
                        Headers headers = consumerRecord.headers();
                        HashMap<String, String> headerMap = new HashMap<>();
                        for (Header header : headers) {
                            headerMap.put(header.key(), new String(header.value()));
                        }
                        // Key and value can both be null (e.g. tombstone records), so guard before decoding
                        byte[] key1 = consumerRecord.key();
                        byte[] value1 = consumerRecord.value();
                        String key = key1 == null ? null : new String(key1);
                        String value = value1 == null ? null : new String(value1);
                        // The record timestamp is epoch millis; format it for readability
                        String timeStamp = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(consumerRecord.timestamp());
                        MyConsumerRecord myConsumerRecord = new MyConsumerRecord(key, value, timeStamp, headerMap);
                        System.out.println(myConsumerRecord);
                        return myConsumerRecord;
                    }

                    @Override
                    public TypeInformation<MyConsumerRecord> getProducedType() {
                        return TypeInformation.of(MyConsumerRecord.class);
                    }
                },
                properties
        );
        consumer2.setStartFromEarliest();
        consumer.setStartFromEarliest();
//        DataStreamSource<String> stream = env.addSource(consumer);
//        stream.print("Kafka");
        DataStreamSource<MyConsumerRecord> stream2 = env.addSource(consumer2);
        stream2.print("All");
        env.execute();
    }

    static class MyConsumerRecord {
        String key;
        String value;
        String timeStamp;
        Map<String,String> header;

        public MyConsumerRecord(String key, String value, String timeStamp, Map<String, String> header) {
            this.key = key;
            this.value = value;
            this.timeStamp = timeStamp;
            this.header = header;
        }

        @Override
        public String toString() {
            return "MyConsumerRecord{" +
                    "key='" + key + '\'' +
                    ", value='" + value + '\'' +
                    ", timeStamp='" + timeStamp + '\'' +
                    ", header=" + header +
                    '}';
        }
    }
}

The printed output shows one MyConsumerRecord per message, with its key, value, timestamp, and headers.

OK. The same ConsumerRecord also carries the partition, offset, and other fields; grab whatever you need.
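
For instance, here is a minimal sketch of a deserializer that surfaces only the topic, partition, and offset (the class name PartitionOffsetSchema is made up for illustration):

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Hypothetical example: emits "topic-partition@offset" for every record,
// just to show where these accessors live on ConsumerRecord
public class PartitionOffsetSchema implements KafkaDeserializationSchema<String> {
    @Override
    public boolean isEndOfStream(String s) {
        return false;
    }

    @Override
    public String deserialize(ConsumerRecord<byte[], byte[]> record) {
        return record.topic() + "-" + record.partition() + "@" + record.offset();
    }

    @Override
    public TypeInformation<String> getProducedType() {
        return Types.STRING;
    }
}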
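
And to answer the original latency question: a minimal sketch, assuming the producer and consumer machine clocks are roughly in sync, is to subtract the record timestamp from the wall-clock time at the moment Flink deserializes it (LagSchema is again a made-up name):

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Hypothetical example: emits the approximate delay, in milliseconds, between
// the Kafka record timestamp and the moment Flink deserializes the record
public class LagSchema implements KafkaDeserializationSchema<Long> {
    @Override
    public boolean isEndOfStream(Long lag) {
        return false;
    }

    @Override
    public Long deserialize(ConsumerRecord<byte[], byte[]> record) {
        // record.timestamp() is epoch millis (CreateTime or LogAppendTime,
        // depending on the topic config); clock skew will bias this number
        return System.currentTimeMillis() - record.timestamp();
    }

    @Override
    public TypeInformation<Long> getProducedType() {
        return Types.LONG;
    }
}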
