Consuming Kafka with Flink: A Custom KafkaDeserializationSchema


I. Add the dependencies to the pom file:

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-scala_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-wikiedits_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-kafka-0.10_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.11</artifactId>
            <version>0.10.2.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>log4j</groupId>
                    <artifactId>log4j</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-simple</artifactId>
            <version>1.7.25</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>
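
The pom above references a `${flink.version}` property that must be defined in the `<properties>` section. A minimal sketch; the exact version is an assumption and just needs to be a release that still ships the `flink-connector-kafka-0.10_2.11` artifact (e.g. the 1.9.x line):

    <properties>
        <flink.version>1.9.1</flink.version>
    </properties>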

II. The code

import java.util.Properties

import com.lzw.example.utils.ServiceConf
import com.lzw.example.utils.serialization.RecordKafkaSchema
import org.apache.flink.api.scala._
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.flink.api.java.tuple.Tuple2

object KafkaCount {
  def main(args: Array[String]): Unit = {
    // A local environment is convenient for testing; use getExecutionEnvironment in production
    val env = StreamExecutionEnvironment.createLocalEnvironment()
    // Load the Kafka client configuration from the classpath (contents shown below)
    val kafka: Properties = ServiceConf.getRes("/kafka.properties")
    // val schema = new TypeInformationKeyValueSerializationSchema(classOf[String], classOf[String], env.getConfig)
    // val schema = new SimpleStringSchema()
    // val schema = new CustomKafkaSchema
    val schema = new RecordKafkaSchema
    val kafkaConsumer = new FlinkKafkaConsumer010[ConsumerRecord[String, String]]("sgz_bi_player3_2", schema, kafka)
    kafkaConsumer.setStartFromEarliest() // consume the topic from the beginning
    val value = env.addSource(kafkaConsumer)
    value.print()
    env.execute()
  }
}
Contents of kafka.properties:
bootstrap.servers=bigdata151:9092,bigdata152:9092,bigdata153:9092,bigdata127:9092,bigdata128:9092,bigdata129:9092
#zookeeper.connect=bigdata153:2181,bigdata152:2181,bigdata151:2181
group.id=flink
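
ServiceConf is a small project-specific helper that the original post does not show. A minimal sketch of what it might look like (the object and method names come from the code above; the implementation here is an assumption):

import java.util.Properties

object ServiceConf {
  // Hypothetical implementation: load a .properties file from the classpath
  def getRes(path: String): Properties = {
    val props = new Properties()
    val in = getClass.getResourceAsStream(path)
    try props.load(in) finally in.close()
    props
  }
}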

The problem with using Flink's predefined deserialization schemas:

1. SimpleStringSchema: the result contains only the Kafka value and no other information:

    val schema =new SimpleStringSchema()
    val kafkaConsumer = new FlinkKafkaConsumer010[String]("sgz_bi_player3_2", schema, kafka)

2. TypeInformationKeyValueSerializationSchema: the result contains only the Kafka key and value, still without topic, partition, offset, etc.:

    val schema = new TypeInformationKeyValueSerializationSchema(classOf[String], classOf[String], env.getConfig)
    val kafkaConsumer = new FlinkKafkaConsumer010[Tuple2[String,String]]("sgz_bi_player3_2", schema, kafka)
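
The elements of this stream are Flink Java Tuple2 values, so the key and value are read through f0 and f1. A minimal sketch of consuming them (not from the original post):

    env.addSource(kafkaConsumer)
      .map(t => s"key=${t.f0}, value=${t.f1}")
      .print()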

3. We often need the Kafka topic or other metadata. That requires implementing the KafkaDeserializationSchema interface to customize the structure of the returned data:

    val schema = new RecordKafkaSchema
    val kafkaConsumer = new FlinkKafkaConsumer010[ConsumerRecord[String, String]]("sgz_bi_player3_2", schema, kafka)

RecordKafkaSchema.scala:


import java.nio.charset.StandardCharsets

import org.apache.flink.api.common.typeinfo.{TypeHint, TypeInformation}
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema
import org.apache.kafka.clients.consumer.ConsumerRecord

/**
 * @Author LZW
 * @Date 2020/1/17 16:30
 **/
class RecordKafkaSchema extends KafkaDeserializationSchema[ConsumerRecord[String, String]] {

  // The stream never ends on its own; Flink keeps consuming
  override def isEndOfStream(nextElement: ConsumerRecord[String, String]): Boolean = false

  // Convert the raw byte-array record into a String-typed ConsumerRecord,
  // preserving topic, partition, offset, timestamp and the other metadata
  override def deserialize(record: ConsumerRecord[Array[Byte], Array[Byte]]): ConsumerRecord[String, String] = {
    var key: String = null
    var value: String = null
    if (record.key != null) {
      key = new String(record.key(), StandardCharsets.UTF_8)
    }
    if (record.value != null) {
      value = new String(record.value(), StandardCharsets.UTF_8)
    }
    new ConsumerRecord[String, String](
      record.topic(),
      record.partition(),
      record.offset(),
      record.timestamp(),
      record.timestampType(),
      record.checksum(),
      record.serializedKeySize(),
      record.serializedValueSize(),
      key,
      value)
  }

  // Tell Flink the type produced by this schema so it can pick a serializer
  override def getProducedType: TypeInformation[ConsumerRecord[String, String]] =
    TypeInformation.of(new TypeHint[ConsumerRecord[String, String]] {})
}
