java.lang.InstantiationException when deserializing an Avro byte stream into a Scala case class object

I am trying to deserialize an Avro byte stream into a Scala case class object. Basically, I had a Kafka stream with Avro-encoded data flowing through it; now there is an addition to the schema, and I am trying to update the Scala case class to include the new field. The case class looks like this:

/** Case class to hold the Device data. */
case class DeviceData(deviceId: String,
                      sw_version: String,
                      timestamp: String,
                      reading: Double,
                      new_field: Option[String] = None) {
  def this() = this("na", "na", "na", 0, None)
}
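For context, Avro's reflection-based readers construct the target class through its no-argument constructor, which is what the auxiliary `def this()` above provides. A minimal sketch of that requirement in isolation (hypothetical classes `NoDefault` and `WithDefault`, plain JVM reflection, nothing Avro-specific):

```scala
// A class with only a parameterized primary constructor has no nullary
// constructor, so Class.newInstance() throws InstantiationException.
case class NoDefault(id: String)

// Adding a zero-arg auxiliary constructor (as DeviceData does) fixes that.
case class WithDefault(id: String) {
  def this() = this("na")
}

val failed =
  try { classOf[NoDefault].newInstance(); false }
  catch { case _: InstantiationException => true }

println(s"NoDefault failed: $failed")       // NoDefault failed: true
println(classOf[WithDefault].newInstance()) // WithDefault(na)
```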

The avro schema is as follows:

{
  "type": "record",
  "name": "some_name",
  "namespace": "some_namespace",
  "fields": [
    { "name": "deviceId", "type": "string" },
    { "name": "sw_version", "type": "string" },
    { "name": "timestamp", "type": "string" },
    { "name": "reading", "type": "double" },
    { "name": "new_field", "type": ["null", "string"], "default": null }
  ]
}

When the data is received, I get the following exception:

java.lang.RuntimeException: java.lang.InstantiationException

I can receive the data just fine with a consumer written in Python, so I know that the data is being streamed correctly in the correct format.

I suspect the problem is with the creation of the case class constructor. I have tried doing this:

/** Case class to hold the Device data. */
case class DeviceData(deviceId: String,
                      sw_version: String,
                      timestamp: String,
                      reading: Double,
                      new_field: Option[String]) {
  def this() = this("na", "na", "na", 0, Some("na"))
}

but no luck.
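A possible workaround sketch (an assumption, not what solved it for the poster): since the trouble seems tied to the `Option[String]` field, model the nullable Avro column as a plain `String` that may be null and expose an `Option` view on top of it. `DeviceDataFlat` and `newFieldOpt` are illustrative names, not from the original code:

```scala
// Hedged workaround: keep the stored field a plain JVM String (nullable),
// which reflection-based tooling generally handles, and derive the Option.
case class DeviceDataFlat(deviceId: String,
                          sw_version: String,
                          timestamp: String,
                          reading: Double,
                          new_field: String = null) {
  def this() = this("na", "na", "na", 0.0, null)
  // Option(null) == None, Option("x") == Some("x")
  def newFieldOpt: Option[String] = Option(new_field)
}

println(new DeviceDataFlat().newFieldOpt)                        // None
println(DeviceDataFlat("d1", "1.0", "t0", 1.5, "x").newFieldOpt) // Some(x)
```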

The deserializer code is (excerpts):

// reader and decoder for reading avro records
private var reader: DatumReader[T] = null
private var decoder: BinaryDecoder = null

decoder = DecoderFactory.get.binaryDecoder(message, decoder)
reader.read(null.asInstanceOf[T], decoder)

I could not find any other examples of case classes with auxiliary constructors being used to deserialize Avro. I had posted a related question last year, "java.lang.NoSuchMethodException for init method in Scala case class", and based on the response I was able to implement my current code, which had been working fine ever since.

Solution

I resolved this problem by following a totally different approach. I used the Confluent Kafka client as provided in this example: https://github.com/jfrazee/schema-registry-examples/tree/master/src/main/scala/io/atomicfinch/examples/flink. I also have a Confluent Schema Registry, which is really easy to set up using the containerized all-in-one solution that ships Kafka together with a schema registry: https://docs.confluent.io/current/quickstart/ce-docker-quickstart.html.

I had to add the Confluent dependencies and repository to my pom.xml file. This goes in the repositories section:

<repository>
  <id>confluent</id>
  <url>http://packages.confluent.io/maven/</url>
</repository>

This goes in the dependencies section:

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-avro-confluent-registry</artifactId>
  <version>1.8.0</version>
</dependency>
<dependency>
  <groupId>io.confluent</groupId>
  <artifactId>kafka-avro-serializer</artifactId>
  <version>5.2.1</version>
</dependency>

With the code provided in https://github.com/jfrazee/schema-registry-examples/blob/master/src/main/scala/io/atomicfinch/examples/flink/ConfluentRegistryDeserializationSchema.scala I was able to talk to the Confluent schema registry. Then, based on the schema id in the Avro message header, it downloads the schema from the registry and gives me back a GenericRecord object, from which I can easily extract any and all fields of interest and create a new DataStream of DeviceData objects.

val kafka_consumer = new FlinkKafkaConsumer010("prod.perfwarden.minute",
  new ConfluentRegistryDeserializationSchema[GenericRecord](classOf[GenericRecord], "http://localhost:8081"),
  properties)

val device_data_stream = env
  .addSource(kafka_consumer)
  .map({ x => new DeviceData(x.get("deviceId").toString,
                             x.get("sw_version").toString,
                             x.get("timestamp").toString,
                             x.get("reading").toString.toDouble,
                             x.get("new_field").toString) })

The Confluent Kafka client takes care of deserializing the Avro byte stream as per the schema, including the default values. Setting up the schema registry and using the Confluent Kafka client may take a little time to get used to, but it is probably the better long-term solution. Just my 2 cents.
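One caveat worth noting about the map step above: `GenericRecord.get` returns null when an optional field is absent, so calling `.toString` directly on `x.get("new_field")` would throw a NullPointerException on records written before the schema change. A minimal null-safe sketch (a plain `Map` stands in for the `GenericRecord` here, and `extractNewField` is a hypothetical helper, not part of the original code):

```scala
// Wrap the possibly-null lookup result in Option before converting, so an
// absent optional field becomes None instead of an NPE.
def extractNewField(get: String => AnyRef): Option[String] =
  Option(get("new_field")).map(_.toString)

val withField: Map[String, AnyRef] = Map("new_field" -> "fw-2.0")
val withoutField: Map[String, AnyRef] = Map.empty

println(extractNewField(k => withField.getOrElse(k, null)))    // Some(fw-2.0)
println(extractNewField(k => withoutField.getOrElse(k, null))) // None
```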
