When consuming Kafka data, calling print() on the InputDStream[ConsumerRecord[String, String]] fails because ConsumerRecord is not serializable.
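A minimal sketch of the kind of consumer code that triggers this error, using the spark-streaming-kafka-0-10 direct-stream API. Only the GMALL_ORDER topic name comes from the log below; the broker address and group id are placeholders:

import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object KafkaPrintDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("ssc")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Placeholder connection settings -- adjust to your environment
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "gmall_consumer_group"
    )

    val stream: InputDStream[ConsumerRecord[String, String]] =
      KafkaUtils.createDirectStream[String, String](
        ssc,
        LocationStrategies.PreferConsistent,
        ConsumerStrategies.Subscribe[String, String](Seq("GMALL_ORDER"), kafkaParams)
      )

    // print() ships the first records of each batch back to the driver,
    // which serializes the ConsumerRecord objects -- ConsumerRecord does
    // not implement java.io.Serializable, hence the exception below
    stream.print()

    ssc.start()
    ssc.awaitTermination()
  }
}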
ERROR - [Executor task launch worker for task 0] org.apache.spark.executor.Executor (line: 94) : Exception in task 0.0 in stage 0.0 (TID 0)
java.io.NotSerializableException: org.apache.kafka.clients.consumer.ConsumerRecord
Serialization stack:
- object not serializable (class: org.apache.kafka.clients.consumer.ConsumerRecord, value: ConsumerRecord(topic = GMALL_ORDER, partition = 0, leaderEpoch = 35, offset = 77, CreateTime = 1638873249062, serialized key size = -1, serialized value size = 419, headers = RecordHeaders(headers = [], isReadOnly = false), key = null, value = {"payment_way":"1","delivery_address":"MYJyqgMqsSipZafAmHBI","consignee":"xYCNkL","create_time":"2021-12-07 18:53:54","order_comment":"MLwWujKXIzzVhXrSoRsn","expire_time":"","order_status":"2","out_trade_no":"2440957357","tracking_no":"","total_amount":"934.0","user_id":"5","img_url":"","province_id":"1","consignee_tel":"13977376982","trade_body":"","id":"1","parent_order_id":"","operate_time":"2021-12-07 19:39:34"}))
- element of array (index: 0)
- array (class [Lorg.apache.kafka.clients.consumer.ConsumerRecord;, size 3)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:41)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:101)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:489)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 in stage 0.0 (TID 0) had a not serializable result: org.apache.kafka.clients.consumer.ConsumerRecord
Serialization stack:
- object not serializable (class: org.apache.kafka.clients.consumer.ConsumerRecord, value: ConsumerRecord(topic = GMALL_ORDER, partition = 1, leaderEpoch = 24, offset = 75, CreateTime = 1638874873654, serialized key size = -1, serialized value size = 418, headers = RecordHeaders(headers = [], isReadOnly = false), key = null, value = {"payment_way":"2","delivery_address":"xMpqKDLTjskWeFQmMYIt","consignee":"VcynUI","create_time":"2021-12-07 21:25:43","order_comment":"VyDvvRUYkeSYoBQAeUlw","expire_time":"","order_status":"2","out_trade_no":"4111659192","tracking_no":"","total_amount":"76.0","user_id":"4","img_url":"","province_id":"3","consignee_tel":"13637737415","trade_body":"","id":"2","parent_order_id":"","operate_time":"2021-12-07 22:15:47"}))
- element of array (index: 0)
- array (class [Lorg.apache.kafka.clients.consumer.ConsumerRecord;, size 3)
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2023)
Solution:
ConsumerRecord does not implement java.io.Serializable, so the default Java serializer cannot ship it from the executors back to the driver. Switching to Kryo, which does not require Serializable, fixes this: set the serializer property when creating the SparkConf:
set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("ssc")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
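An alternative that sidesteps the problem entirely is to extract the record value on the executors before anything is collected to the driver, so only plain Strings ever need to be serialized; a minimal sketch, assuming the stream variable from the consumer sketch above:

// Map each ConsumerRecord to its String value on the executors;
// ConsumerRecord itself then never has to be serialized
stream.map(record => record.value()).print()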