Fixing org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow in Spark

This error came up while querying data through the Spark SQL Thrift JDBC interface:

Exception in thread "main" java.sql.SQLException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3107 in stage 308.0 failed 4 times, most recent failure: Lost task 3107.3 in stage 308.0 (TID 620318, XXX): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 1572864, required: 3236381
Serialization trace:
values (org.apache.spark.sql.catalyst.expressions.GenericInternalRow). To avoid this, increase spark.kryoserializer.buffer.max value.
        at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:275)
        at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:355)
        at com.peopleyuqing.tool.SparkJDBC.excuteQuery(SparkJDBC.java:64)
        at com.peopleyuqing.main.ContentSubThree.main(ContentSubThree.java:24)
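
For context, the client side looks roughly like the sketch below. This is a minimal reconstruction in Scala, not the actual SparkJDBC class from the trace; the host, credentials, and query are placeholders, and the Thrift server is assumed to be on its default port 10000.

import java.sql.DriverManager

object SparkJdbcQuery {
  def main(args: Array[String]): Unit = {
    // Standard Hive JDBC driver; the Spark Thrift server speaks the HiveServer2 protocol.
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://XXX:10000/default", "user", "")
    try {
      val stmt = conn.createStatement()
      // Executors Kryo-serialize each task's result rows before shipping them
      // back; any result larger than the Kryo buffer triggers the overflow above.
      val rs = stmt.executeQuery("SELECT * FROM some_table")  // placeholder query
      while (rs.next()) {
        // consume rows
      }
    } finally conn.close()
  }
}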

The message says the parameter to adjust is spark.kryoserializer.buffer.max, which in this case needs to be at least 3236381 bytes (about 3.1 MB).
My first attempt was to configure it in spark-defaults.conf:

spark.kryoserializer.buffer.max=64m
spark.kryoserializer.buffer=64k
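
For reference, the relevant spark-defaults.conf entries look like this sketch (illustrative; Kryo must already be the active serializer, which the stack trace confirms it is):

# Kryo must be the serializer in use for the buffer settings to matter.
spark.serializer=org.apache.spark.serializer.KryoSerializer
# Initial per-task buffer; it grows on demand.
spark.kryoserializer.buffer=64k
# Ceiling on buffer growth; must exceed the largest serialized object,
# here 3236381 bytes (~3.1 MB), so 64m should have been ample.
spark.kryoserializer.buffer.max=64m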

The error persisted, and Available even dropped to 0:

Exception in thread "main" java.sql.SQLException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3155 in stage 0.0 failed 4 times, most recent failure: Lost task 3155.3 in stage 0.0 (TID 3317, XXX): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 615328
Serialization trace:
values (org.apache.spark.sql.catalyst.expressions.GenericInternalRow). To avoid this, increase spark.kryoserializer.buffer.max value.
        at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Later I tested in spark-shell:
sc.getConf.get("spark.kryoserializer.buffer.max") returned the configured value, 64m.
So spark-shell does read spark-defaults.conf, yet the Thrift server kept failing, which shows that the Spark SQL Thrift JDBC server's configuration is not picked up from spark-defaults.conf. I therefore changed the approach and added these two launch options:
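
The spark-shell check was essentially the following (a small sketch; getOption is just the null-safe variant of get):

// Inside spark-shell, sc is the predefined SparkContext.
sc.getConf.get("spark.kryoserializer.buffer.max")    // returned "64m" here
sc.getConf.getOption("spark.kryoserializer.buffer")  // Some("64k") if the file was read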

--conf  spark.kryoserializer.buffer.max=256m  --conf spark.kryoserializer.buffer=64m

The full start command:

sbin/start-thriftserver.sh --executor-memory 10g  --driver-memory 12g --total-executor-cores 288 --executor-cores 2 --conf spark.kryoserializer.buffer.max=256m  --conf spark.kryoserializer.buffer=64m 
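
After the restart, the new value can also be confirmed over the same JDBC connection with a SET query. A verification sketch, where stmt is a java.sql.Statement as in the client sketch above (the result layout varies slightly across Spark versions):

val rs = stmt.executeQuery("SET spark.kryoserializer.buffer.max")
while (rs.next()) println(rs.getString(1))  // expect something like "spark.kryoserializer.buffer.max=256m"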

That solved the problem.
