遇到的问题
kafka使用kerberos安全认证后,我这边的消费程序需要修改。原本如果是普通的消费程序,加两行代码就行了:
System.setProperty("java.security.auth.login.config", kafkaJaasPath);
System.setProperty("java.security.krb5.conf", krb5Path);
但是我的程序是用spark streaming框架写的。原本在我的idea上跑local跑的好好的,结果一上到spark standalone集群就崩了,报了下面的错误:
Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:702)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:557)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:540)
at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.<init>(CachedKafkaConsumer.scala:45)
at org.apache.spark.streaming.kafka010.CachedKafkaConsumer$.get(CachedKafkaConsumer.scala:194)
at org.apache.spark.streaming.kafka010.KafkaRDDIterator.<init>(KafkaRDD.scala:252)
at org.apache.spark.streaming.kafka010.KafkaRDD.compute(KafkaRDD.scala:212)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
... 3 more
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Jaas configuration not found
at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:86)
at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:70)
at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:83)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:623)
... 20 more
Caused by: org.apache.kafka.common.KafkaException: Jaas configuration not found
at org.apache.kafka.common.security.kerberos.KerberosLogin.getServiceName(KerberosLogin.java:299)
at org.apache.kafka.common.security.kerberos.KerberosLogin.configure(KerberosLogin.java:103)
at org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:45)
at org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:68)
at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:78)
... 23 more
Caused by: java.io.IOException: Could not find a 'KafkaClient' entry in this configuration.
at org.apache.kafka.common.security.JaasUtils.jaasConfig(JaasUtils.java:50)
at org.apache.kafka.common.security.kerberos.KerberosLogin.getServiceName(KerberosLogin.java:297)
... 27 more
报这个错的原因是:虽然使用System.setProperty设了环境变量,但是程序跑到spark集群上时,是分了driver和executor的。测试时在自己电脑跑的单机模式,所以程序能读到配置文件。但是到了spark集群,消费者是在executor上实例化的,而executor是读取不到driver上设的环境变量的,所以才报了这个错。
解决办法
在main函数里添加下面的代码:
System.setProperty("java.security.auth.login.config", kafkaJaasPath);
System.setProperty("java.security.krb5.conf", krb5Path);
sparkConf.set("spark.executor.extraJavaOptions", "-Djava.security.auth.login.config="+kafkaJaasPath);
spark-submit提交命令加上–file参数知道jaas.conf的全路径
--files /etc/kafka/conf/jaas.conf
参数说明:
以下是spark-submit --help显示的解释
--files FILES Comma-separated list of files to be placed in the working
directory of each executor. File paths of these files
in executors can be accessed via SparkFiles.get(fileName).
------------------补充
运行消费程序之前每台机都需要使用kinit先授权