Notes on Resolving a Spark Kerberos Failure

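A colleague reported that every Spark AM launched by Livy Server was failing. A Spark AM launched by Livy calls enableHiveSupport by default, and it takes the following settings from $LIVY_HOME/conf/livy.conf as spark.yarn.keytab and spark.yarn.principal: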

livy.server.launch.kerberos.keytab

livy.server.launch.kerberos.principal

The report did not include an error message, so I checked the Spark AM log and found the following:

Attempting to login to Kerberos using principal: ...

...

GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

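This shows that spark.yarn.principal and spark.yarn.keytab were both set, but validating the Kerberos ticket failed. Following the error message, I read the corresponding Spark-Hive code in HiveClientImpl.scala: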

 

// Set up kerberos credentials for UserGroupInformation.loginUser within
// current class loader
if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
  val principalName = sparkConf.get("spark.yarn.principal")
  val keytabFileName = sparkConf.get("spark.yarn.keytab")
  if (!new File(keytabFileName).exists()) {
    throw new SparkException(s"Keytab file: ${keytabFileName}" +
      " specified in spark.yarn.keytab does not exist")
  } else {
    logInfo("Attempting to login to Kerberos" +
      s" using principal: ${principalName} and keytab: ${keytabFileName}")
    UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
  }
}

 

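UserGroupInformation.loginUserFromKeytab goes through the method calls shown below. If isSecurityEnabled() returns false, the login is skipped. On a successful login, loginUserFromKeytab also logs "Login successful for user ...", and that line never appeared in the Spark AM log. I therefore suspected that a default Hadoop Configuration object had been loaded: when new Configuration() cannot find the cluster's core-site.xml, hadoop.security.authentication falls back to its default value, simple, and isSecurityEnabled() returns false.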

public static boolean isSecurityEnabled() {
  return !isAuthenticationMethodEnabled(AuthenticationMethod.SIMPLE);
}

private static boolean isAuthenticationMethodEnabled(AuthenticationMethod method) {
  ensureInitialized();
  return (authenticationMethod == method);
}

private static synchronized void ensureInitialized() {
  if (conf == null) {
    initialize(new Configuration(), false);
  }
}

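So I asked the colleague whether any configuration had changed, and we eventually confirmed that someone had moved the directory containing the Hadoop *-site.xml files. I updated HADOOP_CONF_DIR to point at the new location and restarted Livy Server, which resolved the problem.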
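A quick way to confirm this kind of misconfiguration is to check which authentication method the directory behind HADOOP_CONF_DIR actually declares. The following is a minimal sketch, not part of the original investigation: the class name HadoopConfCheck and its helper methods are hypothetical, it uses only the Java standard library, and it reads core-site.xml directly rather than going through Hadoop's own Configuration loader (so it ignores other resources and any overrides).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reports the authentication method declared in a Hadoop conf dir.
final class HadoopConfCheck {
    // Hadoop's default for hadoop.security.authentication when nothing overrides it.
    static final String DEFAULT_AUTH = "simple";
    static final String AUTH_KEY = "hadoop.security.authentication";

    // Returns the value of hadoop.security.authentication found in
    // confDir/core-site.xml, or "simple" if the file or the property is missing.
    static String authMethod(Path confDir) throws Exception {
        Path coreSite = confDir.resolve("core-site.xml");
        if (!Files.isRegularFile(coreSite)) {
            return DEFAULT_AUTH;
        }
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(coreSite.toFile());
        NodeList props = doc.getElementsByTagName("property");
        String value = DEFAULT_AUTH;
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            if (AUTH_KEY.equals(childText(prop, "name"))) {
                String v = childText(prop, "value");
                // This sketch keeps the last non-empty occurrence in the file.
                if (v != null && !v.isEmpty()) {
                    value = v;
                }
            }
        }
        return value;
    }

    // Trimmed text of the first <tag> element under parent, or null if absent.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Use the first argument if given, otherwise the HADOOP_CONF_DIR environment variable.
        String dir = args.length > 0 ? args[0] : System.getenv("HADOOP_CONF_DIR");
        if (dir == null) {
            System.out.println("HADOOP_CONF_DIR is not set");
            return;
        }
        System.out.println(dir + " -> " + authMethod(Paths.get(dir)));
    }
}
```

Run with the environment the Livy Server process sees. If it reports simple on a cluster that is supposed to be Kerberized, the process is probably pointing at a stale or empty configuration directory, which matches the symptom described above.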

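Takeaways:

1. Any change to a production environment must be assessed for its scope of impact, and the affected colleagues must be notified.

2. Knowing what happened shortly before a failure helps locate its root cause quickly.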

 

 

 
