Kerberos authentication failure
Problem description
Kerberos authentication fails because the ticket-granting ticket (TGT) has expired and can no longer be renewed.
Command that triggered the error:
sudo -i -u hdfs hdfs haadmin -getServiceState nn1
2023-07-03 14:20:45,594 WARN security.UserGroupInformation: Exception encountered while running the renewal command for hdfs/hadoop@EXAMPLE.COM. (TGT end time:1688179347000, renewalFailures: 0,renewalFailuresTotal: 1)
ExitCodeException exitCode=1: kinit: Ticket expired while renewing credentials
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
at org.apache.hadoop.util.Shell.run(Shell.java:902)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:1321)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:1303)
at org.apache.hadoop.security.UserGroupInformation$AutoRenewalForUserCredsRunnable.run(UserGroupInformation.java:899)
at java.lang.Thread.run(Thread.java:750)
2023-07-03 14:20:45,612 ERROR security.UserGroupInformation: TGT is expired. Aborting renew thread for hdfs/hadoop@EXAMPLE.COM.
2023-07-03 14:20:50,606 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1688365245754
2023-07-03 14:20:52,435 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1688365245754
2023-07-03 14:20:52,767 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1688365245754
2023-07-03 14:20:53,656 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1688365245754
2023-07-03 14:20:54,741 WARN ipc.Client: Couldn't setup connection for hdfs/hadoop@EXAMPLE.COM to hadoop102/172.16.10.137:8020
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:627)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:421)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:814)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:810)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:810)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:421)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1606)
at org.apache.hadoop.ipc.Client.call(Client.java:1435)
at org.apache.hadoop.ipc.Client.call(Client.java:1388)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy8.getServiceStatus(Unknown Source)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.getServiceStatus(HAServiceProtocolClientSideTranslatorPB.java:136)
at org.apache.hadoop.ha.HAAdmin.getServiceState(HAAdmin.java:409)
at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:510)
at org.apache.hadoop.hdfs.tools.DFSHAAdmin.runCmd(DFSHAAdmin.java:121)
at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:434)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.hdfs.tools.DFSHAAdmin.main(DFSHAAdmin.java:135)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:162)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 24 more
Operation failed: DestHost:destPort hadoop102:8020 , LocalHost:localPort hadoop102/172.16.10.xxx:0. Failed on local exception: java.io.IOException: Couldn't setup connection for hdfs/hadoop@EXAMPLE.COM to hadoop102/172.16.10.xxx:8020
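The WARN line above prints the ticket's end time as epoch milliseconds (`TGT end time:1688179347000`). A quick way to confirm the ticket really expired before the failed renewal at 2023-07-03 14:20:45 is to convert that value to a readable date. The helper below is a small sketch (not part of Hadoop) and assumes GNU coreutils `date` with the `-d` flag:

```shell
# Convert the "TGT end time" from the Hadoop WARN log (epoch milliseconds)
# into a human-readable UTC timestamp. Requires GNU date (-d "@seconds").
tgt_end_date() {
  # $1: epoch milliseconds as printed in the log line
  date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S UTC'
}

tgt_end_date 1688179347000   # end time taken from the WARN above
# → 2023-07-01 02:42:27 UTC, two days before the renewal attempt,
#   so the TGT was indeed past its lifetime when the renewer ran.
```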
Root cause analysis:
The exact cause is still not entirely clear, but it appears the TGT expired and could no longer be renewed. Most solutions found online blame the JDK; that seems unlikely in this case, and I would rather not patch the JDK anyway.
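For context: whether a ticket can still be renewed is bounded by two settings, and both must allow it — `renew_lifetime` in krb5.conf on the client side and `maxrenewlife` on the principal in the KDC database. Once the renewable window has passed, `kinit -R` fails with exactly the "Ticket expired while renewing credentials" message seen above. A typical `[libdefaults]` sketch (example values, not necessarily this cluster's actual config):

```ini
# /etc/krb5.conf — example values only; the effective renewable lifetime
# is the minimum of renew_lifetime here and the principal's maxrenewlife.
[libdefaults]
    default_realm = EXAMPLE.COM
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
```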
Solution:
When I revisited the environment today, it suddenly worked. Since I cannot pinpoint which step actually fixed it, I am recording everything I did earlier.
1. Update the maximum renewable lifetime for every principal (run inside kadmin or kadmin.local):
modprinc -maxrenewlife 1week admin/admin@EXAMPLE.COM
modprinc -maxrenewlife 1week atlas/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week dn/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week dn/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week dn/hadoop104@EXAMPLE.COM
modprinc -maxrenewlife 1week hbase/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week hbase/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week hbase/hadoop104@EXAMPLE.COM
modprinc -maxrenewlife 1week hdfs/hadoop@EXAMPLE.COM
modprinc -maxrenewlife 1week hive/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week jhs/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week jn/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week jn/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week jn/hadoop104@EXAMPLE.COM
modprinc -maxrenewlife 1week kadmin/admin@EXAMPLE.COM
modprinc -maxrenewlife 1week kadmin/changepw@EXAMPLE.COM
modprinc -maxrenewlife 1week kadmin/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week kafka-client@EXAMPLE.COM
modprinc -maxrenewlife 1week kafka/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week kafka/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week kafka/hadoop104@EXAMPLE.COM
modprinc -maxrenewlife 1week kiprop/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week krbtgt/EXAMPLE.COM@EXAMPLE.COM
modprinc -maxrenewlife 1week maxwell@EXAMPLE.COM
modprinc -maxrenewlife 1week nm/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week nm/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week nm/hadoop104@EXAMPLE.COM
modprinc -maxrenewlife 1week nn/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week nn/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week rangeradmin/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week rangerlookup/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week rangerlookup@EXAMPLE.COM
modprinc -maxrenewlife 1week rangerusersync/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week rm/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week rm/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week rm/hadoop104@EXAMPLE.COM
modprinc -maxrenewlife 1week sarah@EXAMPLE.COM
modprinc -maxrenewlife 1week solu@EXAMPLE.COM
modprinc -maxrenewlife 1week test@EXAMPLE.COM
modprinc -maxrenewlife 1week xwq@EXAMPLE.COM
modprinc -maxrenewlife 1week zookeeper/hadoop102@EXAMPLE.COM
modprinc -maxrenewlife 1week zookeeper/hadoop103@EXAMPLE.COM
modprinc -maxrenewlife 1week zookeeper/hadoop104@EXAMPLE.COM
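Typing forty-odd `modprinc` commands by hand is error-prone. A dry-run sketch like the one below generates one query per principal instead; it only prints the queries, and the commented pipeline for actually applying them (via `kadmin.local` on the KDC host) is an assumption to adapt, not something executed here:

```shell
# Read principals (one per line) from stdin and emit the corresponding
# kadmin modprinc queries. Dry run: nothing is sent to the KDC.
gen_modprinc() {
  while IFS= read -r princ; do
    [ -n "$princ" ] && echo "modprinc -maxrenewlife 1week ${princ}"
  done
}

# Example with a small inline list:
printf 'hdfs/hadoop@EXAMPLE.COM\nkrbtgt/EXAMPLE.COM@EXAMPLE.COM\n' | gen_modprinc

# To apply for real (on the KDC host, as root), something along these lines:
#   kadmin.local -q "listprincs" | tail -n +2 | gen_modprinc | \
#     while read -r q; do kadmin.local -q "$q"; done
```

Note that `krbtgt/EXAMPLE.COM@EXAMPLE.COM` must be included (as the list above does): if the krbtgt principal's maxrenewlife is 0, no ticket in the realm is renewable regardless of the other principals' settings.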
2. Delete the hdfs principal and its keytab file.
3. Regenerate the principal and the keytab file.
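Steps 2 and 3 can be sketched as the following dry run. It only prints the kadmin queries; the keytab path and the `-randkey` choice are assumptions to adjust before piping anything to `kadmin.local`:

```shell
# Dry-run sketch for deleting and recreating the hdfs principal and
# exporting a fresh keytab. Path and options are assumptions.
PRINC="hdfs/hadoop@EXAMPLE.COM"
KEYTAB="/etc/security/keytab/hdfs.keytab"   # assumed keytab location

regen_queries() {
  echo "delprinc -force ${PRINC}"
  echo "addprinc -randkey ${PRINC}"
  echo "xst -k ${KEYTAB} ${PRINC}"
}

regen_queries

# To apply (on the KDC host, as root):
#   rm -f "${KEYTAB}"        # remove the stale keytab first
#   regen_queries | while read -r q; do kadmin.local -q "$q"; done
#   chown hdfs:hadoop "${KEYTAB}" && chmod 400 "${KEYTAB}"
```

Note that `xst` rotates the key, so any other keytab holding the old key for this principal becomes invalid and must be redistributed.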
4. Verify with kinit -kt.
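For the final check, the point is not just that `kinit -kt` succeeds but that the new ticket is actually renewable, i.e. that `klist` shows a "renew until" line. The tiny helper below greps for that line; the keytab path in the usage comment is the same assumed path as above, and the sample input is captured klist-style output, not a live ticket cache:

```shell
# Report whether klist output (on stdin) contains a "renew until" line,
# which indicates the ticket was issued as renewable.
has_renew_until() {
  grep -q 'renew until' && echo renewable || echo not-renewable
}

# Real usage on a node that holds the keytab:
#   kinit -kt /etc/security/keytab/hdfs.keytab hdfs/hadoop@EXAMPLE.COM
#   klist | has_renew_until

# Sample check against captured klist-style output:
printf 'Valid starting     Expires            Service principal\n07/03/2023 15:00   07/04/2023 15:00   krbtgt/EXAMPLE.COM@EXAMPLE.COM\n\trenew until 07/10/2023 15:00\n' | has_renew_until
```

If the result is not-renewable even after step 1, re-check the krbtgt principal's maxrenewlife and the client's `renew_lifetime` in krb5.conf.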