Integrating Hadoop and Hive with Kerberos on CentOS 7

Goal: use Kerberos to authenticate access to the Hadoop cluster and Hive in a new project.
Versions:
Hadoop 2.7.2
Hive 1.2.1

Introduction to Kerberos

Kerberos is a network authentication protocol designed to provide strong authentication for client/server applications by means of a key system. The authentication does not rely on the host operating system, does not require trust based on host addresses, does not assume physical security of every host on the network, and assumes that packets travelling over the network can be read, modified, and injected at will. Under these conditions Kerberos acts as a trusted third-party authentication service, performing authentication with conventional cryptography (e.g. shared secret keys).

The authentication exchange works as follows: the client sends a request to the Authentication Server (AS) asking for credentials for a given server, and the AS replies with those credentials encrypted with the client's key. The credentials consist of 1) a ticket for the server and 2) a temporary encryption key (the session key). The client transmits the ticket (which contains the client's identity and a copy of the session key, all encrypted with the server's key) to the server. The session key, now shared by client and server, can be used to authenticate the client or the server, to encrypt subsequent communication between the two parties, or to exchange a separate sub-session key for encrypting further traffic.

The exchange above only needs read access to the Kerberos database. Sometimes, however, records in the database must be modified, for example when adding new principals or changing a principal's key. Such changes are made through a protocol between the client and a third-party Kerberos server (the Kerberos administration server, kadmin); the administration protocol is not covered here. There is also a protocol for maintaining multiple copies of the Kerberos database, which can be regarded as an implementation detail that varies with the underlying database technology.
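
To make the ticket flow above concrete, here is a minimal client-side sketch. It assumes the KDC configured later in this guide is already running and that a hypothetical test principal test@HIVE.COM exists; the commands themselves (kinit, klist, kvno, kdestroy) come with the krb5-workstation package installed below.

# request a TGT from the AS (the password proves the client's identity to the KDC)
kinit test@HIVE.COM
# inspect the credential cache: krbtgt/HIVE.COM@HIVE.COM is the ticket-granting ticket
klist
# request a service ticket for a specific service principal (e.g. HDFS on test-01)
kvno hdfs/test-01@HIVE.COM
# destroy the credential cache when done
kdestroy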

Hadoop offers two security modes: simple and kerberos.

simple only checks the user name and its group, so it is easy to impersonate another user.

kerberos relies on the Kerberos service to issue keytab files shared by the machines; only nodes holding a keytab are trusted. This article focuses on the kerberos configuration. Kerberos has drawbacks too: the configuration is complex, and changes are painful, because keytabs have to be regenerated and redistributed to every machine.

Prerequisites: Hadoop + Hive, deployed as in my earlier posts, on top of which we add Kerberos.
For the Hadoop and Hive deployments see my previous posts:
CentOS 7: manually deploying a Hadoop 2.7.2 cluster
CentOS 7: manually deploying Hive 1.2.1

With the environment in place, let's deploy Kerberos.

Environment

Hostname    IP            Roles
test-01     172.18.0.1    NameNode, SecondaryNameNode, ResourceManager, Hive, master KDC
test-02     172.18.0.2    NodeManager, DataNode, Kerberos client
test-03     172.18.0.3    NodeManager, DataNode, Kerberos client

Note the following two points:
hostnames must be lowercase, otherwise the Kerberos integration will fail
disable the firewall on the master and on every worker

Installing and deploying Kerberos

1. Deploy on the master node (master KDC)

yum install krb5-server krb5-libs krb5-workstation -y

2. Deploy on the worker nodes (Kerberos clients)

yum install krb5-libs krb5-workstation -y

3. Edit the configuration files
a. kdc.conf (master node only)

[root@test-01 ~]# vim /var/kerberos/krb5kdc/kdc.conf

The file after editing is shown below; this configuration can be used as-is.

[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
# EXAMPLE.COM = {
#  #master_key_type = aes256-cts
#  acl_file = /var/kerberos/krb5kdc/kadm5.acl
#  dict_file = /usr/share/dict/words
#  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
#  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
# }

 HIVE.COM = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  max_renewable_life = 7d
  supported_enctypes = aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
 }

Notes:
HIVE.COM: the realm being defined. The name is arbitrary; Kerberos supports multiple realms, and realm names are conventionally written in upper case.
master_key_type and supported_enctypes default to aes256-cts. Because Java needs extra policy jars to use aes256-cts, we do not use it here.
acl_file: defines the admin users' permissions. Each line has the format
Kerberos_principal permissions [target_principal] [restrictions], and wildcards are supported (see the example below).
admin_keytab: the keytab the KDC uses for verification.
supported_enctypes: the supported encryption types. Note that aes256-cts has been removed.
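
As an illustration of the acl_file format, here is a hedged example of what kadm5.acl entries could look like. Only the */admin rule is actually used later in this guide; the other two principals and their permission letters are made up for this sketch.

# any principal with the /admin instance gets full privileges
*/admin@HIVE.COM    *
# a hypothetical operator who may only list and inquire about principals
ops@HIVE.COM        li
# a hypothetical principal allowed to add principals and change keys, but only for hdfs/* principals
keymgr@HIVE.COM     ac    hdfs/*@HIVE.COM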

b. krb5.conf (both master and worker nodes)

[root@test-01 ~]# vim /etc/krb5.conf

The file after editing is shown below; this configuration can be used as-is.

# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
# dns_lookup_realm = false
# ticket_lifetime = 24h
# renew_lifetime = 7d
# forwardable = true
# rdns = false
# pkinit_anchors = /etc/pki/tls/certs/ca-bundle.crt
## default_realm = EXAMPLE.COM
# default_ccache_name = KEYRING:persistent:%{uid}
 default_realm = HIVE.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 clockskew = 120
 udp_preference_limit = 1

[realms]
# EXAMPLE.COM = {
#  kdc = kerberos.example.com
#  admin_server = kerberos.example.com
# }
 HIVE.COM = {
  kdc = test-01
  admin_server = test-01
 }

[domain_realm]
# .example.com = EXAMPLE.COM
# example.com = EXAMPLE.COM
 .hive.com = HIVE.COM
 hive.com = HIVE.COM

Notes:
[logging]: where the server-side logs are written.
udp_preference_limit = 1: disables UDP, which avoids a known Hadoop issue.
ticket_lifetime: how long a ticket is valid, typically 24 hours.
renew_lifetime: how long a ticket may keep being renewed, typically one week. Once a ticket expires, further access to Kerberos-secured services fails.
clockskew: the maximum tolerated difference, in seconds, between a ticket's timestamp and the host's clock; tickets outside this tolerance are rejected.
Replace the default realm EXAMPLE.COM with your own value, e.g. HIVE.COM. In particular, the following parameters need to be changed:
default_realm: the default realm; set it to your realm, e.g. HIVE.COM
kdc: the host running the KDC, given as a hostname
admin_server: the host running the admin server, given as a hostname
default_domain: the default domain (set to the domain matching the master host, e.g. hive.com)

c. Grant the database administrator ACL permissions (master node)

vim /var/kerberos/krb5kdc/kadm5.acl
# change the contents to the following
*/admin@HIVE.COM  *

For more details on the kadm5.acl file, see the kadm5.acl documentation.
There are two ways to administer the KDC database: directly on the KDC host without a password, or from anywhere in the realm with the administrator password. The two tools are:
kadmin.local: must be run on the KDC server itself; no password is needed to manage the database
kadmin: can be run from any host in the KDC's realm, but requires the administrator password
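
A minimal sketch of the difference between the two tools (the admin principal is the one created later in this guide; run these only after the database and admin principal exist):

# on the KDC host itself: no password required
kadmin.local -q "listprincs"
# from any host in the realm: authenticate as the admin principal first
kadmin -p admin/admin@HIVE.COM -q "listprincs"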

4. Set up the Kerberos service
a. Create the Kerberos database

Creating the Kerberos database requires setting an administrator (master) password; on success a set of files is generated under /var/kerberos/krb5kdc/. If you need to recreate the database, delete the principal* files under /var/kerberos/krb5kdc first.

Run the following command as root on the master node:

kdb5_util create -s -r HIVE.COM

b. Start Kerberos and enable it at boot

chkconfig krb5kdc on
chkconfig kadmin on
service krb5kdc start
service kadmin start
service krb5kdc status

c. Create the Kerberos administrator
Run the following as root on the master node:

kadmin.local
addprinc admin/admin@HIVE.COM

Configuring Kerberos for the Hadoop cluster

Some concepts:
A Kerberos principal identifies a unique identity in the Kerberos system.
Kerberos issues tickets to principals so that they can access Hadoop services protected by Kerberos.
For Hadoop, principals take the form username/fully.qualified.domain.name@YOUR-REALM.COM.

A keytab is a file containing principals and their encrypted keys.
A keytab is unique to each host, because the keys incorporate the hostname. Keytab files let a host authenticate its principals to Kerberos without interactive input or plaintext passwords.
Since anyone who can read a keytab can pass Kerberos authentication as the principals it contains, keytab files must be stored carefully and be readable only by a small number of users.
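
As a quick sanity check that a keytab works and is locked down, the following sketch uses the hdfs.keytab generated below (paths and principal match this guide; the same steps appear in context later on):

# list the principals and encryption types stored in the keytab
klist -ket hdfs.keytab
# obtain a ticket from the keytab instead of a password
kinit -kt hdfs.keytab hdfs/test-01@HIVE.COM
klist
# only the owner should be able to read the keytab
chown hdfs:hadoop hdfs.keytab && chmod 400 hdfs.keytab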

Hive can only be Kerberized once the Hadoop cluster is, so we configure authentication for the Hadoop cluster first.
1. On the master node and on every worker node, create the following group and user

# create the user yarn (in group hadoop) with a user ID below 1000:
useradd -u 502 yarn -g hadoop
# set a password for the new user with passwd
passwd yarn    # enter the new password
# after creating the user, verify it with the id command; output should look like:
[root@test-01 ~]# id yarn
uid=502(yarn) gid=1002(hadoop) groups=1002(hadoop)
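
If the hadoop group or the hdfs user do not already exist from the earlier Hadoop deployment, here is a hedged sketch for creating them consistently on every node (run from the master; assumes root SSH access to the workers; the hdfs UID 501 is an assumption, pick whatever matches your existing setup):

for host in test-01 test-02 test-03; do
  ssh root@$host 'getent group hadoop >/dev/null || groupadd -g 1002 hadoop'
  ssh root@$host 'id yarn >/dev/null 2>&1 || useradd -u 502 -g hadoop yarn'
  ssh root@$host 'id hdfs >/dev/null 2>&1 || useradd -u 501 -g hadoop hdfs'
done
# then set passwords interactively on each node with: passwd yarn / passwd hdfs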

2. Create ordinary Kerberos principals and keytab files so that the nodes can authenticate to each other once YARN Kerberos security is configured.
a. Run the following as root on the master node:

cd /var/kerberos/krb5kdc/
# enter the kadmin shell as the local admin
kadmin.local
# create the principals
addprinc -randkey yarn/test-01@HIVE.COM
addprinc -randkey yarn/test-02@HIVE.COM
addprinc -randkey yarn/test-03@HIVE.COM
addprinc -randkey hdfs/test-01@HIVE.COM
addprinc -randkey hdfs/test-02@HIVE.COM
addprinc -randkey hdfs/test-03@HIVE.COM
addprinc -randkey HTTP/test-01@HIVE.COM
addprinc -randkey HTTP/test-02@HIVE.COM
addprinc -randkey HTTP/test-03@HIVE.COM
quit
# export keytab files from the shell (written to the current directory)
kadmin.local -q "xst  -k yarn.keytab  yarn/test-01@HIVE.COM"
kadmin.local -q "xst  -k yarn.keytab  yarn/test-02@HIVE.COM"
kadmin.local -q "xst  -k yarn.keytab  yarn/test-03@HIVE.COM"
 
kadmin.local -q "xst  -k HTTP.keytab  HTTP/test-01@HIVE.COM"
kadmin.local -q "xst  -k HTTP.keytab  HTTP/test-02@HIVE.COM"
kadmin.local -q "xst  -k HTTP.keytab  HTTP/test-03@HIVE.COM"

kadmin.local -q "xst  -k hdfs-unmerged.keytab  hdfs/test-01@HIVE.COM"
kadmin.local -q "xst  -k hdfs-unmerged.keytab  hdfs/test-02@HIVE.COM"
kadmin.local -q "xst  -k hdfs-unmerged.keytab  hdfs/test-03@HIVE.COM"
# merge into a single keytab: rkt reads entries from a keytab, wkt writes them out
$ ktutil
ktutil: rkt hdfs-unmerged.keytab
ktutil: rkt HTTP.keytab
ktutil: rkt yarn.keytab
ktutil: wkt hdfs.keytab
ktutil: q
# inspect the merged keytab
klist -ket  hdfs.keytab

Keytab name: FILE:hdfs.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   2 01/01/2020 22:44:39 hdfs/yc-test-03@HIVE.COM (aes128-cts-hmac-sha1-96) 
   2 01/01/2020 22:44:39 hdfs/yc-test-03@HIVE.COM (des3-cbc-sha1) 
   2 01/01/2020 22:44:39 hdfs/yc-test-03@HIVE.COM (arcfour-hmac) 
   2 01/01/2020 22:44:39 hdfs/yc-test-03@HIVE.COM (camellia256-cts-cmac) 
   2 01/01/2020 22:44:39 hdfs/yc-test-03@HIVE.COM (camellia128-cts-cmac) 
   2 01/01/2020 22:44:39 hdfs/yc-test-03@HIVE.COM (des-hmac-sha1) 
   2 01/01/2020 22:44:39 hdfs/yc-test-03@HIVE.COM (des-cbc-md5) 
   3 01/01/2020 22:44:46 hdfs/yc-test-01@HIVE.COM (aes128-cts-hmac-sha1-96) 
   3 01/01/2020 22:44:46 hdfs/yc-test-01@HIVE.COM (des3-cbc-sha1) 
   3 01/01/2020 22:44:46 hdfs/yc-test-01@HIVE.COM (arcfour-hmac) 
   3 01/01/2020 22:44:46 hdfs/yc-test-01@HIVE.COM (camellia256-cts-cmac) 
   3 01/01/2020 22:44:46 hdfs/yc-test-01@HIVE.COM (camellia128-cts-cmac) 
   3 01/01/2020 22:44:46 hdfs/yc-test-01@HIVE.COM (des-hmac-sha1) 
   3 01/01/2020 22:44:46 hdfs/yc-test-01@HIVE.COM (des-cbc-md5) 
   2 01/01/2020 22:44:50 hdfs/yc-test-02@HIVE.COM (aes128-cts-hmac-sha1-96) 
   2 01/01/2020 22:44:50 hdfs/yc-test-02@HIVE.COM (des3-cbc-sha1) 
   2 01/01/2020 22:44:50 hdfs/yc-test-02@HIVE.COM (arcfour-hmac) 
   2 01/01/2020 22:44:50 hdfs/yc-test-02@HIVE.COM (camellia256-cts-cmac) 
   2 01/01/2020 22:44:50 hdfs/yc-test-02@HIVE.COM (camellia128-cts-cmac) 
   2 01/01/2020 22:44:51 hdfs/yc-test-02@HIVE.COM (des-hmac-sha1) 
   2 01/01/2020 22:44:51 hdfs/yc-test-02@HIVE.COM (des-cbc-md5) 
   3 01/01/2020 22:44:58 yarn/yc-test-02@HIVE.COM (aes128-cts-hmac-sha1-96) 
   3 01/01/2020 22:44:58 yarn/yc-test-02@HIVE.COM (des3-cbc-sha1) 
   3 01/01/2020 22:44:58 yarn/yc-test-02@HIVE.COM (arcfour-hmac) 
   3 01/01/2020 22:44:58 yarn/yc-test-02@HIVE.COM (camellia256-cts-cmac) 
   3 01/01/2020 22:44:58 yarn/yc-test-02@HIVE.COM (camellia128-cts-cmac) 
   3 01/01/2020 22:44:58 yarn/yc-test-02@HIVE.COM (des-hmac-sha1) 
   3 01/01/2020 22:44:58 yarn/yc-test-02@HIVE.COM (des-cbc-md5) 
   3 01/01/2020 22:45:03 yarn/yc-test-01@HIVE.COM (aes128-cts-hmac-sha1-96) 
   3 01/01/2020 22:45:03 yarn/yc-test-01@HIVE.COM (des3-cbc-sha1) 
   3 01/01/2020 22:45:03 yarn/yc-test-01@HIVE.COM (arcfour-hmac) 
   3 01/01/2020 22:45:03 yarn/yc-test-01@HIVE.COM (camellia256-cts-cmac) 
   3 01/01/2020 22:45:03 yarn/yc-test-01@HIVE.COM (camellia128-cts-cmac) 
   3 01/01/2020 22:45:03 yarn/yc-test-01@HIVE.COM (des-hmac-sha1) 
   3 01/01/2020 22:45:03 yarn/yc-test-01@HIVE.COM (des-cbc-md5) 
   3 01/01/2020 22:45:08 yarn/yc-test-03@HIVE.COM (aes128-cts-hmac-sha1-96) 
   3 01/01/2020 22:45:08 yarn/yc-test-03@HIVE.COM (des3-cbc-sha1) 
   3 01/01/2020 22:45:08 yarn/yc-test-03@HIVE.COM (arcfour-hmac) 
   3 01/01/2020 22:45:08 yarn/yc-test-03@HIVE.COM (camellia256-cts-cmac) 
   3 01/01/2020 22:45:08 yarn/yc-test-03@HIVE.COM (camellia128-cts-cmac) 
   3 01/01/2020 22:45:08 yarn/yc-test-03@HIVE.COM (des-hmac-sha1) 
   3 01/01/2020 22:45:08 yarn/yc-test-03@HIVE.COM (des-cbc-md5) 
   2 01/01/2020 22:45:21 HTTP/yc-test-03@HIVE.COM (aes128-cts-hmac-sha1-96) 
   2 01/01/2020 22:45:21 HTTP/yc-test-03@HIVE.COM (des3-cbc-sha1) 
   2 01/01/2020 22:45:21 HTTP/yc-test-03@HIVE.COM (arcfour-hmac) 
   2 01/01/2020 22:45:21 HTTP/yc-test-03@HIVE.COM (camellia256-cts-cmac) 
   2 01/01/2020 22:45:21 HTTP/yc-test-03@HIVE.COM (camellia128-cts-cmac) 
   2 01/01/2020 22:45:21 HTTP/yc-test-03@HIVE.COM (des-hmac-sha1) 
   2 01/01/2020 22:45:21 HTTP/yc-test-03@HIVE.COM (des-cbc-md5) 
   2 01/01/2020 22:45:24 HTTP/yc-test-02@HIVE.COM (aes128-cts-hmac-sha1-96) 
   2 01/01/2020 22:45:24 HTTP/yc-test-02@HIVE.COM (des3-cbc-sha1) 
   2 01/01/2020 22:45:24 HTTP/yc-test-02@HIVE.COM (arcfour-hmac) 
   2 01/01/2020 22:45:24 HTTP/yc-test-02@HIVE.COM (camellia256-cts-cmac) 
   2 01/01/2020 22:45:24 HTTP/yc-test-02@HIVE.COM (camellia128-cts-cmac) 
   2 01/01/2020 22:45:24 HTTP/yc-test-02@HIVE.COM (des-hmac-sha1) 
   2 01/01/2020 22:45:24 HTTP/yc-test-02@HIVE.COM (des-cbc-md5) 
   3 01/01/2020 22:45:28 HTTP/yc-test-01@HIVE.COM (aes128-cts-hmac-sha1-96) 
   3 01/01/2020 22:45:28 HTTP/yc-test-01@HIVE.COM (des3-cbc-sha1) 
   3 01/01/2020 22:45:28 HTTP/yc-test-01@HIVE.COM (arcfour-hmac) 
   3 01/01/2020 22:45:28 HTTP/yc-test-01@HIVE.COM (camellia256-cts-cmac) 
   3 01/01/2020 22:45:28 HTTP/yc-test-01@HIVE.COM (camellia128-cts-cmac) 
   3 01/01/2020 22:45:28 HTTP/yc-test-01@HIVE.COM (des-hmac-sha1) 
   3 01/01/2020 22:45:28 HTTP/yc-test-01@HIVE.COM (des-cbc-md5) 
   2 01/01/2020 22:45:58 mapred/yc-test-01@HIVE.COM (aes128-cts-hmac-sha1-96) 
   2 01/01/2020 22:45:58 mapred/yc-test-01@HIVE.COM (des3-cbc-sha1) 
   2 01/01/2020 22:45:58 mapred/yc-test-01@HIVE.COM (arcfour-hmac) 
   2 01/01/2020 22:45:58 mapred/yc-test-01@HIVE.COM (camellia256-cts-cmac) 
   2 01/01/2020 22:45:58 mapred/yc-test-01@HIVE.COM (camellia128-cts-cmac) 
   2 01/01/2020 22:45:58 mapred/yc-test-01@HIVE.COM (des-hmac-sha1) 
   2 01/01/2020 22:45:58 mapred/yc-test-01@HIVE.COM (des-cbc-md5) 
   2 01/01/2020 22:46:03 mapred/yc-test-02@HIVE.COM (aes128-cts-hmac-sha1-96) 
   2 01/01/2020 22:46:03 mapred/yc-test-02@HIVE.COM (des3-cbc-sha1) 
   2 01/01/2020 22:46:03 mapred/yc-test-02@HIVE.COM (arcfour-hmac) 
   2 01/01/2020 22:46:03 mapred/yc-test-02@HIVE.COM (camellia256-cts-cmac) 
   2 01/01/2020 22:46:03 mapred/yc-test-02@HIVE.COM (camellia128-cts-cmac) 
   2 01/01/2020 22:46:03 mapred/yc-test-02@HIVE.COM (des-hmac-sha1) 
   2 01/01/2020 22:46:03 mapred/yc-test-02@HIVE.COM (des-cbc-md5) 
   2 01/01/2020 22:46:08 mapred/yc-test-03@HIVE.COM (aes128-cts-hmac-sha1-96) 
   2 01/01/2020 22:46:08 mapred/yc-test-03@HIVE.COM (des3-cbc-sha1) 
   2 01/01/2020 22:46:08 mapred/yc-test-03@HIVE.COM (arcfour-hmac) 
   2 01/01/2020 22:46:08 mapred/yc-test-03@HIVE.COM (camellia256-cts-cmac) 
   2 01/01/2020 22:46:08 mapred/yc-test-03@HIVE.COM (camellia128-cts-cmac) 
   2 01/01/2020 22:46:08 mapred/yc-test-03@HIVE.COM (des-hmac-sha1) 
   2 01/01/2020 22:46:08 mapred/yc-test-03@HIVE.COM (des-cbc-md5) 

Copy the generated hdfs.keytab into the Hadoop configuration directory and set its ownership and permissions.
Keytab login failures are a common problem later on; the file permissions are the first thing to check.

cp hdfs.keytab /opt/hadoop-2.7.2/etc/hadoop
cd /opt/hadoop-2.7.2/etc/hadoop
chown hdfs:hadoop hdfs.keytab && chmod 400 hdfs.keytab
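
The commands above only place the keytab on the master. Every node needs it at the same path; a hedged sketch for distributing it (assuming root SSH access to the workers; the rsync of the whole Hadoop directory performed later would also carry it along):

for host in test-02 test-03; do
  scp /opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab root@$host:/opt/hadoop-2.7.2/etc/hadoop/
  ssh root@$host 'chown hdfs:hadoop /opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab && chmod 400 /opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab'
done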

3. Update the Hadoop cluster configuration
a. Stop the cluster

cd /opt/hadoop-2.7.2/sbin
./stop-all.sh

b. Edit core-site.xml

[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# vim core-site.xml
<!-- add the following properties -->
    <property>
        <name>hadoop.security.authorization</name>
        <value>true</value>
    </property>
    <property>
        <name>hadoop.security.authentication</name>
        <value>kerberos</value>
    </property>

c. Edit yarn-site.xml

[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# vim yarn-site.xml
<!-- add the following properties -->
<property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>4096</value>
</property>

  <!-- ResourceManager security configs -->
  <property>
    <name>yarn.resourcemanager.keytab</name>
    <value>/opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab</value> <!-- path to the YARN keytab -->
  </property>
  <property>
    <name>yarn.resourcemanager.principal</name>
    <value>hdfs/_HOST@HIVE.COM</value>
  </property>
  <!-- NodeManager security configs -->
  <property>
    <name>yarn.nodemanager.keytab</name>
    <value>/opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab</value> <!-- path to the YARN keytab -->
  </property>
  <property>
    <name>yarn.nodemanager.principal</name>
    <value>hdfs/_HOST@HIVE.COM</value>
  </property>
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.group</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
    <value>true</value>
  </property>

d. Edit hdfs-site.xml

[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# vim hdfs-site.xml
<!-- add the following properties -->
<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<property>  
  <name>dfs.datanode.data.dir.perm</name>  
  <value>700</value>  
</property>
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab</value>
</property>
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>hdfs/_HOST@HIVE.COM</value>
</property>
<property>
  <name>dfs.namenode.kerberos.https.principal</name>
  <value>HTTP/_HOST@HIVE.COM</value>
</property>
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:1004</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>0.0.0.0:1006</value>
</property>
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab</value>
</property>
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>hdfs/_HOST@HIVE.COM</value>
</property>
<property>
  <name>dfs.datanode.kerberos.https.principal</name>
  <value>HTTP/_HOST@HIVE.COM</value>
</property>

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
 
<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@HIVE.COM</value>
</property>
 
<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab</value>
</property>

<property>
  <name>dfs.secondary.namenode.keytab.file</name>
  <value>/opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab</value>
</property>
<property>
  <name>dfs.secondary.namenode.kerberos.principal</name>
  <value>hdfs/_HOST@HIVE.COM</value>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop-2.7.2/current/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

e. Edit container-executor.cfg

[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# vim container-executor.cfg
# add the following (container-executor.cfg is a plain key=value file, not XML)
yarn.nodemanager.linux-container-executor.group=hadoop
#configured value of yarn.nodemanager.linux-container-executor.group
banned.users=hdfs
#comma separated list of users who can not run applications
min.user.id=0
#Prevent other super-users
allowed.system.users=root,yarn,hdfs,mapred,nobody
##comma separated list of system users who CAN run applications

4. Build and install JSVC

When secure DataNodes are configured, starting a DataNode requires root privileges: hadoop-env.sh must be modified and jsvc must be installed, and the bundled commons-daemon jar under $HADOOP_HOME/share/hadoop/hdfs/lib must be replaced with a freshly built one.
Otherwise startup fails with "Cannot start secure DataNode without configuring either privileged resources ...".

For example, starting a DataNode produces the following error:

2020-01-05 17:20:15,261 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.RuntimeException: Cannot start secure DataNode without configuring either privileged resources or SASL RPC data transfer protection and SSL for HTTP.  Using privileged resources in combination with SASL RPC data transfer protection is not supported.
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkSecureConfig(DataNode.java:1173)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1073)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:428)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2373)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2260)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2307)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2484)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2508)
2020-01-05 17:20:15,264 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2020-01-05 17:20:15,265 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

a. Download the packages

Download and unpack commons-daemon-1.2.2-src.tar.gz and commons-daemon-1.2.2-bin.tar.gz.

Commands:

[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# cd /opt
[root@test-01 /opt]# mkdir JSVC_packages
[root@test-01 /opt]# cd JSVC_packages/
[root@test-01 /opt/JSVC_packages]# wget http://apache.fayea.com//commons/daemon/source/commons-daemon-1.2.2-src.tar.gz
[root@test-01 /opt/JSVC_packages]# wget http://apache.fayea.com//commons/daemon/binaries/commons-daemon-1.2.2-bin.tar.gz
[root@test-01 /opt/JSVC_packages]# tar xf commons-daemon-1.2.2-bin.tar.gz
[root@test-01 /opt/JSVC_packages]# tar xf commons-daemon-1.2.2-src.tar.gz
[root@test-01 /opt/JSVC_packages]# ll
total 480
drwxr-xr-x. 3 root root   4096 Jan  2 10:26 commons-daemon-1.2.2
-rw-r--r--. 1 root root 179626 Oct  4 05:14 commons-daemon-1.2.2-bin.tar.gz
drwxr-xr-x. 3 root root   4096 Jan  2 10:25 commons-daemon-1.2.2-src
-rw-r--r--. 1 root root 301538 Oct  4 05:14 commons-daemon-1.2.2-src.tar.gz
# build jsvc and copy it into place
[root@test-01 /opt/JSVC_packages]# cd commons-daemon-1.2.2-src/src/native/unix/ 
[root@test-01 /opt/JSVC_packages/commons-daemon-1.2.2-src/src/native/unix]# ./configure
[root@test-01 /opt/JSVC_packages/commons-daemon-1.2.2-src/src/native/unix]# make
[root@test-01 /opt/JSVC_packages/commons-daemon-1.2.2-src/src/native/unix]# cp jsvc /opt/hadoop-2.7.2/libexec/ 
# back up the bundled commons-daemon jar and copy in commons-daemon-1.2.2.jar
[root@test-01 /opt/JSVC_packages/commons-daemon-1.2.2-src/src/native/unix]# cd /opt/JSVC_packages/commons-daemon-1.2.2/
[root@test-01 /opt/JSVC_packages/commons-daemon-1.2.2]# cp /opt/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar /opt/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar.bak
[root@test-01 /opt/JSVC_packages/commons-daemon-1.2.2]# cp commons-daemon-1.2.2.jar /opt/hadoop-2.7.2/share/hadoop/hdfs/lib/
[root@test-01 /opt/JSVC_packages/commons-daemon-1.2.2]# cd /opt/hadoop-2.7.2/share/hadoop/hdfs/lib/
[root@test-01 /opt/hadoop-2.7.2/share/hadoop/hdfs/lib]# chown hdfs:hadoop commons-daemon-1.2.2.jar 

b. Edit hadoop-env.sh

[root@test-01 /opt/hadoop-2.7.2/share/hadoop/hdfs/lib]# cd /opt/hadoop-2.7.2/etc/hadoop/
[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# vim hadoop-env.sh 
# append the following at the bottom:
export HADOOP_SECURE_DN_USER=hdfs
export JSVC_HOME=/opt/hadoop-2.7.2/libexec
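
A quick, hedged check that the freshly built jsvc binary is usable and that the jar replacement took effect (paths are the ones used in this guide):

# jsvc should print its usage text; a "cannot execute" error means the build or copy failed
/opt/hadoop-2.7.2/libexec/jsvc -help | head -n 5
ls -l /opt/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-daemon-1.2.2.jar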

5. Start the Kerberized YARN and HDFS clusters
First push the updated Hadoop directory to every node:

rsync -av /opt/hadoop-2.7.2 root@test-02:/opt
rsync -av /opt/hadoop-2.7.2 root@test-03:/opt

Then run the following from the Hadoop installation directory on the master node:
a. Format the cluster as the hdfs user (only needed on first installation or after changing the relevant Hadoop settings; normally skip this)

[hdfs@test-01 /opt/hadoop-2.7.2/bin]$  ./hadoop namenode -format

b. Start the cluster

[hdfs@yc-test-01 /opt/hadoop-2.7.2/bin]$ cd /opt/hadoop-2.7.2/sbin
# authenticate with the keytab
[hdfs@yc-test-01 /opt/hadoop-2.7.2/sbin]$ kinit -k -t /opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab hdfs/yc-test-01@HIVE.COM
# confirm the ticket
[hdfs@yc-test-01 /opt/hadoop-2.7.2/sbin]$ klist
Ticket cache: FILE:/tmp/krb5cc_501
Default principal: hdfs/yc-test-01@HIVE.COM

Valid starting       Expires              Service principal
01/05/2020 10:26:07  01/06/2020 10:26:07  krbtgt/HIVE.COM@HIVE.COM
        renew until 01/11/2020 15:14:07
# start HDFS
[hdfs@test-01 /opt/hadoop-2.7.2/sbin]$ ./start-dfs.sh 
# start the DataNodes; this must be done as root
[root@test-01 /opt/hadoop-2.7.2/sbin]# ./start-secure-dns.sh
# start YARN; switch back to the hdfs user
[hdfs@test-01 /opt/hadoop-2.7.2/sbin]$ ./start-yarn.sh 
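
Once everything is up, a hedged way to verify that Kerberos is actually enforced (the principal below matches the environment table; use the hostname your keytab was generated for):

# without a ticket, HDFS access should fail with a GSSException / "No valid credentials provided"
kdestroy
/opt/hadoop-2.7.2/bin/hdfs dfs -ls /
# with a ticket obtained from the keytab, the same command should succeed
kinit -kt /opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab hdfs/test-01@HIVE.COM
/opt/hadoop-2.7.2/bin/hdfs dfs -ls /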

c. Stopping the cluster (if needed)

# stop YARN as the hdfs user
[hdfs@test-01 /opt/hadoop-2.7.2/sbin]$ ./stop-yarn.sh
# stop the DataNodes as root
[root@test-01 /opt/hadoop-2.7.2/sbin]# ./stop-secure-dns.sh
# stop HDFS; switch back to the hdfs user
[hdfs@test-01 /opt/hadoop-2.7.2/sbin]$ ./stop-dfs.sh 

d. Verify that the YARN and HDFS clusters started successfully

YARN cluster: browse to master-IP:8088
HDFS cluster: browse to master-IP:50070


Configuring Kerberos for Hive

1. Create the hive user

# create the user hive:
useradd -u 503 hive -g hadoop
# set a password for the new user with passwd
passwd hive    # enter the new password
# after creating the user, verify it with the id command; output should look like:
[root@test-01 ~]# id hive
uid=503(hive) gid=1002(hadoop) groups=1002(hadoop)

2. Generate the keytab
Run the following as root on the master node (the KDC server):

cd /var/kerberos/krb5kdc/
kadmin.local -q "addprinc -randkey hive/test-01@HIVE.COM "
kadmin.local -q "xst  -k hive.keytab  hive/test-01@HIVE.COM "
# inspect the keytab
[root@test-01 /var/kerberos/krb5kdc]# klist -ket  hive.keytab
Keytab name: FILE:hive.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 01/02/2020 11:29:44 hive/yc-test-01@HIVE.COM (aes128-cts-hmac-sha1-96) 
   1 01/02/2020 11:29:44 hive/yc-test-01@HIVE.COM (des3-cbc-sha1) 
   1 01/02/2020 11:29:44 hive/yc-test-01@HIVE.COM (arcfour-hmac) 
   1 01/02/2020 11:29:44 hive/yc-test-01@HIVE.COM (camellia256-cts-cmac) 
   1 01/02/2020 11:29:44 hive/yc-test-01@HIVE.COM (camellia128-cts-cmac) 
   1 01/02/2020 11:29:44 hive/yc-test-01@HIVE.COM (des-hmac-sha1) 
   1 01/02/2020 11:29:44 hive/yc-test-01@HIVE.COM (des-cbc-md5) 
# copy hive.keytab into the Hive configuration directory:
[root@test-01 /var/kerberos/krb5kdc]# cp hive.keytab /opt/apache-hive-1.2.1-bin/conf
# set ownership and permissions
[root@test-01 /var/kerberos/krb5kdc]# cd /opt/apache-hive-1.2.1-bin/conf
[root@test-01 /opt/apache-hive-1.2.1-bin/conf]# chown hive:hadoop hive.keytab && chmod 400 hive.keytab

A keytab is effectively a permanent credential that needs no password (it only becomes invalid if the principal's password is changed in the KDC). Any user with read access to the file can impersonate the principals it contains when accessing Hadoop, so the keytab must be readable by its owner only (mode 0400).

3. Edit the configuration files
a. hive-site.xml

[root@test-01 /opt/apache-hive-1.2.1-bin/conf]# vim hive-site.xml
<!-- add the following properties -->
<property>
    <name>hive.server2.authentication</name>
    <value>KERBEROS</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>hive/_HOST@HIVE.COM</value>
  </property>
<property>
  <name>hive.server2.authentication.kerberos.keytab</name>
  <value>/opt/apache-hive-1.2.1-bin/conf/hive.keytab</value>
</property>

<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.keytab.file</name>
  <value>/opt/apache-hive-1.2.1-bin/conf/hive.keytab</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hive/_HOST@HIVE.COM</value>
</property>

b. core-site.xml

[root@test-01 /opt/apache-hive-1.2.1-bin/conf]# cd /opt/hadoop-2.7.2/etc/hadoop/
[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# vim core-site.xml 
<!-- add the following properties -->
    <property>
        <name>hadoop.proxyuser.hive.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hive.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hdfs.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hdfs.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.HTTP.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.HTTP.groups</name>
        <value>*</value>
    </property>

4. Start Hive

[root@test-01 /opt/hadoop-2.7.2/etc/hadoop]# cd /opt/apache-hive-1.2.1-bin/bin/
[root@test-01 /opt/apache-hive-1.2.1-bin/bin]# su hive
[hive@test-01 /opt/apache-hive-1.2.1-bin/bin]$ nohup hive --service metastore >> metastore.log 2>&1 &
[hive@test-01 /opt/apache-hive-1.2.1-bin/bin]$ nohup  hive --service hiveserver2 >> hiveserver2.log 2>&1 &
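
A hedged way to confirm both services came up (9083 and 10000 are the default metastore and HiveServer2 ports; adjust if you changed them):

# both ports should show a LISTEN socket owned by the hive processes
ss -ltnp | grep -E ':9083|:10000'
# the logs written above are also worth a quick scan for Kerberos login errors
tail -n 20 metastore.log hiveserver2.log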

5. Test the connections
a. Connecting with the hive CLI

[hive@yc-test-01 /opt/apache-hive-1.2.1-bin/bin]$ hive

Logging initialized using configuration in file:/opt/apache-hive-1.2.1-bin/conf/hive-log4j.properties
hive> 

b. Connecting with beeline

With Kerberos enabled, each new session must authenticate first: kinit -k -t /opt/apache-hive-1.2.1-bin/conf/hive.keytab hive/yc-test-01@HIVE.COM

[hive@test-01 /opt/apache-hive-1.2.1-bin/bin]$ kinit -k -t /opt/apache-hive-1.2.1-bin/conf/hive.keytab hive/yc-test-01@HIVE.COM
[hive@test-01 /opt/apache-hive-1.2.1-bin/bin]$ beeline 
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://yc-test-01:10000/default;principal=hive/yc-test-01@HIVE.COM
Connecting to jdbc:hive2://yc-test-01:10000/default;principal=hive/yc-test-01@HIVE.COM
Enter username for jdbc:hive2://yc-test-01:10000/default;principal=hive/yc-test-01@HIVE.COM: hive
Enter password for jdbc:hive2://yc-test-01:10000/default;principal=hive/yc-test-01@HIVE.COM: ******
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://yc-test-01:10000/default> 

The username and password entered here are the ones chosen when the hive user was created at the beginning; in this test they are hive / yctest.
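
For scripting, the same connection can be made non-interactively; a sketch (with a valid Kerberos ticket the username/password prompts are not needed for authentication, only the principal in the JDBC URL matters):

kinit -k -t /opt/apache-hive-1.2.1-bin/conf/hive.keytab hive/yc-test-01@HIVE.COM
beeline -u "jdbc:hive2://yc-test-01:10000/default;principal=hive/yc-test-01@HIVE.COM" -e "show databases;"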

That completes the Kerberos configuration for Hive!

FAQ

Error 1:
After configuring Kerberos for HDFS, startup fails; the NodeManager reports the following error:

2020-01-03 13:47:37,961 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:192)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:425)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:472)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:175)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:190)
	... 3 more

Caused by: ExitCodeException exitCode=24: File /opt/hadoop-2.7.2/etc/hadoop must be owned by root, but is owned by 500
 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:169)

Cause: the container-executor.cfg file under ${HADOOP_HOME}/etc/hadoop, together with its parent directories, must be owned by root.

Fix: chown root:hadoop ${HADOOP_HOME}/etc/hadoop (I only changed the etc directory to root:hadoop; root:root should also be fine).

After that, a similar error appeared:

Caused by: ExitCodeException exitCode=24: File /opt/hadoop-2.7.2 must be owned by root, but is owned by 500

or

ExitCodeException exitCode=24: File /opt/hadoop-2.7.2 must be owned by root, but is owned by 500

So the rule is that every directory above container-executor.cfg must be owned by root as well: /opt, hadoop-2.7.2, etc, and hadoop all need to be root:hadoop or root:root (nothing else needs to change). Problem solved!
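
A hedged one-liner for fixing the whole chain at once (paths as used in this guide; run on every NodeManager host):

# container-executor.cfg and every directory above it must be root-owned
chown root:hadoop /opt /opt/hadoop-2.7.2 /opt/hadoop-2.7.2/etc /opt/hadoop-2.7.2/etc/hadoop \
                  /opt/hadoop-2.7.2/etc/hadoop/container-executor.cfg
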
Error 2:
After fixing the above, the following error may appear:


2020-01-03 14:36:21,416 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:192)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:425)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:472)
Caused by: java.io.IOException: Linux container executor not configured properly (error=22)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:175)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:190)
	... 3 more
Caused by: ExitCodeException exitCode=22: Invalid permissions on container-executor binary.
 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:169)
	... 4 more

Cause: it looks like a script or binary is being executed with the wrong permissions.

Fix: the container-executor binary under ${HADOOP_HOME}/bin has the wrong permissions.

Run:

chmod 6050 container-executor

At this point yet another error can appear:


2020-01-03 11:43:09,610 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:192)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:425)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:472)
Caused by: java.io.IOException: Cannot run program "/opt/hadoop-2.7.2/bin/container-executor": error=13, Permission denied
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:169)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:190)
	... 3 more
Caused by: java.io.IOException: error=13, Permission denied
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 8 more

This error arises as follows:
the configuration file etc/hadoop/container-executor.cfg contains:

yarn.nodemanager.linux-container-executor.group=hadoop
#configured value of yarn.nodemanager.linux-container-executor.group
banned.users=hdfs
#comma separated list of users who can not run applications
min.user.id=0
#Prevent other super-users
allowed.system.users=root,yarn,hdfs,mapred,nobody
##comma separated list of system users who CAN run applications

so the container-executor binary under ${HADOOP_HOME}/bin must have the following ownership and permissions:

chown root:hadoop container-executor
chmod 6050 container-executor

Error 3:
Starting the SecondaryNameNode fails.

Key phrase: Inconsistent checkpoint fields; the error looks like this:

2020-01-03 11:42:18,189 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Log Size Trigger    :1000000 txns
2020-01-03 11:43:18,365 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
LV = -56 namespaceID = 1384221685 cTime = 0 ; clusterId = CID-319b9698-c88d-4fe2-8cb2-c4f440f690d4 ; blockpoolId = BP-1627258458-172.25.40.171-1397735061985.
Expecting respectively: -56; 476845826; 0; CID-50401d89-a33e-47bf-9d14-914d8f1c4862; BP-2131387753-172.25.40.171-1397730036484.
        at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:135)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:518)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:383)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:349)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:345)
        at java.lang.Thread.run(Thread.java:744)

The likely cause is that hadoop.tmp.dir in core-site.xml on the SecondaryNameNode is not set properly.
Also make sure dfs.datanode.data.dir in hdfs-site.xml on the SecondaryNameNode is set to a suitable value:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop-2.7.2/current/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
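
If the error persists, a common cause (an assumption here, not from the original troubleshooting) is a stale checkpoint directory left over from an earlier format of the NameNode; moving it aside lets the SecondaryNameNode re-sync on the next checkpoint:

# the default checkpoint dir is ${hadoop.tmp.dir}/dfs/namesecondary; back it up rather than deleting it
mv /opt/hadoop-2.7.2/current/tmp/dfs/namesecondary /opt/hadoop-2.7.2/current/tmp/dfs/namesecondary.bak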

Error 4:
The SecondaryNameNode fails right after starting:

2020-01-03  18:29:55,438 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2020-01-03  18:29:55,643 FATAL org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Failed to start secondary namenode
java.io.IOException: Running in secure mode, but config doesn't have a keytab
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:306)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:219)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:194)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:690)
2020-01-03 18:29:55,646 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: ExitException
2020-01-03  18:29:55,648 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG:

Fix:

# add the following to hdfs-site.xml
<property>
  <name>dfs.secondary.namenode.keytab.file</name>
  <value>/opt/hadoop-2.7.2/etc/hadoop/hdfs.keytab</value>
</property>
<property>
  <name>dfs.secondary.namenode.kerberos.principal</name>
  <value>hdfs/_HOST@HIVE.COM</value>
</property>

Error 5:
Operations against Hadoop fail with errors like:

Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x 

Whether you create a directory with sudo hadoop dfs -mkdir or put a file, the same error appears.
The reason is that the HDFS root directory / is owned by hdfs with permissions 755, so only the hdfs user can write to it.

You can therefore act on the filesystem as the hdfs user:

sudo -u hdfs hadoop fs -mkdir /user/root

Or simply switch to the hdfs user; either way the problem is solved.
