HDFS Multi-tenancy: Problems Encountered While Configuring Multi-tenancy on CDH

This post covers a service startup error and permission-control issues encountered while configuring HDFS multi-tenancy on CDH. The startup error came from an environment variable that made HDFS clients use SIMPLE authentication instead of Kerberos. After integrating Kerberos and Sentry, it turned out that Sentry does not enforce permissions for the Hive CLI, while Beeline behaves correctly. The post also discusses MapReduce job-submission restrictions, YARN's min.user.id setting, and how to resolve a SASL error when starting Impala.

Multi-tenancy is a very important part of CDH. From the initial KDC setup, through integrating the KDC with CDH, to day-to-day use of the services, you can run into all kinds of problems. Below are the issues I ran into, in the hope that they help others.

Service startup error

The KDC was installed and configured, the CDH integration went smoothly, CDH started up normally, and kinit worked on the client. But as soon as I ran a hadoop fs -ls command, it failed with:

SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]

Baffled, I read the NameNode startup script and found that it first loads the .xml configuration files from the directory pointed to by $HADOOP_CONF_DIR.

Running echo $HADOOP_CONF_DIR showed that the variable was set.

vi /etc/profile revealed the following configuration:

(Screenshot: HADOOP_CONF_DIR exported in /etc/profile)

What a trap. HDFS kept loading the configuration files from that directory instead of the client configuration generated by Cloudera Manager, so hadoop fs commands kept sending SIMPLE authentication requests instead of Kerberos requests.
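A minimal way to confirm and fix this on the affected client node might look like the following; /etc/hadoop/conf is assumed to be the Cloudera-Manager-deployed client configuration directory, and the exact export line in /etc/profile will differ per environment.

# Check whether a stale configuration directory is being picked up
echo $HADOOP_CONF_DIR

# Comment out the export in /etc/profile and clear it from the current shell
sed -i 's/^export HADOOP_CONF_DIR=/# &/' /etc/profile
unset HADOOP_CONF_DIR

# Optionally point the variable at the CM-managed client configuration instead
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Verify that the command now authenticates via Kerberos
kinit deng_yb
hadoop fs -ls /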

Service usage problems

After integrating Kerberos + Sentry on CDH, some users who are allowed to log in to Linux use the services directly. Occasionally they use the Hive CLI, and you will find that Sentry permission control does not apply to the Hive CLI.

kinit deng_yb; this account had already had its Sentry permissions restricted.

What the Hive CLI shows after logging in:

[root@bi-master ~]# klist

Ticket cache: FILE:/tmp/krb5cc_0

Default principal: deng_yb@WONHIGH.COM

Valid starting Expires Service principal

06/07/18 20:40:52 06/08/18 20:40:52 krbtgt/WONHIGH.COM@WONHIGH.COM

renew until 06/14/18 20:40:52

[root@bi-master ~]# hive

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0

Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/jars/hive-common-1.1.0-cdh5.11.0.jar!/hive-log4j.properties

WARNING: Hive CLI is deprecated and migration to Beeline is recommended.

hive> show databases;

OK

bi

default

gms

gtp

gtp_data

gtp_dc

gtp_test

gtp_txt

kudu_raw

kudu_test

kudu_vip

Time taken: 3.417 seconds, Fetched: 11 row(s)

All databases are visible.

With the same account, Beeline shows:

Last login: Thu Jun 7 21:48:31 2018 from 10.230.71.245

[root@bi-master ~]# klist

Ticket cache: FILE:/tmp/krb5cc_0

Default principal: deng_yb@WONHIGH.COM

Valid starting Expires Service principal

06/07/18 20:40:52 06/08/18 20:40:52 krbtgt/WONHIGH.COM@WONHIGH.COM

renew until 06/14/18 20:40:52

[root@bi-master ~]# beeline

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0

Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0

Beeline version 1.1.0-cdh5.11.0 by Apache Hive

beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/bi-master@WONHIGH.COM

scan complete in 9ms

Connecting to jdbc:hive2://localhost:10000/;principal=hive/bi-master@WONHIGH.COM

Connected to: Apache Hive (version 1.1.0-cdh5.11.0)

Driver: Hive JDBC (version 1.1.0-cdh5.11.0)

Transaction isolation: TRANSACTION_REPEATABLE_READ

0: jdbc:hive2://localhost:10000/> show databases;

INFO : Compiling command(queryId=hive_20180607220303_1319b1e8-5ec3-477b-836e-2a279b566ef4): show databases

INFO : Semantic Analysis Completed

INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)

INFO : Completed compiling command(queryId=hive_20180607220303_1319b1e8-5ec3-477b-836e-2a279b566ef4); Time taken: 1.87 seconds

INFO : Executing command(queryId=hive_20180607220303_1319b1e8-5ec3-477b-836e-2a279b566ef4): show databases

INFO : Starting task [Stage-0:DDL] in serial mode

INFO : Completed executing command(queryId=hive_20180607220303_1319b1e8-5ec3-477b-836e-2a279b566ef4); Time taken: 0.835 seconds

INFO : OK

+----------------+--+

| database_name |

+----------------+--+

| bi |

| default |

+----------------+--+

3 rows selected (4.704 seconds)

Only some of the databases are visible.
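The restricted view in Beeline reflects whatever Sentry roles the account's group has been granted through HiveServer2. As a hypothetical sketch (the role name bi_read_role is illustrative, not taken from the original cluster), such a grant is typically set up from Beeline by a Sentry admin:

# Sentry grant statements, executed through Beeline by a Sentry admin user
cat > /tmp/grants.sql <<'EOF'
CREATE ROLE bi_read_role;
GRANT ALL ON DATABASE bi TO ROLE bi_read_role;
GRANT ROLE bi_read_role TO GROUP deng_yb;
EOF
beeline -u "jdbc:hive2://localhost:10000/;principal=hive/bi-master@WONHIGH.COM" -f /tmp/grants.sql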

So what the Hive CLI shows is not controlled by Sentry.

However, we already saw that hadoop fs -ls cannot list files under other users' directories. Does that mean that even if the Hive CLI exposes everything at the metadata level (metainfo), data beyond your own permissions (the actual files) still cannot be read?

# Query a table that is outside this account's permissions

Time taken: 0.076 seconds, Fetched: 114 row(s)

hive> select * from ods_item;

FAILED: SemanticException Unable to determine if hdfs://bi-master:8020/user/hive/warehouse/gtp.db/ods_item is encrypted: org.apache.hadoop.security.AccessControlException: Permission denied: user=deng_yb, access=READ, inode="/user/hive/warehouse/gtp.db/ods_item":hive:hive:drwxrwx--x

at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkAccessAcl(DefaultAuthorizationProvider.java:363)

at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:256)

at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:168)

at org.apache.sentry.hdfs.SentryAuthorizationProvider.checkPermission(SentryAuthorizationProvider.java:178)

at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:152)

at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3529)

at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:3512)

at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:3483)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6588)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:9282)

at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1635)

at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEZForPath(AuthorizationProviderProxyClientProtocol.java:928)

at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1360)

at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2220)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2214)

So in practice the real data cannot be read. If you don't mind the metadata being visible, you can leave it at that.

If you do mind, you can configure a whitelist of users for Hive in Cloudera Manager.

(Screenshot: Hive whitelist configuration in Cloudera Manager)

With that in place, other users get an error when they try to browse database and table information.
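One documented way to achieve this on a Sentry-enabled CDH cluster is to restrict which groups may reach the Hive Metastore directly, via the Hive service's proxy-user groups override in Cloudera Manager (the underlying property is hadoop.proxyuser.hive.groups); the group list below is an assumption, not the original cluster's value.

# Cloudera Manager: Hive service -> Configuration ->
# "Hive Metastore Access Control and Proxy User Groups Override",
# which maps to the hadoop.proxyuser.hive.groups property.
# Listing only trusted groups keeps other users' Hive CLI sessions
# from reaching the metastore (group list is illustrative):
#
#   hadoop.proxyuser.hive.groups = hive,hue,sentry
#
# Restart the Hive service, then verify with a non-whitelisted account
# that the CLI can no longer browse the metastore:
kinit deng_yb
hive -e "show databases;"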

MapReduce usage problems

While using Sqoop to import data from Oracle into Hive, we hit an error like this:

Requested user deng_yb is not whitelisted and has id 501, which is below the minimum allowed 1000

Failing this attempt. Failing the application.

17/09/02 20:05:04 INFO mapreduce.Job: Counters: 0

Job Finished in 6.184 seconds

That is because YARN refuses to launch jobs for users whose UID is below 1000; change YARN's min.user.id to 0.

(Screenshot: min.user.id set to 0 in Cloudera Manager)

Restart YARN.
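For reference, the same limit lives in container-executor.cfg on each NodeManager; a minimal sketch is shown below, assuming the default CDH path. On a CM-managed cluster, change the "Minimum User ID for Job Submission" NodeManager setting in CM instead and let it regenerate this file.

# /etc/hadoop/conf/container-executor.cfg (path is an assumption)
yarn.nodemanager.linux-container-executor.group=yarn
min.user.id=0
# alternatively, keep min.user.id at 1000 and whitelist individual users:
# allowed.system.users=deng_yb
# restart the NodeManagers afterwards so the new limit takes effect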

Some users want to run MapReduce jobs as the hdfs account, and may hit the following error:

Diagnostics: Application application_1528344974377_0009 initialization failed (exitCode=255) with output: main : command provided 0

main : run as user is hdfs

main : requested yarn user is hdfs

Requested user hdfs is banned

That is because YARN bans the hdfs user from launching containers. The fix:

(Screenshot: banned users list for YARN in Cloudera Manager)

Remove the hdfs entry from that list, restart YARN, and the problem is resolved.
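The ban comes from the NodeManager's banned.users list, which Cloudera Manager writes into container-executor.cfg (the exact CM field name varies by version). A sketch of the resulting file after removing hdfs, assuming the usual CDH defaults:

# container-executor.cfg as regenerated by CM (illustrative);
# the default banned list typically contains hdfs, yarn, mapred and bin
banned.users=yarn,mapred,bin
min.user.id=0
# restart the YARN NodeManagers for the change to take effect

Note that letting jobs run as the hdfs superuser weakens the isolation this whole multi-tenant setup is trying to achieve, so granting the needed HDFS paths to a regular user is usually the safer option.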

Impala problems

After integrating Kerberos, the Impala daemon failed to start on some nodes with an error like the following:

(SASL(-4): no mechanism available: No worthy mechs found)

In that case, install the following packages on the failing node:

yum install cyrus-sasl-plain cyrus-sasl-devel cyrus-sasl-gssapi

Then restart the Impala daemon.
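A quick way to check whether the GSSAPI SASL plugin is actually present on a node (package names as above; the library path assumes a 64-bit RHEL/CentOS layout):

rpm -q cyrus-sasl-gssapi cyrus-sasl-plain cyrus-sasl-devel
ls /usr/lib64/sasl2/ | grep -i gssapi   # libgssapiv2.so* should be listed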

Kafka usage problems

Impala JDBC usage problems
