My environment:
OS: Red Hat Enterprise Linux Server release 5.5, 64-bit (check the version with: lsb_release -a)
Kerberos: krb5
Java: jdk1.7.0_51
Hadoop: 2.2.0, rebuilt from source
HBase: 0.94.17
ZooKeeper: 3.4.6 (we do not use the ZooKeeper bundled with HBase)
Notes:
1. Before configuring secure HBase, make sure secure Hadoop and secure ZooKeeper are already configured correctly.
2. Although the HBase Thrift and REST servers can connect to a secure Hadoop cluster, clients accessing HBase through the Thrift or REST server are not themselves secured.
3. HBase 0.94 does not fully support Hadoop 2.2.0, so it must be rebuilt from source before use; to enable security, add the -Psecurity option to the build.
Overview
Securing HBase consists of two parts:
1. Configuring HBase authentication
2. Configuring HBase authorization
Configuring HBase Authentication
1. Configure the HBase servers to authenticate to HDFS
1) Enable HBase authentication:
a) On every HBase server host (master or regionserver), add the following to hbase-site.xml:
<property>
<name>hbase.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hbase.rpc.engine</name>
<value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
b) On every HBase client host (any host that runs the hbase shell), add the following to hbase-site.xml:
<property>
<name>hbase.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hbase.rpc.engine</name>
<value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
2) Configure HBase's Kerberos principals
a) Create a service principal for the HBase servers:
kadmin:addprinc -randkey hbase/vbaby1.cloud.eb@CLOUD.EB
kadmin:addprinc -randkey hbase/vbaby2.cloud.eb@CLOUD.EB
kadmin:addprinc -randkey hbase/vbaby3.cloud.eb@CLOUD.EB
b) Create a keytab for each HBase server:
kadmin:xst -k hbase.keytab hbase/vbaby1.cloud.eb
kadmin:xst -k hbase.keytab hbase/vbaby2.cloud.eb
kadmin:xst -k hbase.keytab hbase/vbaby3.cloud.eb
c) Copy the keytab file into HBase's configuration directory and set its permissions as follows:
-r-------- 1 hbase hbase 140 Apr 1 11:17 hbase.keytab
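Step c) can be scripted roughly as follows; this is a sketch using the conf path and hostnames from these notes, run from the KDC host as a user that can ssh to each server and chown to hbase:

```shell
# Copy the keytab into HBase's conf directory on every server and make it
# readable only by the hbase user (paths and hosts as used in these notes).
CONF=/home/hbase/hbase-self/hbase-0.94.17-security/conf
for host in vbaby1.cloud.eb vbaby2.cloud.eb vbaby3.cloud.eb; do
  scp hbase.keytab "root@$host:$CONF/"
  ssh "root@$host" "chown hbase:hbase $CONF/hbase.keytab && chmod 400 $CONF/hbase.keytab"
done
# Sanity-check which principals the keytab holds:
klist -kt hbase.keytab
```

Because each `xst -k hbase.keytab` appends to the same file, one keytab containing all three host principals can be distributed to every server.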
d) Then add the following to hbase-site.xml:
<property>
<name>hbase.master.kerberos.principal</name>
<value>hbase/_HOST@CLOUD.EB</value>
</property>
<property>
<name>hbase.master.keytab.file</name>
<value>/home/hbase/hbase-self/hbase-0.94.17-security/conf/hbase.keytab</value>
</property>
<property>
<name>hbase.regionserver.kerberos.principal</name>
<value>hbase/_HOST@CLOUD.EB</value>
</property>
<property>
<name>hbase.regionserver.keytab.file</name>
<value>/home/hbase/hbase-self/hbase-0.94.17-security/conf/hbase.keytab</value>
</property>
Note:
_HOST is substituted automatically with the hostname of the target at connection time, so it does not need to be replaced with a concrete hostname.
2. Configure the HBase servers and clients to authenticate to the secure ZooKeeper
We use a standalone ZooKeeper rather than the one bundled with HBase.
1) Configure JAAS for the HBase JVMs
a) On every host, enable JAAS by creating a zk-jaas.conf file in HBase's configuration directory:
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
useTicketCache=false
keyTab="/home/hbase/hbase-self/hbase-0.94.17-security/conf/hbase.keytab"
principal="hbase/vbaby2.cloud.eb@CLOUD.EB";
};
On each host, replace the hostname in the principal (vbaby2.cloud.eb above) with that host's own hostname.
b) Edit hbase-env.sh and add the following:
export HBASE_OPTS="-Djava.security.auth.login.config=/home/hbase/hbase-self/hbase-0.94.17-security/conf/zk-jaas.conf"
export HBASE_MASTER_OPTS="-Djava.security.auth.login.config=/home/hbase/hbase-self/hbase-0.94.17-security/conf/zk-jaas.conf"
export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=/home/hbase/hbase-self/hbase-0.94.17-security/conf/zk-jaas.conf"
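The per-host zk-jaas.conf files described in step a) can be generated from a template instead of being edited by hand; a minimal sketch (the __HOST__ marker is an assumption of this sketch, not anything HBase understands):

```shell
# Write a zk-jaas.conf template with a placeholder for the hostname.
cat > zk-jaas.conf.template <<'EOF'
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
useTicketCache=false
keyTab="/home/hbase/hbase-self/hbase-0.94.17-security/conf/hbase.keytab"
principal="hbase/__HOST__@CLOUD.EB";
};
EOF
# Produce one concrete file per host by substituting the placeholder.
for host in vbaby1.cloud.eb vbaby2.cloud.eb vbaby3.cloud.eb; do
  sed "s/__HOST__/$host/" zk-jaas.conf.template > "zk-jaas.conf.$host"
done
# Copy each zk-jaas.conf.<host> to that host's conf dir as zk-jaas.conf.
```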
2) Configure HBase so that it can connect to the secure ZooKeeper
a) Add the following to hbase-site.xml on every host:
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>vbaby1.cloud.eb,vbaby2.cloud.eb,vbaby3.cloud.eb</value>
</property>
Note: the value of hbase.zookeeper.quorum lists the hostnames of the machines running the ZooKeeper servers.
b) Add the following to zoo.cfg:
kerberos.removeHostFromPrincipal=true
kerberos.removeRealmFromPrincipal=true
Configuring HBase Authorization
HBase's authorization mechanism is built on the Coprocessors framework, in particular the AccessController coprocessor.
1. Enable HBase authorization by adding the following to hbase-site.xml:
<property>
<name>hbase.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
<value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
2. Configure ACLs for authorization
ACLs are managed from the HBase shell; the command formats are as follows:
$grant <user> <permissions>[ <table>[ <column family>[ <column qualifier> ] ] ] #grants permissions
$revoke <user> <permissions> [ <table> [ <column family> [ <column qualifier> ] ] ] # revokes permissions
$user_permission <table> # displays existing permissions
In the commands above, items in <> are required arguments and items in [] are optional. The permission string consists of zero or more of the letters "RWCA", where:
R (read): Get, Scan, Exists
W (write): Put, Delete, LockRow, UnlockRow, IncrementColumnValue, CheckAndDelete, CheckAndPut, Flush, Compact
C (create): Create, Alter, Drop
A (admin): Enable, Disable, MajorCompact, Grant, Revoke, Shutdown
Examples:
grant 'user1', 'RWC'
grant 'user2', 'RW', 'tableA'
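The effect of the grants can be checked from the same shell (table name reused from the example above; the user names must match the Kerberos principal short names):

```
user_permission 'tableA'   # lists the ACL entries now in effect for tableA
```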
Problems encountered and solutions
1. Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project hbase: Compilation failure: Compilation failure
Solution: this is a whole class of build failures; most likely a slow upstream mirror timed out before some libraries finished downloading. Simply rebuild.
2. The HRegionServer process fails to start, or starts but cannot connect to the HMaster. When started via the start-hbase.sh script, the following error appears:
2014-03-25 20:30:32,866 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hbase (auth:SIMPLE) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2014-03-25 20:30:32,868 WARN org.apache.hadoop.ipc.SecureClient: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2014-03-25 20:30:32,869 FATAL org.apache.hadoop.ipc.SecureClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:140)
at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupSaslConnection(SecureClient.java:182)
at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.access$700(SecureClient.java:90)
at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$2.run(SecureClient.java:289)
at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$2.run(SecureClient.java:286)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37)
at org.apache.hadoop.hbase.security.User.call(User.java:624)
at org.apache.hadoop.hbase.security.User.access$600(User.java:52)
at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:478)
at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:285)
at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1141)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:988)
at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:107)
at com.sun.proxy.$Proxy8.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:149)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:2058)
at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2104)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:753)
at java.lang.Thread.run(Thread.java:744)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
... 27 more
Starting the regionserver on its own instead yields the following error:
2014-03-27 16:20:53,895 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.7.0_51/jre
at org.apache.hadoop.hbase.regionserver.HRegionServer.createMyEphemeralNode(HRegionServer.java:1148)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1109)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:758)
at java.lang.Thread.run(Thread.java:744)
2014-03-27 16:20:55,636 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server vbaby3.cloud.eb,60020,1395908451026: Unhandled exception: Region server startup failed
java.io.IOException: Region server startup failed
at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1279)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1136)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:758)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionServerInfo.getSerializedSize(HBaseProtos.java:883)
at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)
at org.apache.hadoop.hbase.regionserver.HRegionServer.createMyEphemeralNode(HRegionServer.java:1148)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1109)
... 2 more
Solutions:
The first error is most likely a configuration-file problem; it may also be that, once security is enabled, HBase can no longer be started via start-hbase.sh.
The second error is the fundamental one, and it lies in the protobuf jar. HBase 0.94 is compiled against protobuf 2.4.0a; to make it work with Hadoop 2.2.0 we rebuilt HBase and swapped the protobuf jar for 2.5.0, which causes the error above. Inspecting the source shows that the GeneratedMessage class is generated from .proto files, and those had been compiled with protobuf 2.4.0a. So we recompiled hbase.proto and ErrorHandling.proto with protobuf 2.5.0, generating two Java files, copied them into the corresponding source directories, and rebuilt HBase; that resolved the problem. The exact steps are in Appendix A, "Compiling the proto files".
3. The master starts successfully but reports this error:
ERROR org.apache.hadoop.hbase.master.HMaster: Coprocessor postStartMaster() hook failed
org.apache.hadoop.hbase.TableExistsException: _acl_
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.<init>(CreateTableHandler.java:104)
at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1334)
at org.apache.hadoop.hbase.security.access.AccessControlLists.init(AccessControlLists.java:119)
at org.apache.hadoop.hbase.security.access.AccessController.postStartMaster(AccessController.java:739)
at org.apache.hadoop.hbase.master.MasterCoprocessorHost.postStartMaster(MasterCoprocessorHost.java:624)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:699)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:433)
at java.lang.Thread.run(Thread.java:744)
Because of this error, creating any table also fails. The logs contain the following warning:
2014-03-31 10:27:08,740 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: _acl_, row=_acl_,,99999999999999
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:151)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1059)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1121)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:251)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:155)
at org.apache.hadoop.hbase.security.access.AccessControlLists.removeTablePermissions(AccessControlLists.java:207)
at org.apache.hadoop.hbase.security.access.AccessController.postDeleteTable(AccessController.java:596)
at org.apache.hadoop.hbase.master.MasterCoprocessorHost.postDeleteTable(MasterCoprocessorHost.java:151)
at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:1377)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:311)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1434)
Solution: the log says the _acl_ table cannot be found, and the master failed to initialize it at startup. The root cause actually lies in ZooKeeper: earlier operations had left corrupt HBase data in ZooKeeper, and deleting that data resolved the problem. On a long-running cluster where the data cannot simply be deleted, a different fix would have to be found.
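A sketch of the cleanup, using the standalone ZooKeeper and quorum hostnames from this setup. This deletes all of HBase's ZooKeeper state, so only do it on a cluster whose data you can afford to rebuild:

```shell
# Stop HBase first. Then, from the ZooKeeper installation directory,
# open a client session against any quorum member:
bin/zkCli.sh -server vbaby1.cloud.eb:2181
# At the zkCli prompt (destructive - removes all of HBase's znodes):
#   rmr /hbase
#   quit
# Restart HBase; the master recreates /hbase and the _acl_ table.
```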
4. Only one Kerberos-authenticated regionserver will start.
Solution: this is a configuration-file error; fix hbase-site.xml as described above. In particular, the _HOST token must not be replaced with a concrete hostname.
Appendix A: Compiling the proto files
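The proto recompilation summarized in problem 2 can be sketched as follows; the source-tree paths are assumptions of this sketch, and protobuf 2.5.0's protoc is assumed to be on PATH:

```shell
# Regenerate the Java sources from the two .proto files with protoc 2.5.0
# ($HBASE_SRC points at the hbase-0.94.17 source tree - an assumed path).
cd "$HBASE_SRC/src/main/protobuf"
protoc --java_out=../java hbase.proto
protoc --java_out=../java ErrorHandling.proto
# The generated *.java files replace the old ones under src/main/java in
# the org.apache.hadoop.hbase.protobuf.generated package. Then rebuild
# against Hadoop 2 with the security profile:
cd "$HBASE_SRC"
mvn clean package -DskipTests -Dhadoop.profile=2.0 -Psecurity
```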
References:
1. http://hbase.apache.org/book/security.html#hbase.secure.configuration
2. HBase official documentation (Chinese translation)