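Preface:
Sentry is an open-source component for Hadoop security, currently still incubating at Apache: https://sentry.incubator.apache.org. It was originally developed by Cloudera and later contributed to Apache. For a sense of what it offers, here is an excerpt; the full article is at http://www.csdn.net/article/2013-08-14/2816575-with-sentry-cloudera-fills-hadoops-enterprise-security-gap:
Sentry is an authorization module for Hadoop. To give the right users and applications precise levels of access, Sentry provides fine-grained, role-based authorization and a multi-tenant administration model, which lets Hadoop users:
Store more sensitive data in Hadoop
Give more end users access to Hadoop data
Create more Hadoop use cases
Build multi-user applications
Meet compliance requirements (e.g. SOX, PCI, HIPAA, EAL3)
This article (possibly the first of a series) covers the pitfalls I ran into along the way and serves as a concrete usage reference.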
I. Installation
I am using CDH 5.4.3, which ships Sentry 1.4.0. Download the tarball from Cloudera's official site and extract it, then edit /etc/profile to add Sentry to the environment:
export SENTRY_HOME=/home/hadoop/apache-sentry-1.4.0-cdh5.4.3-bin/
export PATH=$SENTRY_HOME/bin:$PATH
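A quick sanity check after editing /etc/profile might look like the sketch below. The function name check_sentry_home is made up for illustration, and it assumes only that the tarball places an executable `sentry` launcher under $SENTRY_HOME/bin, which is what the PATH line above implies.

```shell
# Hypothetical helper: verify that a Sentry home directory looks usable.
# Assumption: the launcher lives at <home>/bin/sentry, as the PATH export implies.
check_sentry_home() {
    home="${1%/}"                       # drop a trailing slash if present
    if [ -z "$home" ]; then
        echo "SENTRY_HOME is not set"
        return 1
    fi
    if [ ! -x "$home/bin/sentry" ]; then
        echo "no executable launcher at $home/bin/sentry"
        return 1
    fi
    echo "ok: $home"
}

# Check the value exported in /etc/profile (prints a hint if it is missing)
check_sentry_home "${SENTRY_HOME:-}" || true
```

Run it in a fresh login shell so that the new /etc/profile has actually been sourced.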
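II. Configuration changes
References: http://blog.javachen.com/2015/04/30/install-and-config-sentry.html and
http://gethue.com/apache-sentry-made-easy-with-the-new-hue-security-app/ (read this one carefully; it is very useful and comes up again below)
Key points:
1. Database configuration. The commented-out values show the equivalent MySQL settings.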
<property>
<name>sentry.store.jdbc.url</name>
<!--<value>jdbc:mysql://host:port/sentry</value>-->
<value>jdbc:derby:;databaseName=metastore_db;create=true</value>
<description>JDBC connection URL for the backed DB</description>
</property>
<property>
<name>sentry.store.jdbc.user</name>
<value></value>
<description>Userid for connecting to backend db </description>
</property>
<property>
<name>sentry.store.jdbc.password</name>
<value></value>
<description>Sentry password for backend JDBC user </description>
</property>
<property>
<name>sentry.store.jdbc.driver</name>
<!--<value>com.mysql.jdbc.Driver</value>-->
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
<description>Backend JDBC driver - org.apache.derby.jdbc.EmbeddedDriver (only when dbtype = derby) JDBC Driver class for the backed DB</description>
</property>
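2. As the names suggest, the parameters below set the users allowed to connect and the admin groups. They matter a lot and are discussed in detail later.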
<property>
<name>sentry.service.allow.connect</name>
<value>hive,hue,jerrickwang</value>
<description>comma separated list of users - List of users that are allowed to connect to the service (eg Hive, Impala) </description>
</property>
<property>
<name>sentry.service.admin.group</name>
<value>admin</value>
<description>Comma separates list of groups. List of groups allowed to make policy updates</description>
</property>
3. Sentry's group mapping. The default is HadoopGroupMappingService; LocalGroupMapping can be used instead, but then the path of the policy file must be specified.
<property>
<name>sentry.store.group.mapping</name>
<value>org.apache.sentry.provider.common.HadoopGroupMappingService</value>
<description>
Group mapping class for Sentry service. org.apache.sentry.provider.file.LocalGroupMapping service can be used for local group mapping. </description>
</property>
<property>
<name>sentry.store.group.mapping.resource</name>
<value></value>
<description> Policy file for group mapping. Policy file path for local group mapping, when sentry.store.group.mapping is set to LocalGroupMapping Service class.</description>
</property>
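III. Initialize the database, start the service, then update the Hue configuration with the host and port:
3.1 (Optional) With MySQL, first create the sentry database and then initialize the schema. With Derby, create=true in the JDBC URL takes care of this, so the step can be skipped.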
create database sentry
sentry --command schema-tool -initSchema -conffile conf/sentry-site.xml -dbType mysql
3.2 Start the service
cd $SENTRY_HOME
sentry --command service -conffile conf/sentry-site.xml
3.3 Update the Hue configuration and restart Hue
[libsentry]
# Hostname or IP of server.
hostname=localhost
# Port the sentry service is running on.
port=8038
# Sentry configuration directory, where sentry-site.xml is located.
sentry_conf_dir=/home/hadoop/apache-sentry-1.4.0-cdh5.4.3-bin/conf
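Since Hue only reaches Sentry if these values match the running service, it can help to read them back from the file. The sketch below is a hypothetical helper (libsentry_value is not part of Hue or Sentry) that prints one key from the [libsentry] section. It assumes the section header is written flush-left exactly as shown above.

```shell
# Hypothetical helper: print the value of a key inside the [libsentry]
# section of a hue.ini-style file.
# Usage: libsentry_value /path/to/hue.ini port
libsentry_value() {
    awk -v key="$2" '
        /^\[/ { in_section = ($0 == "[libsentry]") }   # track current section
        in_section && $0 !~ /^[ \t]*#/ {               # skip comment lines
            split($0, kv, "=")
            gsub(/[ \t]/, "", kv[1])
            if (kv[1] == key) { sub(/^[^=]*=[ \t]*/, ""); print; exit }
        }
    ' "$1"
}
```

Comparing the printed port with the port the Sentry service actually listens on is a cheap first check when Hue cannot connect.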
IV. The problem appears
In Hue: user jerrickwang belongs to the default and admin groups
In Sentry:
<name>sentry.service.admin.group</name>
<value>admin</value>
After logging in, however, Sentry kept reporting that it could not find the user's groups:
15/08/17 10:10:28 WARN security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user jerrickwang: id: jerrickwang: No such user
Trying to add a role also failed:
15/08/17 10:15:06 WARN thrift.SentryPolicyStoreProcessor: User: jerrickwang is part of [] which does not, intersect admin groups [admin]
15/08/17 11:11:40 WARN common.HadoopGroupMappingService: Unable to obtain groups for jerrickwang
java.io.IOException: No groups found for user jerrickwang
This looks like a user and group problem, so check the configuration:
<property>
<name>sentry.store.group.mapping</name>
<value>org.apache.sentry.provider.common.HadoopGroupMappingService</value>
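Searching turned up nothing, so I went straight to the code: I downloaded the Sentry 1.5 source from the official site and imported it into Eclipse.
In the provider-common package, find HadoopGroupMappingService.class. The code is short; the key point is that it holds an org.apache.hadoop.security.Groups instance: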
public Set<String> getGroups(String user) {
try {
return new HashSet<String>(groups.getGroups(user));
} catch (IOException e) {
LOGGER.warn("Unable to obtain groups for " + user, e);
}
return Collections.emptySet();
}
Next, look at the Groups class in the hadoop-common package:
public List<String> getGroups(String user){
List staticMapping = (List)this.staticUserToGroupsMap.get(user);
CachedGroups groups = (CachedGroups)this.userToGroupsMap.get(user);
if (groups.getGroups().isEmpty())
{
throw new IOException("No groups found for user " + user);
}
}
In the constructor:
this.impl = ((GroupMappingServiceProvider)ReflectionUtils.newInstance(conf.getClass("hadoop.security.group.mapping", ShellBasedUnixGroupsMapping.class, GroupMappingServiceProvider.class), conf));
In the same directory, find the ShellBasedUnixGroupsMapping class:
private static List<String> getUnixGroups(String user)
throws IOException
{
String result = "";
try {
result = Shell.execCommand(Shell.getGroupsForUserCommand(user));
}
catch (Shell.ExitCodeException e) {
LOG.warn("got exception trying to get groups for user " + user + ": " + e.getMessage());
return new LinkedList();
}
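So this is how Sentry's default group mapping works, which I had not expected at all: it gets the user's groups from the Linux system. The jerrickwang user does not exist on Linux and therefore has no groups, hence the errors.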
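The lookup can be reproduced by hand on the Sentry host. The sketch below uses `id -Gn` as a stand-in for whatever command Hadoop builds internally; that substitution is an assumption, although the "id: jerrickwang: No such user" message in the log suggests the `id` utility is involved. The function name lookup_groups is made up for illustration.

```shell
# Sketch of the OS lookup that ShellBasedUnixGroupsMapping delegates to.
# Assumption: `id -Gn` stands in for the exact command Hadoop runs.
lookup_groups() {
    # Print the user's groups; print nothing if the OS does not know the
    # user, which is the case Sentry logs as "No groups found for user ...".
    id -Gn "$1" 2>/dev/null || true
}

user="jerrickwang"
groups_found=$(lookup_groups "$user")
if [ -z "$groups_found" ]; then
    echo "$user: no OS groups, Sentry would see an empty group set"
else
    echo "$user: $groups_found"
fi
```

An empty result here corresponds to the `is part of []` fragment in the SentryPolicyStoreProcessor warning.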
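Running cat /etc/group shows the system groups, essentially root, hadoop, and work.
Hue authenticates against an LDAP service, so I cannot add accounts there. Based on the conclusion above, adding a jerrickwang user on Linux, putting it in the hadoop group, and setting Sentry's admin group to hadoop should be enough.
-- I made these changes, restarted Sentry, and it worked!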
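The fix described above boils down to one account-management command on the Sentry host. The sketch below only prints the command it would run, because the real commands need root. plan_user_fix is a hypothetical name, and the choice of `useradd -g` (primary group) versus `usermod -aG` (supplementary group) is an assumption about how the user was added.

```shell
# Hypothetical dry-run helper: print the command that would put a user in a
# group, without executing it (useradd/usermod require root).
plan_user_fix() {
    user="$1"; group="$2"
    if id "$user" >/dev/null 2>&1; then
        # User already exists: only append the supplementary group.
        echo "usermod -aG $group $user"
    else
        # User is missing: create it with the group as its primary group.
        echo "useradd -g $group $user"
    fi
}

plan_user_fix jerrickwang hadoop
```

After running the printed command as root, sentry.service.admin.group still has to be set to the same group and Sentry restarted, as described above.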
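V. Further verification:
1. When an LDAP service is used, make sure the user also exists on the Linux system
2. Make sure the permissions line up: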
To be able to edit roles and privileges in Hue, the logged-in Hue user needs to belong to a group in Hue that is also an admin group in Sentry. For example, our ‘hive’ user belongs to a ‘hive’ group in Hue and also to a ‘hive’ group in Sentry:
<property>
<name>sentry.service.admin.group</name>
<value>hive,impala,hue</value>
</property>
My setup:
Linux: jerrickwang -- member of the hadoop group
Hue: jerrickwang, admin group
Configuration:
<name>sentry.service.admin.group</name>
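1. <value>admin</value>: insufficient privileges, roles cannot be added. This all but confirms that Sentry's groups are the Linux groups.
2. <value>hadoop</value>, with the user's group in Hue also set to hadoop: done!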
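The two outcomes above follow from a simple set intersection between the user's OS groups and the configured admin groups, as the earlier warning "is part of [] which does not, intersect admin groups [admin]" indicates. The following is a minimal sketch of that check, not Sentry's actual code; is_sentry_admin is a hypothetical name.

```shell
# Sketch of the admin check implied by the Sentry warning.
# $1: space-separated OS groups of the user
# $2: comma-separated value of sentry.service.admin.group
is_sentry_admin() {
    user_groups="$1"
    admin_groups=$(printf '%s' "$2" | tr ',' ' ')
    for g in $user_groups; do
        for a in $admin_groups; do
            if [ "$g" = "$a" ]; then
                return 0            # at least one group in common
            fi
        done
    done
    return 1                        # empty intersection
}

# Case 1 above: Linux group hadoop, admin group admin
if is_sentry_admin "hadoop" "admin"; then echo "admin"; else echo "not admin"; fi
# Case 2 above: Linux group hadoop, admin group hadoop
if is_sentry_admin "hadoop" "hadoop"; then echo "admin"; else echo "not admin"; fi
```

The first call prints "not admin" and the second prints "admin", matching the two configurations tried above.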
VI. Database
1. I never got MySQL to work. I tried everything I could think of and still could not get past this error:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
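I changed the database settings in various ways and even edited the SQL that Sentry runs during initialization (under the script directory).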
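The arithmetic behind the 767-byte message can be checked quickly: an indexed character column needs its length in characters times the maximum bytes per character of its character set. The sketch below only illustrates that calculation. Which column in Sentry's schema trips the limit is not identified here, and key_fits is a hypothetical helper.

```shell
# Hypothetical helper: does an indexed VARCHAR(n) fit under the 767-byte
# limit quoted in the error?
# $1: column length in characters
# $2: max bytes per character (1 for latin1, 3 for utf8, 4 for utf8mb4)
key_fits() {
    chars="$1"; bytes_per_char="$2"
    [ $((chars * bytes_per_char)) -le 767 ]
}

for spec in "255 3" "256 3" "191 4" "192 4"; do
    set -- $spec
    if key_fits "$1" "$2"; then verdict="fits"; else verdict="too long"; fi
    echo "VARCHAR($1) at $2 bytes/char: $(( $1 * $2 )) bytes, $verdict"
done
```

This is why the same schema can load under one character set and fail under another.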
2. With Derby, do not run the schema initialization yourself. After I initialized it manually, adding a role kept throwing the same error:
Caused by: ERROR 42Z23: Attempt to modify an identity column 'ROLE_ID':
Bug: https://issues.apache.org/jira/browse/DERBY-1495
Connected to: Apache Derby (version 10.10.2.0 - (1582446)), yet this version has long included the fix.
3. Accessing Derby with ij
Download db-derby and extract it; the bin directory contains the ij client. Add it to the environment variables.
$ cd $SENTRY_HOME
$ ij
ij version 10.11
ij> connect 'jdbc:derby:;databaseName=metastore_db';
ij> show tables;
TABLE_SCHEM |TABLE_NAME |REMARKS
------------------------------------------------------------------------
SYS |SYSALIASES |
SYS |SYSCHECKS |
SYS |SYSCOLPERMS |
SYS |SYSCOLUMNS |
SYS |SYSCONGLOMERATES |
SYS |SYSCONSTRAINTS |
SYS |SYSDEPENDS |
SYS |SYSFILES |
SYS |SYSFOREIGNKEYS |
SYS |SYSKEYS |
SYS |SYSPERMS |
SYS |SYSROLES |
SYS |SYSROUTINEPERMS |
SYS |SYSSCHEMAS |
SYS |SYSSEQUENCES |
SYS |SYSSTATEMENTS |
SYS |SYSSTATISTICS |
SYS |SYSTABLEPERMS |
SYS |SYSTABLES |
SYS |SYSTRIGGERS |
SYS |SYSUSERS |
SYS |SYSVIEWS |
SYSIBM |SYSDUMMY1 |
SENTRY |SENTRY_DB_PRIVILEGE |
SENTRY |SENTRY_GROUP |
SENTRY |SENTRY_ROLE |
SENTRY |SENTRY_ROLE_DB_PRIVILEGE_MAP |
SENTRY |SENTRY_ROLE_GROUP_MAP |
SENTRY |SENTRY_VERSION |
SENTRY |SEQUENCE_TABLE |
30 rows selected
ij> exit;
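The same session can be scripted, for example to dump the roles Sentry has stored. The sketch below builds a small ij script and pipes it to ij when the tool is on the PATH. It assumes the working directory is $SENTRY_HOME so that metastore_db resolves as in the session above, and it uses `select *` because the column names of SENTRY_ROLE are not shown here. ij_role_query is a hypothetical name.

```shell
# Sketch: emit an ij script that lists the rows of SENTRY.SENTRY_ROLE.
# Assumption: metastore_db is the embedded database from the JDBC URL above.
ij_role_query() {
    cat <<'EOF'
connect 'jdbc:derby:;databaseName=metastore_db';
select * from SENTRY.SENTRY_ROLE;
exit;
EOF
}

# Run the script only when ij is available; otherwise just print it.
if command -v ij >/dev/null 2>&1; then
    ij_role_query | ij
else
    ij_role_query
fi
```

Embedded Derby normally lets only one JVM boot a database at a time, so the Sentry service may need to be stopped before ij can connect.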
VII. Next
1. Using LocalGroupMapping for Sentry group mapping
2. Integrating Sentry with Hive