测试Hive集成Sentry

本文在安装和配置Sentry基础之上测试Hive集成Sentry。注意:这里Hive中并没有配置Kerberos认证

关于配置了Kerberos的Hive集群如何集成Sentry,请参考配置安全的Hive集群集成Sentry

1. 配置Sentry

安装和配置Sentry

2. 配置Hive

Hive Metastore集成Sentry

需要在 /etc/hive/conf/hive-site.xml中添加:

    <property>
      <name>hive.metastore.pre.event.listeners</name>
      <value>org.apache.sentry.binding.metastore.MetastoreAuthzBinding</value>
    </property>
    <property>
      <name>hive.metastore.event.listeners</name>
      <value>org.apache.sentry.binding.metastore.SentryMetastorePostEventListener</value>
    </property>

Hive-server2集成Sentry

修改 /etc/hive/conf/hive-site.xml,添加以下内容:

    <property>
      <name>hive.server2.enable.impersonation</name>
      <value>true</value>
    </property>
    <property>
        <name>hive.security.authorization.task.factory</name>
        <value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
    </property>
    <property>
        <name>hive.server2.session.hook</name>
        <value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook</value>
    </property>
    <property>
        <name>hive.sentry.conf.url</name>
        <value>file:///etc/hive/conf/sentry-site.xml</value>
    </property>

参考模板sentry-site.xml.hive-client.template在 /etc/hive/conf/ 目录创建 sentry-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
       <name>sentry.service.client.server.rpc-port</name>
       <value>8038</value>
    </property>
    <property>
       <name>sentry.service.client.server.rpc-address</name>
       <value>cdh1</value>
    </property>
    <property>
       <name>sentry.service.client.server.rpc-connection-timeout</name>
       <value>200000</value>
    </property>

    <!--以下是客户端配置-->
    <property>
        <name>sentry.provider</name>
        <value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
    </property>
    <property>
        <name>sentry.hive.provider.backend</name>
        <value>org.apache.sentry.provider.db.SimpleDBProviderBackend</value>
    </property>
    <property>
        <name>sentry.metastore.service.users</name>
        <value>hive</value><!--queries made by hive user (beeline) skip meta store check-->
    </property>
      <property>
        <name>sentry.hive.server</name>
        <value>server1</value>
      </property>
     <property>
        <name>sentry.hive.testing.mode</name>
        <value>true</value>
     </property>
</configuration>

3. 重启Hive

在cdh1上启动或重启hiveserver2:

$ /etc/init.d/hive-server2 restart

4. 准备测试数据

参考 Securing Impala for analysts,准备测试数据:

$ cat /tmp/events.csv
10.1.2.3,US,android,createNote
10.200.88.99,FR,windows,updateNote
10.1.2.3,US,android,updateNote
10.200.88.77,FR,ios,createNote
10.1.4.5,US,windows,updateTag

然后,在hive中运行下面 sql 语句:

create database sensitive;

create table sensitive.events (
    ip STRING, country STRING, client STRING, action STRING
  ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

load data local inpath '/tmp/events.csv' overwrite into table sensitive.events;
create database filtered;
create view filtered.events as select country, client, action from sensitive.events;
create view filtered.events_usonly as select * from filtered.events where country = 'US';

在 cdh1上通过 beeline 连接 hiveserver2,运行下面命令创建角色和组:

# 注意:我设置了hiveserver2的jdbc端口为10001
$  beeline -u "jdbc:hive2://cdh1:10001/" -n hive -p hive -d org.apache.hive.jdbc.HiveDriver

执行下面的 sql 语句创建 role、group等:

create role admin_role;
GRANT ALL ON SERVER server1 TO ROLE admin_role;
GRANT ROLE admin_role TO GROUP admin;
GRANT ROLE admin_role TO GROUP hive;

create role test_role;
GRANT ALL ON DATABASE filtered TO ROLE test_role;
GRANT ROLE test_role TO GROUP test;

上面创建了两个角色:

  • admin_role,具有管理员权限,可以读写所有数据库,并授权给 admin 和 hive 组(对应操作系统上的组)
  • test_role,只能读写 filtered 数据库,并授权给 test 组。

因为系统上没有test用户和组,所以需要手动创建:

$ useradd test

5. 测试

测试admin_role角色

使用hive用户访问beeline:

$ beeline -u "jdbc:hive2://cdh1:10000/" -n hive -p hive -d org.apache.hive.jdbc.HiveDriver

查看当前系统用户是谁:

0: jdbc:hive2://cdh1:10000/> set system:user.name;
+------------------------+--+
|          set           |
+------------------------+--+
| system:user.name=hive  |
+------------------------+--+
1 row selected (0.188 seconds)

hive属于admin_role组,具有管理员权限,可以查看所有角色:

0: jdbc:hive2://cdh1:10000/> show roles;
+-------------+--+
|    role     |
+-------------+--+
| test_role   |
| admin_role  |
+-------------+--+
2 rows selected (0.199 seconds)

查看所有权限:

0: jdbc:hive2://cdh1:10000/> SHOW GRANT ROLE test_role;
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  | grant_time | 
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| filtered   |        |            |         | test_role       | ROLE            | *          | false       | 1430293474047000  | 
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+

0: jdbc:hive2://cdh1:10000/> SHOW GRANT ROLE admin_role;
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  | grant_time  | 
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
| *         |        |            |         | admin_role      | ROLE            | *          | false       | 1430293473308000  
+-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+---------------+
1 row selected (0.16 seconds)

hive用户可以查看所有数据库、访问所有表:

0: jdbc:hive2://cdh1:10000/> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| filtered       |
| sensitive      |
+----------------+--+
3 rows selected (0.391 seconds)
0: jdbc:hive2://cdh1:10000/> use filtered;
No rows affected (0.101 seconds)
0: jdbc:hive2://cdh1:10000/> select * from filtered.events;
+-----------------+----------------+----------------+--+
| events.country  | events.client  | events.action  |
+-----------------+----------------+----------------+--+
| US              | android        | createNote     |
| FR              | windows        | updateNote     |
| US              | android        | updateNote     |
| FR              | ios            | createNote     |
| US              | windows        | updateTag      |
+-----------------+----------------+----------------+--+
5 rows selected (0.431 seconds)
0: jdbc:hive2://cdh1:10000/> select * from sensitive.events;
+---------------+-----------------+----------------+----------------+--+
|   events.ip   | events.country  | events.client  | events.action  |
+---------------+-----------------+----------------+----------------+--+
| 10.1.2.3      | US              | android        | createNote     |
| 10.200.88.99  | FR              | windows        | updateNote     |
| 10.1.2.3      | US              | android        | updateNote     |
| 10.200.88.77  | FR              | ios            | createNote     |
| 10.1.4.5      | US              | windows        | updateTag      |
+---------------+-----------------+----------------+----------------+--+
5 rows selected (0.247 seconds)

测试test_role角色

使用test用户访问beeline:

$ beeline -u "jdbc:hive2://cdh1:10000/" -n test -p test -d org.apache.hive.jdbc.HiveDriver

查看当前系统用户是谁:

0: jdbc:hive2://cdh1:10000/> set system:user.name;
+------------------------+--+
|          set           |
+------------------------+--+
| system:user.name=hive  |
+------------------------+--+
1 row selected (0.188 seconds)

test用户不是管理员,是不能查看所有角色的:

0: jdbc:hive2://cdh1:10000/> show roles;
ERROR : Error processing Sentry command: Access denied to test. Server Stacktrace: org.apache.sentry.provider.db.SentryAccessDeniedException: Access denied to test

test用户可以列出所有数据库:

0: jdbc:hive2://cdh1:10000/> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| filtered       |
| sensitive      |
+----------------+--+
3 rows selected (0.079 seconds)

test用户可以filtered库:

0: jdbc:hive2://cdh1:10000/> use filtered;
No rows affected (0.206 seconds)
0: jdbc:hive2://cdh1:10000/> select * from events;
+-----------------+----------------+----------------+--+
| events.country  | events.client  | events.action  |
+-----------------+----------------+----------------+--+
| US              | android        | createNote     |
| FR              | windows        | updateNote     |
| US              | android        | updateNote     |
| FR              | ios            | createNote     |
| US              | windows        | updateTag      |
+-----------------+----------------+----------------+--+
5 rows selected (0.361 seconds)

但是,test用户没有权限访问sensitive库:

0: jdbc:hive2://cdh1:10000/> use sensitive;
Error: Error while compiling statement: FAILED: SemanticException No valid privileges
 Required privileges for this query: Server=server1->Db=sensitive->Table=*->action=insert;Server=server1->Db=sensitive->Table=*->action=select; (state=42000,code=40000)

6. 排错

在CDH5的高版本中,hive cli 不建议使用,在hive集成sentry之后,再运行hive cli 会提示找不到sentry的类的遗产,解决办法是,将sentry相关的jar包链接到hive的home目录下的lib目录下:

ln -s /usr/lib/sentry/lib/sentry-binding-hive.jar /usr/lib/hive/lib/
ln -s /usr/lib/sentry/lib/sentry-core-common.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-core-common-db.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-policy-common.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-policy-db.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-policy-cache.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-common.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-db.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-cache.jar /usr/lib/hive/lib
ln -s /usr/lib/sentry/lib/sentry-provider-file.jar /usr/lib/hive/lib
  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Hudi和Hive集成可以通过将Hudi表的数据映射为Hive外部表来实现。这样,Hive就可以方便地对Hudi表进行实时视图、读优化视图和增量查询。具体的步骤如下: 1. 首先,需要创建一个Hudi目录,并指定Hive的配置文件路径。可以使用以下代码创建目录: ``` CREATE CATALOG hive_catalog WITH ( 'type' = 'hive', 'default-database' = 'default', 'hive-conf-dir' = '/home/apache-hive-3.1.2-bin/conf' ); ``` 2. 进入Hudi目录并选择要使用的数据库。可以使用以下代码进入目录和选择数据库: ``` USE CATALOG hive_catalog; use test; ``` 3. 创建Hudi表,并指定Hudi的连接器类型、数据路径、表类型等信息。可以使用以下代码创建表: ``` create table t_catalog_cow2 ( id int primary key, num int, ts int ) partitioned by (num) with ( 'connector' = 'hudi', 'path' = 'hdfs://hp5:8020/user/hive/warehouse/test.db/t_catalog_cow2', 'table.type' = 'COPY_ON_WRITE', 'hive_sync.enable' = 'true', 'hive_sync.table' = 't_catalog_cow2', 'hive_sync.db' = 'test', 'hive_sync.mode' = 'hms', 'hive_sync.metastore.uris' = 'thrift://hp5:9083', 'hive_sync.conf.dir'='/home/apache-hive-3.1.2-bin/conf' ); ``` 4. 最后,可以向Hudi表中插入数据。可以使用以下代码插入数据: ``` insert into t_catalog_cow2 values (1,1,1); ``` 以上是Hudi和Hive集成的一般步骤,通过这种方式,可以在Hive中方便地对Hudi表进行查询和操作。\[1\]\[2\]\[3\] #### 引用[.reference_title] - *1* *2* *3* [Hudi系列13:Hudi集成Hive](https://blog.csdn.net/u010520724/article/details/128850750)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] [ .reference_list ]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值