Trino结合Ranger 2.3控制访问权限

Ranger 2.3 添加了对Trino的支持,之前是presto,还需要修改不少代码,现在可以直接使用Ranger控制Trino的权限了

1.Ranger安装

1.1 编译

Ranger github 地址:https://github.com/apache/ranger

可以拉取2.3版本的代码自行编译

注意!!!

Trino是基于JDK11的,使用maven编译时也要在JDK11环境下才会编译trino的plugin

编译时最好修改pom.xml文件中的 <trino.version>377</trino.version> property为当前自己使用的trino版本,避免可能出现的兼容性问题

执行如下命令进行编译

mvn -DskipTests -Dpmd.skip=true -Drat.skip=true  clean  compile package install

编译完成后可以在target目录下看到如下tar包

在这里插入图片描述

1.2 修改配置文件

ranger-2.3.0-admin.tar.gz 复制出来,解压

tar -zxvf ranger-2.3.0-admin.tar.gz

修改install.properties文件,主要需要关注的配置如下

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
# This file provides a list of the deployment variables for the Policy Manager Web Application
#

#------------------------- DB CONFIG - BEGIN ----------------------------------
# Uncomment the below if the DBA steps need to be run separately
#setup_mode=SeparateDBA

# ranger 运行需要python3的环境,没有的自行安装
PYTHON_COMMAND_INVOKER=python3

#DB_FLAVOR=MYSQL|ORACLE|POSTGRES|MSSQL|SQLA
# 选择Mysql作为Ranger的元数据库
DB_FLAVOR=MYSQL
#

#
# Location of DB client library (please check the location of the jar file)
#
#SQL_CONNECTOR_JAR=/usr/share/java/ojdbc6.jar
#SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar
#SQL_CONNECTOR_JAR=/usr/share/java/postgresql.jar
#SQL_CONNECTOR_JAR=/usr/share/java/sqljdbc4.jar
#SQL_CONNECTOR_JAR=/opt/sqlanywhere17/java/sajdbc4.jar
# 需要配置jdbc driver jar的地址,自行下载放到本机相应目录
SQL_CONNECTOR_JAR=/ranger-2.3.0-admin/mysql-connector-java-5.1.49.jar


#
# DB password for the DB admin user-id
# **************************************************************************
# ** If the password is left empty or not-defined here,
# ** it will try with blank password during installation process
# **************************************************************************
#
#db_root_user=root|SYS|postgres|sa|dba
#db_host=host:port              # for DB_FLAVOR=MYSQL|POSTGRES|SQLA|MSSQL       #for example: db_host=localhost:3306
#db_host=host:port:SID          # for DB_FLAVOR=ORACLE                          #for SID example: db_host=localhost:1521:ORCL
#db_host=host:port/ServiceName  # for DB_FLAVOR=ORACLE                          #for Service example: db_host=localhost:1521/XE
# 连接的Mysql地址与root用户的用户名密码,此处需要root用户是为了创建ranger一系列的数据库,表和ranger用户
db_root_user=root
db_root_password=dataroot
db_host=mysql-1:3306


#
# DB UserId used for the Ranger schema
#
# 在mysql中自动创建的ranger用户
db_name=ranger23
db_user=ranger23
db_password=ranger23@dw

# 本机JAVA8 环境
JAVA_BIN='/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java'
JAVA_VERSION_REQUIRED='1.8'
JAVA_ORACLE='Java(TM) SE Runtime Environment'


# 此外还可以配置solr或ES用于Ranger Audit查询,这是可选的,此处我们没有配置,如果需要的话可以自己配置
#Source for Audit Store. Currently solr, elasticsearch and cloudwatch logs are supported.
# * audit_store is solr
# audit_store=solr

# * audit_solr_url Elasticsearch Host(s). E.g. 127.0.0.1
audit_elasticsearch_urls=
audit_elasticsearch_port=
audit_elasticsearch_protocol=
audit_elasticsearch_user=
audit_elasticsearch_password=
audit_elasticsearch_index=
audit_elasticsearch_bootstrap_enabled=true


# * audit_solr_url URL to Solr. E.g. http://<solr_host>:6083/solr/ranger_audits
audit_solr_urls=
audit_solr_user=
audit_solr_password=
audit_solr_zookeepers=

audit_solr_collection_name=ranger_audits
#solr Properties for cloud mode
audit_solr_config_name=ranger_audits
audit_solr_configset_location=
audit_solr_no_shards=1
audit_solr_no_replica=1
audit_solr_max_shards_per_node=1
audit_solr_acl_user_list_sasl=solr,infra-solr
audit_solr_bootstrap_enabled=true

1.3 初始化ranger-admin

此步骤会在mysql中创建相应库表,以及配置ranger运行的相关环境

在ranger安装目录下执行如下脚本
/ranger-2.3.0-admin/setup.sh

1.4 启动Ranger

初始化完成后,启动ranger-admin

ranger-admin start 

查询运行情况,Ranger日志路径在 ews/log下,eg:/ranger-2.3.0-admin/ews/logs

启动成功后,ranger 默认端口是6080,浏览器输入相应地址:你部署机器的IP:6080,即可进入ranger页面

默认用户名密码: admin / admin

在这里插入图片描述

我们看到已经有TRINO的service管理页面了,可以在这里添加不同的Trino权限管理服务
在这里插入图片描述
此处我们添加一个名为 trino 的service

在这里插入图片描述

注意此时点击Test Connection可能会报错连接失败,没有关系,先点击Add保存,然后再进入编辑页面点击Test Connection就会连接成功了,这是Ranger的一个BUG,不过不影响正常使用

在这里插入图片描述

创建成功后会生成一些默认的策略

在这里插入图片描述

2. Trino中添加Ranger plugin

2.1 修改install.properties

将之前编译Ranger时,在target目录下生成的 ranger-2.3.0-trino-plugin.tar.gz 复制出来解压

在解压后的安装目录下编辑install.properties文件,eg:vim /ranger-2.3.0-trino-plugin/install.properties

一般关注如下的配置项

#
# Location of Policy Manager URL
#
# Example:
# POLICY_MGR_URL=http://policymanager.xasecure.net:6080
# 安装的ranger地址
POLICY_MGR_URL=http://127.0.0.1:6080/

#
# This is the repository name created within policy manager
#
# Example:
# REPOSITORY_NAME=trinodev
# 注意这个name需要与ranger中创建的service name对应,即采用哪个service的权限规则
# 比如我们上面创建的service名为trino,此处配置也需要是trino
# 如果上面创建的service名为trino-test,此处可配置trino-test
# 以此对不同的trino集群分开进行权限管控
REPOSITORY_NAME=trino

# Configure INSTALL_ENV=docker if running trino in docker environment
#INSTALL_ENV=docker
#
# Name of the directory where the component's lib and conf directory exist.
# This location should be relative to the parent of the directory containing
# the plugin installation files.
#
# trino的安装地址
COMPONENT_INSTALL_DIR_NAME=/data/trino/

# Enable audit logs to Solr
#Example
#XAAUDIT.SOLR.ENABLE=true
#XAAUDIT.SOLR.URL=http://localhost:6083/solr/ranger_audits
#XAAUDIT.SOLR.ZOOKEEPER=
#XAAUDIT.SOLR.FILE_SPOOL_DIR=/var/log/trino/audit/solr/spool

# 如果上面Ranger安装时有配置搜索引擎查询Audit,这里也可以进行配置
XAAUDIT.SOLR.ENABLE=false
XAAUDIT.SOLR.URL=NONE
XAAUDIT.SOLR.USER=NONE
XAAUDIT.SOLR.PASSWORD=NONE
XAAUDIT.SOLR.ZOOKEEPER=NONE
XAAUDIT.SOLR.FILE_SPOOL_DIR=/var/log/trino/audit/solr/spool

# Enable audit logs to ElasticSearch
#Example
#XAAUDIT.ELASTICSEARCH.ENABLE=true
#XAAUDIT.ELASTICSEARCH.URL=localhost
#XAAUDIT.ELASTICSEARCH.INDEX=audit

XAAUDIT.ELASTICSEARCH.ENABLE=false
XAAUDIT.ELASTICSEARCH.URL=NONE
XAAUDIT.ELASTICSEARCH.USER=NONE
XAAUDIT.ELASTICSEARCH.PASSWORD=NONE
XAAUDIT.ELASTICSEARCH.INDEX=NONE
XAAUDIT.ELASTICSEARCH.PORT=NONE
XAAUDIT.ELASTICSEARCH.PROTOCOL=NONE

2.2 启用ranger trino plugin

执行该脚本启用ranger trino plugin

/ranger-2.3.0-trino-plugin/enable-trino-plugin.sh

执行完成后,会把ranger plugin部署到trino安装目录下

/data/trino/plugin/ranger

同时在Trino的etc目录下生成相关配置文件

trino/etc 目录下

ranger-policymgr-ssl.xml
ranger-trino-audit.xml
trino-ranger-plugin-logback.xml
ranger-trino-security.xml

2.3 修改配置文件

ranger-policymgr-ssl.xml

此文件是Ranger结合Hadoop kerberos认证相关的配置,我们只用到Trino,此文件可忽略

ranger-trino-audit.xml

此文件是Audit相关配置,可开启此配置,方便排查权限问题

 <property>

        <name>xasecure.audit.is.enabled</name>

        <value>true</value>

    </property>

ranger-trino-security.xml

这里配置了Ranger的Url地址,和使用的Ranger对应Service name,以及拉取权限规则的间隔时间等等,具体可以看文件内描述

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
--><configuration>

    <property>

        <name>ranger.plugin.trino.service.name</name>

        <value>trino</value>

        <description>
      Name of the Ranger service containing policies for this Trino instance
    </description>

    </property>

    <property>

        <name>ranger.plugin.trino.policy.source.impl</name>

        <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>

        <description>
      Class to retrieve policies from the source
    </description>

    </property>

    <property>

        <name>ranger.plugin.trino.policy.rest.url</name>

        <value>http://127.0.0.1:6080/</value>

        <description>
      URL to Ranger Admin
    </description>

    </property>

    <property>

        <name>ranger.plugin.trino.policy.rest.ssl.config.file</name>

        <value>/data/trino/etc/ranger-policymgr-ssl.xml</value>

        <description>
      Path to the file containing SSL details to contact Ranger Admin
    </description>

    </property>

    <property>

        <name>ranger.plugin.trino.policy.pollIntervalMs</name>

        <value>30000</value>

        <description>
      How often to poll for changes in policies?
    </description>

    </property>

    <property>

        <name>ranger.plugin.trino.policy.rest.client.connection.timeoutMs</name>

        <value>120000</value>

        <description>
      S3 Plugin RangerRestClient Connection Timeout in Milli Seconds
    </description>

    </property>

    <property>

        <name>ranger.plugin.trino.policy.rest.client.read.timeoutMs</name>

        <value>30000</value>

        <description>
      S3 Plugin RangerRestClient read Timeout in Milli Seconds
    </description>

    </property>


    <property>
        <name>ranger.plugin.trino.policy.cache.dir</name>
        <value>/data/trino/ranger_plugin/trino/policycache</value>
        <description>
        从ranger拉取的权限数据在本地的缓存目录
       </description>
    </property>
</configuration>

trino-ranger-plugin-logback.xml

ranger plugin日志配置,如果在 ranger-trino-audit.xml 中开启了audit,则可以配合这个日志配置,排查鉴权的详情

为方便查看鉴权的信息,我们添加了File日志,将其独立出来,配置如下,日志文件放在了trino的默认log目录下

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

<configuration scan="true" scanPeriod="30 seconds">
  <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
    <Target>System.out</Target>
    <encoder>
      <pattern>%-5p - %m</pattern>
    </encoder>
  </appender>

  <!--文件日志, 按照每天生成日志文件 -->
    <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <!--日志文件输出的文件名-->
            <FileNamePattern>/data/trino/data/var/log/ranger.log.%d{yyyy-MM-dd}.log</FileNamePattern>
            <!--日志文件保留天数-->
            <MaxHistory>30</MaxHistory>
        </rollingPolicy>
        <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
            <!--格式化输出:%d表示日期,%thread表示线程名,%-5level:级别从左显示5个字符宽度%msg:日志消息,%n是换行符-->
            <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{50} - %msg%n</pattern>
        </encoder>
        <!--日志文件最大的大小-->
        <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
            <MaxFileSize>10MB</MaxFileSize>
        </triggeringPolicy>
    </appender>

  <root level="INFO">
    <appender-ref ref="console"/>
   <appender-ref ref="FILE" />

  </root>
</configuration>

2.4 启动Trino

配置完毕,在Trino安装目录下启动Trino Server

 bin/launcher start

如果启动失败,在 data/var/log 目录下查看日志,排查相关原因

3. 权限配置

接下来可以在Ranger中创建用户,配置相关权限进行测试啦

注意Trino的权限需要对每一访问层级进行设置,比如

  • catalog 级别

  • catalog + schema 级别

  • catalog + schema + table 级别

  • 并且需要配置用户可访问information_schema这个trino元数据库,这样才能访问其它的表

权限具体配置步骤可参考该文章:https://towardsdatascience.com/integrating-trino-and-apache-ranger-b808f6b96ad8

如果遇到权限配置导致access denied的问题,也可以去看我们上面在trino-ranger-plugin-logback.xml中配置的trino ranger audit日志,看看具体是哪一步鉴权失败了,在本文中配置的日志路径为 /data/trino/data/var/log/ranger.log

  • 2
    点赞
  • 6
    收藏
    觉得还不错? 一键收藏
  • 13
    评论
Trino是一种用于分布式数据处理的开源框架,支持访问各种数据源。Hive是Hadoop生态系统中的一种数据仓库和分析工具,用于处理大规模结构化数据。在使用Trino访问Hive时,可能会需要添加一些自定义的jar包,以便在查询中使用自定义的函数或者数据类型。 要在Trino访问Hive并添加jar包,通常需要按照以下步骤进行操作: 1. 首先需要启动Trino集群,并且确保能够成功连接到Hive。可以通过修改Trino的配置文件来指定Hive的连接参数。 2. 接下来,需要将自定义的jar包上传到Trino的文件系统中。可以使用Trino提供的命令行工具或者API来上传jar包,例如: ```shell trino-client upload my.jar ``` 3. 上传完成后,需要在Trino中添加自定义jar包的路径。可以通过修改Trino的配置文件或者执行SQL语句来添加路径,例如: ```sql SET session hive.customize-hive-session=true; SET session hive.customize-hive.session-config=my_config.xml; ``` 其中,my_config.xml是自定义的Hive配置文件,需要包含自定义jar包的路径等信息。 4. 最后,可以在Trino中通过SQL语句来使用自定义的函数或者数据类型。例如: ```sql SELECT my_function(col1, col2) FROM my_table; ``` 其中,my_function就是自定义的函数名,col1和col2是需要传递给函数的参数。如果自定义的函数或者数据类型没有成功加载,可能会出现错误提示。 总之,在Trino访问Hive并添加jar包时,需要确保掌握基本的Trino和Hive使用方法,并遵循适当的操作流程,才能确保成功完成任务。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 13
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值