Enabling LZO in Impala 4.0

Impala 4.0 removed direct support for Impala-lzo. The note below, taken from the "Impala 4 Breaking Changes" email, explains the removal; the work is tracked in JIRA IMPALA-9709.

Remove support for Impala-lzo:
Impala-lzo provides code to allow Impala to read the LZO compressed tables.
LZO is GPL licensed, which is why this support is not included directly.
The Impala-lzo code interacts with internal Impala code at a level that is
error prone and intricate. Given the low adoption of LZO and the other
compression options available, Impala plans to remove Impala-lzo support
along with the low level interface it used.

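A large share of the raw data in our data warehouse is stored as LZO, and converting it would be costly, so this post attempts to install LZO support on Impala 4.0.

Analysis

According to the Jira, Impala-lzo has so far only been removed from the development environment. The plugin infrastructure has not been removed, so some of the LZO support code is still in place. Impala-lzo can therefore still be loaded as a plugin and provide the same functionality as before.

The enabled_hdfs_text_scanner_plugins flag is now empty by default; it used to be LZO.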

// LZO is no longer supported, so there are no plugins enabled by default. This is
// likely to be removed.
DEFINE_string(enabled_hdfs_text_scanner_plugins, "", "(Advanced) whitelist of HDFS "
    "text scanner plugins that Impala will try to dynamically load. Must be a "
    "comma-separated list of upper-case compression codec names. Each plugin implements "
    "support for decompression and hands off the decompressed bytes to Impala's builtin "
    "text parser for further processing (e.g. parsing delimited text).");

// Plugin library name template; the LZO plugin should be libimpalalzo.so
static const string LIB_IMPALA_TEMPLATE = "libimpala$0.so";

From this we can see that querying LZO data requires two things: setting the enabled_hdfs_text_scanner_plugins option, and providing the libimpalalzo.so library.

Environment
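The relationship between the flag value and the library file name can be sketched in a few lines of shell. This is only an illustration, not Impala code: plugin_libs is a hypothetical helper, and it assumes that each upper-case codec name in the comma-separated list is lower-cased and substituted for $0 in the "libimpala$0.so" template, which is what the comment about libimpalalzo.so suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of Impala): map a comma-separated list of
# upper-case codec names to the plugin library names Impala would look for.
# Assumption: the codec name is lower-cased and replaces $0 in "libimpala$0.so".
plugin_libs() {
  # Split the list on commas, one codec per line.
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r codec; do
    # Skip empty entries, e.g. when the flag is left at its empty default.
    [ -n "$codec" ] || continue
    lower=$(printf '%s' "$codec" | tr '[:upper:]' '[:lower:]')
    printf 'libimpala%s.so\n' "$lower"
  done
}

# The value that would be passed as --enabled_hdfs_text_scanner_plugins=LZO
plugin_libs "LZO"
```

With the flag left at its empty default the helper prints nothing, which mirrors the fact that no plugin is enabled out of the box in 4.0.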

OS: CentOS 7

Impala: https://github.com/apache/impala/tree/branch-4.0.0

Impala-lzo: https://github.com/chufucun/impala-lzo (note: this fork fixes the errors that occur when compiling against Impala 4.0).

CM & CDH: 6.3

Building Impala-lzo

To build the impala-lzo library, first set up the Impala build environment, because impala-lzo depends on it.
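Since impala-lzo relies on the Impala build environment, a small pre-flight check can catch a missing setup before the build starts. The sketch below assumes the impala-lzo build locates the Impala tree through the IMPALA_HOME variable exported in the Impala build steps; check_impala_env is a hypothetical helper name and is not part of either project.

```shell
#!/bin/sh
# Hypothetical pre-flight check (illustrative only).
# Assumption: the impala-lzo build finds the Impala tree via IMPALA_HOME.
check_impala_env() {
  # Fail if the variable is unset or empty.
  if [ -z "${IMPALA_HOME:-}" ]; then
    echo "IMPALA_HOME is not set" >&2
    return 1
  fi
  # Fail if it does not point at an existing directory.
  if [ ! -d "$IMPALA_HOME" ]; then
    echo "IMPALA_HOME is not a directory: $IMPALA_HOME" >&2
    return 1
  fi
  echo "IMPALA_HOME=$IMPALA_HOME"
}
```

Calling check_impala_env in the shell that will run the impala-lzo build either prints the resolved path or returns a non-zero status with a message on stderr.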

Building Impala

  • Building Impala without Test Data (for testing Impala)
# The --depth 1 option can be used to speed up the clone.
git clone -b branch-4.0.0 https://gitbox.apache.org/repos/asf/impala.git ~/Impala
cd ~/Impala
export IMPALA_HOME=`pwd`
./bin/bootstrap_system.sh
source ./bin/impala-config.sh
# Format the test cluster and start Impala and de