DataStage 11.5 New Features: Hive Connector


haveuseemywreath


Article link: https://blog.csdn.net/weixin_29294597/article/details/115021989


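Summary (automatically generated by CSDN): IBM Information Server 11.5.0.1 introduces the Hive Connector, which improves support for Hive. Compared with the ODBC / JDBC connectors, the Hive Connector supports several Hive platforms, including Cloudera, HortonWorks, and BigInsights, and supports Hive-specific features such as HiveQL generation at runtime and reading and writing partitioned tables. It also supports the source, target, and request contexts, which simplifies integrating Hive with ETL jobs. The Hive Connector has some limitations, however; for example, inserts into partitioned tables have no batch mode.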

Question

Does DataStage have a dedicated component for Hive?

Answer

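IBM Information Server 11.5.0.1 adds several components and features, one of the most important being the Hive Connector. In earlier releases, DataStage supported Hive mainly through the ODBC connector or the JDBC connector. Connecting to Hive through the ODBC / JDBC connectors has some limitations:

- The following connector options are not supported:
  - Generate SQL at Runtime
  - Isolation levels
  - Auto-commit
  - Create functionality with different file formats
- Partitioned tables in Hive are not supported.
- Hive-specific features are hard to handle in a generic connector.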

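Compared with the generic ODBC / JDBC connectors, the new Hive Connector has the following advantages:

- It supports several Hive platforms:
  - Cloudera Hive
  - Cloudera Impala
  - HortonWorks
  - BigInsights
- It supports Hive-specific features:
  - Generation of HiveQL at runtime
  - Generation of the table DDL specific to Hive
  - DML generation as per the syntax of HiveQL
  - Hive-specific table formats (AVRO, Parquet, ORC, etc.)
  - Partitioned tables
- Users can use the Generate SQL option instead of writing HQL/SQL statements by hand.
- It supports reading from and writing to partitioned tables.
- Hive-specific features are easier to handle.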
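To make "table DDL specific to Hive" more concrete, here is a minimal Python sketch of the kind of statement involved: a CREATE TABLE with a PARTITIONED BY clause and a STORED AS file format. The helper `build_create_table`, the table name, and the column names are hypothetical and only illustrate HiveQL syntax; the article does not show the DDL that the connector actually generates.

```python
from typing import Sequence, Tuple

Column = Tuple[str, str]  # (column name, Hive type)

# File formats named in the article; all are valid STORED AS keywords in HiveQL.
SUPPORTED_FORMATS = {"AVRO", "PARQUET", "ORC"}


def build_create_table(table: str,
                       columns: Sequence[Column],
                       partition_columns: Sequence[Column] = (),
                       stored_as: str = "ORC") -> str:
    """Return a HiveQL CREATE TABLE statement (hypothetical helper, not connector code)."""
    fmt = stored_as.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {stored_as}")
    if not columns:
        raise ValueError("at least one column is required")

    def render(cols: Sequence[Column]) -> str:
        # Backticks quote identifiers in HiveQL.
        return ", ".join(f"`{name}` {hive_type}" for name, hive_type in cols)

    parts = [f"CREATE TABLE IF NOT EXISTS {table} ({render(columns)})"]
    if partition_columns:
        # Partition columns are declared separately from the data columns.
        parts.append(f"PARTITIONED BY ({render(partition_columns)})")
    parts.append(f"STORED AS {fmt}")
    return "\n".join(parts)


# Illustrative table and columns (assumptions, not from the article).
ddl = build_create_table(
    "sales",
    [("id", "INT"), ("amount", "DOUBLE")],
    partition_columns=[("sale_date", "STRING")],
    stored_as="parquet",
)
print(ddl)
```

The partition column appears only in PARTITIONED BY, not in the main column list. This is one of the Hive-specific details that a generic connector has no option for.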

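Hive Connector configuration

- The connector uses the JDBC protocol underneath, so configuration is relatively simple.
- It uses the DataDirect JDBC driver for Hive, which is included by default in the Information Server installation package.
- Create or edit the configuration file isjdbc.config (in IS_HOME/Server/DSEngine).
- The following entries in the configuration file specify the class path and the driver Java classes:

CLASSPATH=

CLASS_NAMES=

isjdbc.config example: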

CLASSPATH=/opt/IBM/InformationServer/ASBNode/lib/java/IShive.jar;

CLASS_NAMES=com.ibm.isf.jdbc.hive.HiveDriver;
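Since a misconfigured isjdbc.config is the first thing to check when the connector fails, a small sanity check can help. The Python sketch below parses the two entries and reports any that are missing or empty. It assumes that each line has the form KEY=value and that multiple values are separated by semicolons, as the example above suggests. The helpers `parse_isjdbc_config` and `check_isjdbc_config` are hypothetical and not part of Information Server.

```python
from typing import Dict, List


def parse_isjdbc_config(text: str) -> Dict[str, List[str]]:
    """Parse KEY=value;value; lines into a dict of lists (hypothetical helper)."""
    settings: Dict[str, List[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not KEY=value
        key, _, value = line.partition("=")
        # Assumption: entries are separated by semicolons; empty entries are dropped.
        settings[key.strip()] = [item.strip() for item in value.split(";") if item.strip()]
    return settings


def check_isjdbc_config(settings: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means both entries are present."""
    problems = []
    for key in ("CLASSPATH", "CLASS_NAMES"):
        if not settings.get(key):
            problems.append(f"{key} is missing or empty")
    return problems


# The sample content is the example from the article.
sample = """\
CLASSPATH=/opt/IBM/InformationServer/ASBNode/lib/java/IShive.jar;
CLASS_NAMES=com.ibm.isf.jdbc.hive.HiveDriver;
"""
config = parse_isjdbc_config(sample)
print(config)
print(check_isjdbc_config(config))
```

The check only confirms that the entries exist. It does not verify that the jar file is present on disk or that the driver class can be loaded.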

Note: Kerberos configuration is not covered in this article.

Hive Connector - Repository View and Palette

The Hive Connector supports the source context (read mode), the target context (write mode), and the request context (lookup mode). Each is described below.

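Source context or read mode

The Hive Connector can be configured in the source context (read mode) to read data from Hive. In this mode it:

- Can generate SELECT statements
- Supports partitioned reads
- Supports additional Hive options
- Supports Before / After SQL
- Supports limiting the number of rows returned by the stage
- Supports reading the SQL statement from a file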
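As a rough illustration of the read-mode options above, the following Python sketch builds a HiveQL SELECT that is restricted to one table partition and capped with LIMIT. The helper `build_select` and the table and column names are hypothetical. The article does not describe how the connector implements partitioned reads or the row limit internally, so this only shows what an equivalent hand-written query could look like.

```python
from typing import Optional, Sequence


def build_select(table: str,
                 columns: Sequence[str],
                 partition_column: Optional[str] = None,
                 partition_value: Optional[str] = None,
                 row_limit: Optional[int] = None) -> str:
    """Build a HiveQL SELECT statement (hypothetical helper, not connector code)."""
    if not columns:
        raise ValueError("at least one column is required")
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if partition_column is not None:
        if partition_value is None:
            raise ValueError("partition_value is required with partition_column")
        if "'" in partition_value or "\\" in partition_value:
            # Escaping special characters is out of scope for this sketch.
            raise ValueError("unsupported character in partition value")
        # Filtering on the partition column lets Hive scan only that partition.
        sql += f" WHERE {partition_column} = '{partition_value}'"
    if row_limit is not None:
        if row_limit < 0:
            raise ValueError("row_limit must not be negative")
        sql += f" LIMIT {row_limit}"
    return sql


# Illustrative names and values (assumptions, not from the article).
query = build_select("sales", ["id", "amount"], "sale_date", "2016-01-01", row_limit=100)
print(query)
```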

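Target context or write mode

The Hive Connector can be configured in the target context (write mode) to insert data into Hive, which lets Hive integrate seamlessly with ETL jobs. In this mode it:

- Supports writing to partitioned tables
- Supports several table action modes (Create, Append, Truncate, and Replace)
- Does not currently support Update
- Generates the DDL in HiveQL (HQL) syntax when it builds the CREATE TABLE statement
- Supports Generate SQL for insert operations
- Supports the user-defined SQL option, and can read SQL statements from a file
- Supports Before / After SQL statements
- Supports additional Hive options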
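The table action modes and the partitioned insert can be pictured with a short Python sketch. It assumes the usual meaning of these modes in DataStage connectors: Append leaves the table alone, Create creates it, Replace drops and re-creates it, and Truncate empties it. The article does not list the exact statements the Hive Connector issues, and the helpers `statements_for_table_action`, `hive_literal`, and `build_partition_insert` are hypothetical.

```python
from typing import List, Sequence, Union

Value = Union[int, float, str]


def statements_for_table_action(action: str, table: str, create_ddl: str) -> List[str]:
    """Statements to run before the insert for each table action (assumed mapping)."""
    mode = action.lower()
    if mode == "append":
        return []  # the table is used as it is
    if mode == "create":
        return [create_ddl]
    if mode == "replace":
        return [f"DROP TABLE IF EXISTS {table}", create_ddl]
    if mode == "truncate":
        return [f"TRUNCATE TABLE {table}"]
    raise ValueError(f"unknown table action: {action}")


def hive_literal(value: Value) -> str:
    """Render a Python value as a HiveQL literal; strings with quotes are rejected."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    text = str(value)
    if "'" in text or "\\" in text:
        raise ValueError("escaping is out of scope for this sketch")
    return f"'{text}'"


def build_partition_insert(table: str, partition_column: str,
                           partition_value: str, row: Sequence[Value]) -> str:
    """Build a single-row INSERT into a static partition."""
    values = ", ".join(hive_literal(v) for v in row)
    return (f"INSERT INTO TABLE {table} "
            f"PARTITION ({partition_column}={hive_literal(partition_value)}) "
            f"VALUES ({values})")


# Illustrative table and data (assumptions, not from the article).
create_ddl = "CREATE TABLE sales (id INT, amount DOUBLE) PARTITIONED BY (sale_date STRING)"
setup = statements_for_table_action("Replace", "sales", create_ddl)
insert = build_partition_insert("sales", "sale_date", "2016-01-01", [1, 9.5])
print(setup)
print(insert)
```

In the INSERT, the partition value goes in the PARTITION clause rather than in the VALUES list, which holds only the regular data columns.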

Request context or lookup mode

- Supports both Normal and Sparse lookups

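Hive Connector limitations

- Inserts into partitioned tables are currently executed row by row; there is no batch mode.
- In a BigIntegrate environment, localization of the keytab file is not currently supported.
- Loading data into Hive tables (LOAD) is not supported.
  Workaround: use the File Connector to get the data into Hive. Write the data to a file with the File Connector, then create a Hive table that is associated with that file.
- Insert is the only supported write mode.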
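The load workaround described above ends with creating a Hive table over the file written by the File Connector. One common way to do that in HiveQL is an external table whose LOCATION points at the directory that holds the file. The Python sketch below builds such a statement. The helper `build_external_table`, the table name, the columns, the delimiter, and the HDFS path are all hypothetical, and the article does not say which table type or file format to use.

```python
from typing import Sequence, Tuple


def build_external_table(table: str,
                         columns: Sequence[Tuple[str, str]],
                         location: str,
                         field_delimiter: str = ",") -> str:
    """Build a CREATE EXTERNAL TABLE statement over delimited text files."""
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({column_list})\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{field_delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        # LOCATION names a directory; Hive reads every file inside it.
        f"LOCATION '{location}'"
    )


# Illustrative table and path (assumptions, not from the article).
external_ddl = build_external_table(
    "sales_staging",
    [("id", "INT"), ("amount", "DOUBLE")],
    "/user/dsadm/sales_staging",
)
print(external_ddl)
```

With an external table, dropping the table later removes only the metadata and leaves the file in place.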

Hive Connector troubleshooting

- Make sure isjdbc.config is configured correctly.
- When reading a partitioned table, make sure the placeholder ([[part-value]]) is set correctly.
- Debugging the Hive Connector is similar to debugging other connectors: set the CC_MSG_LEVEL parameter to get more detailed Hive Connector logs, for example CC_MSG_LEVEL=1 or CC_MSG_LEVEL=2.
- For other troubleshooting tips and known issues, see:

http://www.ibm.com/support/knowledgecenter/SSZJPZ_11.5.0/com.ibm.swg.im.iis.conn.hive.usage.doc/topics/hivecc_troubleshooting.html

For more on configuring and using the Hive Connector, see the relevant sections of the Information Server Knowledge Center:

http://www.ibm.com/support/knowledgecenter/SSZJPZ_11.5.0/com.ibm.swg.im.iis.conn.hive.usage.doc/topics/hive_connector_top_of_nav.html

To learn more about the new features in Information Server 11.5, see the Information Server 11.5.0.1 Release Notes:

http://www-01.ibm.com/support/docview.wss?uid=swg21996106
