Hive ObjectInspector Explained

I first ran into ObjectInspector while writing GenericUDF/UDAF/UDTF functions, so everything here is written only from the perspective of writing custom functions.

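We all know that HQL is ultimately converted into MapReduce jobs for execution. When we wrote MR jobs by hand, we had to write a Map class and a Reduce class, and in doing so we had to specify the data types of their input and output parameters. (Remember, these are not Java primitive types but Hadoop's XxxWritable wrapper types: an int is written as IntWritable and a String as Text.) The role of ObjectInspector is therefore to tell Hive the data types of the inputs and outputs (in a custom function this is configured in the initialization method), so that Hive can convert the HQL into an MR program.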

1. Official Explanation
Wiki

Hive uses ObjectInspector to analyze the internal structure of the row object and also the structure of the individual columns.

ObjectInspector provides a uniform way to access complex objects that can be stored in multiple formats in the memory, including:

Instance of a Java class (Thrift or native Java)
A standard Java object (we use java.util.List to represent Struct and Array, and use java.util.Map to represent Map)
A lazily-initialized object (for example, a Struct of string fields stored in a single Java string object with starting offset for each field)
A complex object can be represented by a pair of ObjectInspector and Java Object. The ObjectInspector not only tells us the structure of the Object, but also gives us ways to access the internal fields inside the Object.

(The interface source code shows the same separation of type and instance. An ObjectInspector only records the type, which it can return directly, and it also provides methods for getting at the actual value. Those methods take an Object as a parameter: the inspector stores no data itself, and instead uses its own type information to interpret whatever object is passed in as a typed value.)
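The "pair of ObjectInspector and Java Object" idea can be sketched in plain Java. This is only a simplified model under stated assumptions: `MiniInspector`, `MiniStringInspector`, `MiniIntInspector` and `MiniStructInspector` are hypothetical names invented for illustration and are not Hive classes. The struct row is stored as a `java.util.List`, as the wiki describes for standard Java objects.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for Hive's ObjectInspector (not the real API).
// It carries type information only and never stores row data.
interface MiniInspector {
    String typeName();
}

// Interprets a value that is stored as a plain java.lang.String.
final class MiniStringInspector implements MiniInspector {
    public String typeName() { return "string"; }
    String get(Object data) { return (String) data; }
}

// Interprets a value that is stored as a java.lang.Integer.
final class MiniIntInspector implements MiniInspector {
    public String typeName() { return "int"; }
    Integer get(Object data) { return (Integer) data; }
}

// Interprets a struct whose fields are stored positionally in a java.util.List.
final class MiniStructInspector implements MiniInspector {
    private final List<String> fieldNames;
    private final List<MiniInspector> fieldInspectors;

    MiniStructInspector(List<String> fieldNames, List<MiniInspector> fieldInspectors) {
        if (fieldNames.size() != fieldInspectors.size()) {
            throw new IllegalArgumentException("field names and inspectors must match");
        }
        this.fieldNames = fieldNames;
        this.fieldInspectors = fieldInspectors;
    }

    public String typeName() {
        StringBuilder sb = new StringBuilder("struct<");
        for (int i = 0; i < fieldNames.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(fieldNames.get(i)).append(':').append(fieldInspectors.get(i).typeName());
        }
        return sb.append('>').toString();
    }

    // Structure: which inspector describes the named field.
    MiniInspector getFieldInspector(String name) {
        return fieldInspectors.get(indexOf(name));
    }

    // Access: the row is passed in, the inspector only knows where to look.
    Object getFieldData(Object row, String name) {
        return ((List<?>) row).get(indexOf(name));
    }

    private int indexOf(String name) {
        int idx = fieldNames.indexOf(name);
        if (idx < 0) throw new IllegalArgumentException("no such field: " + name);
        return idx;
    }
}

final class MiniInspectorDemo {
    public static void main(String[] args) {
        MiniStructInspector rowInspector = new MiniStructInspector(
                Arrays.asList("name", "age"),
                Arrays.<MiniInspector>asList(new MiniStringInspector(), new MiniIntInspector()));
        List<Object> row = Arrays.<Object>asList("alice", 30);

        System.out.println(rowInspector.typeName());
        MiniStringInspector nameInspector =
                (MiniStringInspector) rowInspector.getFieldInspector("name");
        System.out.println(nameInspector.get(rowInspector.getFieldData(row, "name")));
    }
}
```

The same `rowInspector` can be reused for any number of rows, because the data always arrives as a method argument and is never held by the inspector.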

NOTE: Apache Hive recommends that custom ObjectInspectors created for use with custom SerDes have a no-argument constructor in addition to their normal constructors for serialization purposes. See HIVE-5380 for more details.

JAVA API DOC

ObjectInspector helps us to look into the internal structure of a complex object. A (probably configured) ObjectInspector instance stands for a specific type and a specific way to store the data of that type in the memory. For native java Object, we can directly access the internal structure through member fields and methods. ObjectInspector is a way to delegate that functionality away from the Object, so that we have more control on the behavior of those actions. An efficient implementation of ObjectInspector should rely on factory, so that we can make sure the same ObjectInspector only has one instance. That also makes sure hashCode() and equals() methods of java.lang.Object directly works for ObjectInspector as well.

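In other words, an ObjectInspector instance represents one specific type together with one specific way of storing that type in memory. Rather than reaching into an object through its own fields and methods, Hive moves that access logic out of the object and into the ObjectInspector, which gives it more control over how the data is read. Implementations are expected to come from a factory so that each distinct ObjectInspector exists only once, which is why the default hashCode() and equals() of java.lang.Object are sufficient.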
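The factory point can be illustrated with a small caching sketch in plain Java. The names `CachedListInspector` and `MiniInspectorFactory` are hypothetical and chosen only for this example; it does not reproduce Hive's actual factory code. It only shows why handing out one cached instance per type makes identity-based `equals()` and `hashCode()` good enough.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical inspector for a list type. It does not override equals()/hashCode(),
// so it relies on the identity-based defaults from java.lang.Object.
final class CachedListInspector {
    private final String elementType;

    CachedListInspector(String elementType) {
        this.elementType = elementType;
    }

    String typeName() {
        return "array<" + elementType + ">";
    }
}

// Hypothetical factory: at most one inspector instance per element type.
final class MiniInspectorFactory {
    private static final Map<String, CachedListInspector> CACHE = new ConcurrentHashMap<>();

    private MiniInspectorFactory() { }

    static CachedListInspector getListInspector(String elementType) {
        // computeIfAbsent creates the inspector only on the first request for this type.
        return CACHE.computeIfAbsent(elementType, CachedListInspector::new);
    }
}

final class MiniFactoryDemo {
    public static void main(String[] args) {
        CachedListInspector a = MiniInspectorFactory.getListInspector("string");
        CachedListInspector b = MiniInspectorFactory.getListInspector("string");
        CachedListInspector c = MiniInspectorFactory.getListInspector("int");

        System.out.println(a == b);      // same type, same cached instance
        System.out.println(a.equals(b)); // default equals() agrees, since a and b are one object
        System.out.println(a == c);      // different type, different instance
    }
}
```

If callers constructed inspectors with `new` directly, two inspectors for the same type would be distinct objects and the default `equals()` would treat them as different, which is the situation the factory avoids.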


————————————————
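Copyright notice: This is an original article by CSDN blogger "IT小王404", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/weixin_42167895/article/details/108314139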