Hive ObjectInspector Explained


I ran into ObjectInspector while writing GenericUDF/UDAF/UDTF functions, so everything here is written from the perspective of custom functions.

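As we know, HQL is ultimately translated into MapReduce jobs for execution (when running on the MapReduce engine). When we wrote MR jobs by hand, we had to write a Map class and a Reduce class and declare the data types of their input and output parameters (remember, these are not Java's basic types but Hadoop's XxxWritable wrappers: int becomes IntWritable, String becomes Text). The role of ObjectInspector is therefore to tell Hive the types of the input and output data (in a custom function this is set up in the initialization method), so that Hive can turn the HQL into an MR program.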

1. Official Explanation

Wiki

Hive uses ObjectInspector to analyze the internal structure of the row object and also the structure of the individual columns.

ObjectInspector provides a uniform way to access complex objects that can be stored in multiple formats in the memory, including:

  • Instance of a Java class (Thrift or native Java)
  • A standard Java object (we use java.util.List to represent Struct and Array, and use java.util.Map to represent Map)
  • A lazily-initialized object (for example, a Struct of string fields stored in a single Java string object with starting offset for each field)

A complex object can be represented by a pair of ObjectInspector and Java Object. The ObjectInspector not only tells us the structure of the Object, but also gives us ways to access the internal fields inside the Object. (This separation of type and instance is also visible in the interface source below: an ObjectInspector records only the type, which it can return directly, and offers accessor methods that take an Object parameter. It stores no data itself; it uses its own type knowledge to interpret whatever object is passed in.)
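To make the "inspector plus data" pair concrete, here is a minimal plain-Java sketch. It does not use Hive's classes: ToyListInspector, JAVA_LIST and DELIMITED are hypothetical names of my own, standing in for a list inspector over a standard java.util.List and over a lazily parsed delimited String.

```java
import java.util.Arrays;
import java.util.List;

public class InspectorPairSketch {
    // Hypothetical stand-in for a list inspector: holds no data, only knows how to read it.
    interface ToyListInspector {
        int length(Object data);
        Object element(Object data, int index);
    }

    // Data stored as a standard java.util.List.
    static final ToyListInspector JAVA_LIST = new ToyListInspector() {
        public int length(Object data) {
            return data == null ? -1 : ((List<?>) data).size();
        }
        public Object element(Object data, int index) {
            if (data == null) return null;
            List<?> list = (List<?>) data;
            return (index < 0 || index >= list.size()) ? null : list.get(index);
        }
    };

    // Data stored "lazily" as one comma-separated String, split only when read.
    static final ToyListInspector DELIMITED = new ToyListInspector() {
        public int length(Object data) {
            return data == null ? -1 : ((String) data).split(",", -1).length;
        }
        public Object element(Object data, int index) {
            if (data == null) return null;
            String[] parts = ((String) data).split(",", -1);
            return (index < 0 || index >= parts.length) ? null : parts[index];
        }
    };

    // Caller code depends only on the (inspector, data) pair, not on the storage format.
    static Object second(ToyListInspector oi, Object data) {
        return oi.element(data, 1);
    }

    public static void main(String[] args) {
        System.out.println(second(JAVA_LIST, Arrays.asList("a", "b", "c")));
        System.out.println(second(DELIMITED, "a,b,c"));
    }
}
```

Both calls read the same logical element even though the two data objects have nothing in common as Java types, which is the point of pairing an inspector with the data.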

NOTE: Apache Hive recommends that custom ObjectInspectors created for use with custom SerDes have a no-argument constructor in addition to their normal constructors for serialization purposes. See HIVE-5380 for more details.

JAVA API DOC

ObjectInspector helps us to look into the internal structure of a complex object. A (probably configured) ObjectInspector instance stands for a specific type and a specific way to store the data of that type in the memory. For native java Object, we can directly access the internal structure through member fields and methods. ObjectInspector is a way to delegate that functionality away from the Object, so that we have more control on the behavior of those actions. An efficient implementation of ObjectInspector should rely on factory, so that we can make sure the same ObjectInspector only has one instance. That also makes sure hashCode() and equals() methods of java.lang.Object directly works for ObjectInspector as well.

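The remark that an efficient implementation should rely on a factory can be sketched in plain Java as follows. This is only an illustration of the caching idea under my own assumptions, not Hive's actual factory code; ToyListInspector and getListInspector are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class InspectorFactorySketch {
    // Hypothetical inspector for list types; the constructor is private so callers must use the factory.
    static final class ToyListInspector {
        final String elementTypeName;

        private ToyListInspector(String elementTypeName) {
            this.elementTypeName = elementTypeName;
        }

        String getTypeName() {
            return "array<" + elementTypeName + ">";
        }
    }

    // One cached instance per element type.
    private static final Map<String, ToyListInspector> CACHE = new ConcurrentHashMap<>();

    static ToyListInspector getListInspector(String elementTypeName) {
        return CACHE.computeIfAbsent(elementTypeName, ToyListInspector::new);
    }

    public static void main(String[] args) {
        ToyListInspector a = getListInspector("int");
        ToyListInspector b = getListInspector("int");
        // Same instance, so the default identity-based equals() and hashCode() are sufficient.
        System.out.println(a == b);
        System.out.println(a.getTypeName());
    }
}
```

Because the factory hands out one instance per type, two inspectors describe the same type exactly when they are the same object, so nothing has to override equals() or hashCode().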

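2. Relationship Map

The source contains many more interfaces; I list only the ones I met while writing custom functions. Knowing these interfaces and their implementation classes helps in understanding how things fit together.

2.1 The ObjectInspector interface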

public interface ObjectInspector extends Cloneable {
    String getTypeName();

    ObjectInspector.Category getCategory();

    // PRIMITIVE is further subdivided by the values of the PrimitiveCategory enum
    public static enum Category {
        PRIMITIVE,	// primitive data types
        LIST,
        MAP,
        STRUCT,
        UNION;

        private Category() {
        }
    }
}
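2.1.1 The ListObjectInspector interface

Main contents:

  • Get the ObjectInspector of the list's elements
  • Get the element at a given index of the list
  • Get the length of the list
  • Get the list itself as a java.util.List (the returned List must not be one that the inspector reuses; returning it is fine only if the List is part of the data Object)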
package org.apache.hadoop.hive.serde2.objectinspector;

import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

import java.util.List;

/**
 * ListObjectInspector.
 *
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface ListObjectInspector extends ObjectInspector {

  // ** Methods that does not need a data object **
  ObjectInspector getListElementObjectInspector();

  // ** Methods that need a data object **
  /**
   * returns null for null list, out-of-the-range index.
   */
  Object getListElement(Object data, int index);

  /**
   * returns -1 for data = null.
   */
  int getListLength(Object data);

  /**
   * returns null for data = null.
   * 
   * Note: This method should not return a List object that is reused by the
   * same ListObjectInspector, because it's possible that the same
   * ListObjectInspector will be used in multiple places in the code.
   * 
   * However it's OK if the List object is part of the Object data.
   */
  List<?> getList(Object data);

}
StandardListObjectInspector implementation class

Key point: create a List ObjectInspector like this:

ObjectInspectorFactory.getStandardListObjectInspector(ObjectInspector listElementObjectInspector)

Source:

package org.apache.hadoop.hive.serde2.objectinspector;

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/**
 * DefaultListObjectInspector works on list data that is stored as a Java List
 * or Java Array object.
 *
 * Always use the ObjectInspectorFactory to create new ObjectInspector objects,
 * instead of directly creating an instance of this class.
 */
public class StandardListObjectInspector implements SettableListObjectInspector {

  // ObjectInspector for the list's elements
  private ObjectInspector listElementObjectInspector;

  protected StandardListObjectInspector() {
    super();
  }
  /**
   * Call ObjectInspectorFactory.getStandardListObjectInspector instead.
   */
  protected StandardListObjectInspector(
      ObjectInspector listElementObjectInspector) {
    this.listElementObjectInspector = listElementObjectInspector;
  }

  // The category is always LIST
  public final Category getCategory() {
    return Category.LIST;
  }

  // without data: returns the element ObjectInspector
  public ObjectInspector getListElementObjectInspector() {
    return listElementObjectInspector;
  }

  // with data: returns the element at the given index
  @SuppressWarnings({ "rawtypes", "unchecked" })
  public Object getListElement(Object data, int index) {
    if (data == null) {
      return null;
    }
    // We support List<Object>, Set<Object> and Object[]
    // so we have to do differently.
    // If data is not a List, it can only be a Set or an array
    if (! (data instanceof List)) {
      if (! (data instanceof Set)) {
        // array case
        Object[] list = (Object[]) data;
        if (index < 0 || index >= list.length) {
          return null;
        }
        return list[index];
      } else {
        // Set case: copy into a List so that it can be indexed
        data = new ArrayList((Set<?>) data);
      }
    }
    List<?> list = (List<?>) data;
    if (index < 0 || index >= list.size()) {
      return null;
    }
    return list.get(index);
  }

  public int getListLength(Object data) {
    if (data == null) {
      return -1;
    }
    // We support List<Object>, Set<Object> and Object[]
    // so we have to do differently.
    if (! (data instanceof List)) {
      if (! (data instanceof Set)) {
        Object[] list = (Object[]) data;
        return list.length;
      } else {
        Set<?> set = (Set<?>) data;
        return set.size();
      }
    } else {
      List<?> list = (List<?>) data;
      return list.size();
    }
  }

  @SuppressWarnings({ "rawtypes", "unchecked" })
  public List<?> getList(Object data) {
    if (data == null) {
      return null;
    }
    // We support List<Object>, Set<Object> and Object[]
    // so we have to do differently.
    if (! (data instanceof List)) {
      if (! (data instanceof Set)) {
        data = java.util.Arrays.asList((Object[]) data);
      } else {
        data = new ArrayList((Set<?>) data);
      }
    }
    List<?> list = (List<?>) data;
    return list;
  }

  public String getTypeName() {
    // return array<...>
    return org.apache.hadoop.hive.serde.serdeConstants.LIST_TYPE_NAME + "<"
        + listElementObjectInspector.getTypeName() + ">";
  }

  // /
  // SettableListObjectInspector
  @Override
  public Object create(int size) {
    List<Object> a = new ArrayList<Object>(size);
    for (int i = 0; i < size; i++) {
      a.add(null);
    }
    return a;
  }

  @Override
  public Object resize(Object list, int newSize) {
    List<Object> a = (List<Object>) list;
    while (a.size() < newSize) {
      a.add(null);
    }
    while (a.size() > newSize) {
      a.remove(a.size() - 1);
    }
    return a;
  }

  @Override
  public Object set(Object list, int index, Object element) {
    List<Object> a = (List<Object>) list;
    a.set(index, element);
    return a;
  }

}
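2.1.2 The MapObjectInspector interface

Main contents:

  • Get the ObjectInspectors of the key and of the value
  • Get the value for a given key
  • Get the map itself as a java.util.Map (the returned Map must not be one that the inspector reuses; returning it is fine only if the Map is part of the data Object)
  • Get the size of the map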
package org.apache.hadoop.hive.serde2.objectinspector;

import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

import java.util.Map;

/**
 * MapObjectInspector.
 *
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface MapObjectInspector extends ObjectInspector {

  // ** Methods that does not need a data object **
  // Map Type
  ObjectInspector getMapKeyObjectInspector();

  ObjectInspector getMapValueObjectInspector();

  // ** Methods that need a data object **
  // In this function, key has to be of the same structure as the Map expects.
  // Most cases key will be primitive type, so it's OK.
  // In rare cases that key is not primitive, the user is responsible for
  // defining
  // the hashCode() and equals() methods of the key class.
  Object getMapValueElement(Object data, Object key);

  /**
   * returns null for data = null.
   * 
   * Note: This method should not return a Map object that is reused by the same
   * MapObjectInspector, because it's possible that the same MapObjectInspector
   * will be used in multiple places in the code.
   * 
   * However it's OK if the Map object is part of the Object data.
   */
  Map<?, ?> getMap(Object data);

  /**
   * returns -1 for NULL map.
   */
  int getMapSize(Object data);
}
StandardMapObjectInspector implementation class

Key point: always use ObjectInspectorFactory to create new ObjectInspector objects instead of instantiating this class directly.

ObjectInspectorFactory.getStandardMapObjectInspector

Source:

package org.apache.hadoop.hive.serde2.objectinspector;

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * StandardMapObjectInspector works on map data that is stored as a Java Map
 * object. Note: the key object of the map must support equals and hashCode by
 * itself.
 *
 * We also plan to have a GeneralMapObjectInspector which can work on map with
 * key objects that does not support equals and hashCode. That will require us
 * to store InspectableObject as the key, which will have overridden equals and
 * hashCode methods.
 *
 * Always use the ObjectInspectorFactory to create new ObjectInspector objects,
 * instead of directly creating an instance of this class.
 */
public class StandardMapObjectInspector implements SettableMapObjectInspector {

  // ObjectInspectors for the key and the value
  private ObjectInspector mapKeyObjectInspector;
  private ObjectInspector mapValueObjectInspector;

  protected StandardMapObjectInspector() {
    super();
  }
  /**
   * Call ObjectInspectorFactory.getStandardMapObjectInspector instead.
   */
  protected StandardMapObjectInspector(ObjectInspector mapKeyObjectInspector,
      ObjectInspector mapValueObjectInspector) {
    this.mapKeyObjectInspector = mapKeyObjectInspector;
    this.mapValueObjectInspector = mapValueObjectInspector;
  }

  // without data: returns the key/value ObjectInspectors
  public ObjectInspector getMapKeyObjectInspector() {
    return mapKeyObjectInspector;
  }

  public ObjectInspector getMapValueObjectInspector() {
    return mapValueObjectInspector;
  }

  // with data: returns the value for the given key
  // TODO: Now we assume the key Object supports hashCode and equals functions.
  public Object getMapValueElement(Object data, Object key) {
    if (data == null || key == null) {
      return null;
    }
    Map<?, ?> map = (Map<?, ?>) data;
    return map.get(key);
  }

  public int getMapSize(Object data) {
    if (data == null) {
      return -1;
    }
    Map<?, ?> map = (Map<?, ?>) data;
    return map.size();
  }

  // Casts data to a Map and returns it
  public Map<?, ?> getMap(Object data) {
    if (data == null) {
      return null;
    }
    Map<?, ?> map = (Map<?, ?>) data;
    return map;
  }

  public final Category getCategory() {
    return Category.MAP;
  }

  public String getTypeName() {
    return org.apache.hadoop.hive.serde.serdeConstants.MAP_TYPE_NAME + "<"
        + mapKeyObjectInspector.getTypeName() + ","
        + mapValueObjectInspector.getTypeName() + ">";
  }

  // /
  // SettableMapObjectInspector
  @Override
  public Object create() {
    Map<Object, Object> m = new LinkedHashMap<Object, Object>();
    return m;
  }

  @Override
  public Object clear(Object map) {
    Map<Object, Object> m = (Map<Object, Object>) map;
    m.clear();
    return m;
  }

  @Override
  public Object put(Object map, Object key, Object value) {
    Map<Object, Object> m = (Map<Object, Object>) map;
    m.put(key, value);
    return m;
  }

  @Override
  public Object remove(Object map, Object key) {
    Map<Object, Object> m = (Map<Object, Object>) map;
    m.remove(key);
    return m;
  }

}
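2.1.3 The PrimitiveObjectInspector interface

Main contents:

  • The primitive categories
  • Get the category and the type info
  • For a given object, get the corresponding Writable class (Class) and instance (Object); Writable types are what Hadoop MR accepts as input and output parameters
  • The same, but for the Java primitive class and instance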
package org.apache.hadoop.hive.serde2.objectinspector;

import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;
import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;


/**
 * PrimitiveObjectInspector.
 *
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface PrimitiveObjectInspector extends ObjectInspector {

  /**
   * The primitive types Hive supports (this enum is important!)
   */
  enum PrimitiveCategory {
    VOID, BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING,
    DATE, TIMESTAMP, TIMESTAMPLOCALTZ, BINARY, DECIMAL, VARCHAR, CHAR,
    INTERVAL_YEAR_MONTH, INTERVAL_DAY_TIME, UNKNOWN
  }

  PrimitiveTypeInfo getTypeInfo();

  /**
   * Get the primitive category of the PrimitiveObjectInspector.
   */
  PrimitiveCategory getPrimitiveCategory();

  /**
   * Get the primitive Writable class (in Hadoop MR, input and output types must all be XxxWritable).
   */
  Class<?> getPrimitiveWritableClass();

  /**
   * Return o as a primitive Writable object: if o is already Writable it is returned as is, otherwise it is converted first.
   */
  Object getPrimitiveWritableObject(Object o);

  /**
   * Get the Java primitive class.
   */
  Class<?> getJavaPrimitiveClass();

  /**
   * Get the Java primitive object.
   */
  Object getPrimitiveJavaObject(Object o);

  /**
   * Get a copy of the Object in the same class, so the return value can be
   * stored independently of the parameter.
   *
   * If the Object is a Primitive Java Object, we just return the parameter
   * since Primitive Java Object is immutable.
   */
  Object copyObject(Object o);

  /**
   * Whether the ObjectInspector prefers to return a Primitive Writable Object
   * instead of a Primitive Java Object. This can be useful for determining the
   * most efficient way to getting data out of the Object.
   */
  boolean preferWritable();

  /**
   * The precision of the underlying data.
   */
  int precision();

  /**
   * The scale of the underlying data.
   */
  int scale();

}

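Finer-grained primitive type interfaces

Primitive types are a bit fiddly. The diagram below shows the relationships for one example. In one sentence: each primitive type comes in a Java-object flavor and a Writable flavor (that is, PrimitiveObjectInspector is further subdivided into XxxObjectInspector interfaces).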

hive-objectInspector-primitive.png
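The practical difference between the two flavors is mutability, which is why the interface has a copyObject method. Below is a plain-Java sketch of that idea. MutableInt is a hypothetical stand-in for a Writable such as IntWritable, and copyObject and toJava only mimic the roles of the interface methods of similar name; none of this is Hive's implementation.

```java
public class WritableReuseSketch {
    // Hypothetical stand-in for a Writable: a mutable box that a reader may reuse for every row.
    static final class MutableInt {
        int value;

        MutableInt(int value) {
            this.value = value;
        }
    }

    // Mimics the idea of copyObject(): a mutable box must be cloned, an immutable Integer need not be.
    static Object copyObject(Object o) {
        if (o instanceof MutableInt) {
            return new MutableInt(((MutableInt) o).value);
        }
        return o; // Integer is immutable, so sharing it is safe
    }

    // Mimics the idea of getPrimitiveJavaObject(): unwrap either flavor to an Integer.
    static Integer toJava(Object o) {
        if (o == null) {
            return null;
        }
        return (o instanceof MutableInt) ? Integer.valueOf(((MutableInt) o).value) : (Integer) o;
    }

    public static void main(String[] args) {
        MutableInt reused = new MutableInt(1);
        Object kept = reused;               // stores only a reference to the shared box
        Object copied = copyObject(reused); // stores an independent copy
        reused.value = 2;                   // the reader moves on to the next row
        System.out.println(toJava(kept));   // sees the new value
        System.out.println(toJava(copied)); // still holds the old value
    }
}
```

If a function needs to keep a value across rows (for example in a UDAF buffer), keeping the reference alone is not enough when the value is a reused mutable object; it has to be copied first.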

TypeInfo abstract class (only loosely related to the material above; a quick look is enough)
package org.apache.hadoop.hive.serde2.typeinfo;

import java.io.Serializable;

import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;

/**
 * Stores information about a type. Always use the TypeInfoFactory to create new
 * TypeInfo objects.
 *
 * We support 8 categories of types:
 * 1. Primitive objects (String, Number, etc)
 * 2. List objects (a list of objects of a single type)
 * 3. Map objects (a map from objects of one type to objects of another type)
 * 4. Struct objects (a list of fields with names and their own types)
 * 5. Union objects
 * 6. Decimal objects
 * 7. Char objects
 * 8. Varchar objects
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class TypeInfo implements Serializable {

  private static final long serialVersionUID = 1L;

  protected TypeInfo() {
  }

  /**
   * The Category of this TypeInfo. Possible values are Primitive, List, Map,
   * Struct and Union, which corresponds to the 5 sub-classes of TypeInfo.
   */
  public abstract Category getCategory();

  /**
   * A String representation of the TypeInfo.
   */
  public abstract String getTypeName();

  /**
   * String representing the qualified type name.
   * Qualified types should override this method.
   * @return
   */
  public String getQualifiedName() {
    return getTypeName();
  }

  @Override
  public String toString() {
    return getTypeName();
  }

  @Override
  public abstract boolean equals(Object o);

  @Override
  public abstract int hashCode();

  public boolean accept(TypeInfo other) {
    return this.equals(other);
  }

}

2.1.4 The StructObjectInspector abstract class
StructField
public interface StructField {
  /**
   * Get the name of the field. The name should be always in lower-case.
   */
  String getFieldName();

  /**
   * Get the ObjectInspector for the field.
   */
  ObjectInspector getFieldObjectInspector();

  /**
   * Get the fieldID for the field.
   */
  int getFieldID();

  /**
   * Get the comment for the field. May be null if no comment provided.
   */
  String getFieldComment();
}

StructObjectInspector
package org.apache.hadoop.hive.serde2.objectinspector;

import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

import java.util.List;

/**
 * StructObjectInspector.
 *
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class StructObjectInspector implements ObjectInspector {

  // ** Methods that does not need a data object **
  // (a StructField carries only the field's name and type)
  /**
   * Returns all the fields.
   */
  public abstract List<? extends StructField> getAllStructFieldRefs();

  /**
   * Look up a field by name.
   */
  public abstract StructField getStructFieldRef(String fieldName);

  // ** Methods that need a data object **
  /**
   * returns null for data = null.
   * Given a data object and a field reference, returns that field's value.
   */
  public abstract Object getStructFieldData(Object data, StructField fieldRef);

  /**
   * returns null for data = null.
   */
  public abstract List<Object> getStructFieldsDataAsList(Object data);

  public boolean isSettable() {
    return false;
  }

  @Override
  public String toString() {
    StringBuilder sb = new StringBuilder();
    List<? extends StructField> fields = getAllStructFieldRefs();
    sb.append(getClass().getName());
    sb.append("<");
    for (int i = 0; i < fields.size(); i++) {
      if (i > 0) {
        sb.append(",");
      }
      sb.append(fields.get(i).getFieldObjectInspector().toString());
    }
    sb.append(">");
    return sb.toString();
  }
}
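The split between methods that need no data (field lookup) and methods that need data (field access) suggests a usage pattern: resolve the field reference once, then apply it to every row. Here is a plain-Java sketch of that pattern, assuming rows are stored as List<Object>. ToyField and ToyStructInspector are hypothetical stand-ins of my own, not Hive classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StructInspectorSketch {
    // Hypothetical stand-in for StructField: a lower-case name plus a position.
    static final class ToyField {
        final String name;
        final int id;

        ToyField(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    // Hypothetical stand-in for a struct inspector over List<Object> rows.
    static final class ToyStructInspector {
        private final List<ToyField> fields = new ArrayList<>();

        ToyStructInspector(List<String> fieldNames) {
            for (int i = 0; i < fieldNames.size(); i++) {
                fields.add(new ToyField(fieldNames.get(i).toLowerCase(), i));
            }
        }

        // Needs no data object: looks a field up by name.
        ToyField getStructFieldRef(String fieldName) {
            for (ToyField f : fields) {
                if (f.name.equalsIgnoreCase(fieldName)) {
                    return f;
                }
            }
            return null;
        }

        // Needs a data object: returns null for data = null.
        Object getStructFieldData(Object data, ToyField fieldRef) {
            if (data == null) {
                return null;
            }
            return ((List<?>) data).get(fieldRef.id);
        }
    }

    public static void main(String[] args) {
        ToyStructInspector oi = new ToyStructInspector(Arrays.asList("id", "name"));
        ToyField nameRef = oi.getStructFieldRef("name"); // resolved once, e.g. during initialization
        List<Object> row1 = Arrays.<Object>asList(1, "alice");
        List<Object> row2 = Arrays.<Object>asList(2, "bob");
        System.out.println(oi.getStructFieldData(row1, nameRef));
        System.out.println(oi.getStructFieldData(row2, nameRef));
    }
}
```

The name lookup happens once, while the per-row work is a single positional read through the saved field reference.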

2.2 ConstantObjectInspector

/**
 * ConstantObjectInspector.  This interface should be implemented by
 * ObjectInspectors which represent constant values and can return them without
 * an evaluation.
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface ConstantObjectInspector extends ObjectInspector {
  Object getWritableConstantValue();
}

I came across this while writing a custom UDTF. The book puts it as follows:

SELECT forx(1, 5) AS i FROM collecttest;

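Because all of this function's arguments are constants (my understanding is that values written literally in the SQL are constants, since I found a class named WritableConstantStringObjectInspector, whereas values produced by the query itself are not constants), the value of each variable can already be determined in the initialize method. For this function, non-constant data can never reach the evaluate method (that is, the values are already available during initialization).

It then uses the following: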

IntWritable start;
start = ((WritableConstantIntObjectInspector) arg[0]).getWritableConstantValue();
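The idea that a constant's value travels with its inspector, and is therefore readable before any row arrives, can be sketched in plain Java. ToyInspector, ToyConstantInspector and readRange are hypothetical names for illustration only; the real code would use the Hive classes shown above.

```java
public class ConstantInspectorSketch {
    // Hypothetical stand-in for ObjectInspector: type information only.
    interface ToyInspector {
        String getTypeName();
    }

    // Hypothetical stand-in for ConstantObjectInspector: the value travels with the inspector.
    interface ToyConstantInspector extends ToyInspector {
        Object getConstantValue();
    }

    // Inspector for an ordinary column: the value is only known per row.
    static ToyInspector column(final String typeName) {
        return () -> typeName;
    }

    // Inspector for an int literal written in the query.
    static ToyInspector constantInt(final int value) {
        return new ToyConstantInspector() {
            public String getTypeName() {
                return "int";
            }

            public Object getConstantValue() {
                return value;
            }
        };
    }

    // Mimics what an initialize method can do: read literal arguments before any row is processed.
    static int[] readRange(ToyInspector[] args) {
        int[] range = new int[args.length];
        for (int i = 0; i < args.length; i++) {
            if (!(args[i] instanceof ToyConstantInspector)) {
                throw new IllegalArgumentException("argument " + i + " must be a constant");
            }
            range[i] = (Integer) ((ToyConstantInspector) args[i]).getConstantValue();
        }
        return range;
    }

    public static void main(String[] args) {
        int[] r = readRange(new ToyInspector[] { constantInt(1), constantInt(5) });
        System.out.println(r[0] + ".." + r[1]);
    }
}
```

Checking the inspector's type before casting, as readRange does, gives a clear error when a caller passes a column instead of a literal, rather than a ClassCastException.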

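3. Factory and Utils

While digging around I kept seeing a few classes. I went through their methods and summarized the following:

3.1 ObjectInspectorFactory

ObjectInspectorFactory is the primary way to create new ObjectInspector instances; it is generally used for the collection types.

List (the parameter is the ObjectInspector of the elements)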

public static StandardListObjectInspector getStandardListObjectInspector(
    ObjectInspector listElementObjectInspector)

Map (the parameters are the ObjectInspectors of the key and of the value)

public static StandardMapObjectInspector getStandardMapObjectInspector(
    ObjectInspector mapKeyObjectInspector,
    ObjectInspector mapValueObjectInspector)

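Struct (the parameters are the list of field names and the list of corresponding ObjectInspectors. There is also a three-argument overload; as far as I can tell its third parameter is a list of field comments, while the variant that takes a List<?> value holding the fields' values is the separate getStandardConstantStructObjectInspector method.) [I saw this used for the return type of a UDTF]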

public static StandardStructObjectInspector getStandardStructObjectInspector(
    List<String> structFieldNames,
    List<ObjectInspector> structFieldObjectInspectors) 

3.2 ObjectInspectorUtils

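Utility class: generally used to convert an existing inspector or object into its standard form

getStandardObjectInspector

Given an ObjectInspector, returns the corresponding standard ObjectInspector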

/**
   * Get the corresponding standard ObjectInspector for an ObjectInspector.
   *
   * The returned ObjectInspector can be used to inspect the standard object.
   */
public static ObjectInspector getStandardObjectInspector(ObjectInspector oi) {
    return getStandardObjectInspector(oi, ObjectInspectorCopyOption.DEFAULT);
}

copyToStandardObject

That is: data object + its ObjectInspector = a standard object

/**
   * Returns a deep copy of the Object o that can be scanned by a
   * StandardObjectInspector returned by getStandardObjectInspector(oi).
   */
public static Object copyToStandardObject(Object o, ObjectInspector oi) {
    return copyToStandardObject(o, oi, ObjectInspectorCopyOption.DEFAULT);
}

3.3 PrimitiveObjectInspectorFactory

PrimitiveObjectInspectorFactory is the primary way to create new PrimitiveObjectInspector instances; it is generally used for the primitive types.

getPrimitiveJavaObjectInspector

/**
   * Returns the PrimitiveJavaObjectInspector for the PrimitiveCategory.
   *
   * @param primitiveCategory input to be looked up.
   */
public static AbstractPrimitiveJavaObjectInspector getPrimitiveJavaObjectInspector(
    PrimitiveCategory primitiveCategory);

The corresponding method for Writable types:

getPrimitiveWritableObjectInspector

/**
   * Returns the PrimitiveWritableObjectInspector for the PrimitiveCategory.
   *
   * @param primitiveCategory primitive category input to be looked up.
   */
  public static AbstractPrimitiveWritableObjectInspector getPrimitiveWritableObjectInspector(
      PrimitiveCategory primitiveCategory);
