Three Ways to Build an ObjectInspector in a Custom Hive SerDe

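I can't follow the Hive source code yet, so this is a summary of the ways to build an ObjectInspector that I have come across from different sources. There may well be others.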

Method 1, from the Cloudera blog:

http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/

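In initialize(), the static constants on Constants supply the property keys for the column names and column types. TypeInfoUtils parses the type string into a List<TypeInfo>, the TypeInfoFactory factory method then combines the List<String> of column names with that List<TypeInfo> into a single TypeInfo for the whole row, and finally TypeInfoUtils turns that TypeInfo into an ObjectInspector.

getObjectInspector() simply returns the ObjectInspector that initialize() built.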

 // Fields populated by initialize(); types inferred from the assignments below.
 private List<String> colNames;
 private StructTypeInfo rowTypeInfo;
 private ObjectInspector rowOI;

@Override
 public void initialize(Configuration conf, Properties tbl)
     throws SerDeException {
   // Get a list of the table's column names.
   String colNamesStr = tbl.getProperty(Constants.LIST_COLUMNS);
   colNames = Arrays.asList(colNamesStr.split(","));

   // Get a list of TypeInfos for the columns. This list lines up with
   // the list of column names.
   String colTypesStr = tbl.getProperty(Constants.LIST_COLUMN_TYPES);
   List<TypeInfo> colTypes = TypeInfoUtils.getTypeInfosFromTypeString(colTypesStr);

   // Combine names and types into one row-level TypeInfo, then derive its ObjectInspector.
   rowTypeInfo = (StructTypeInfo) TypeInfoFactory.getStructTypeInfo(colNames, colTypes);
   rowOI = TypeInfoUtils.getStandardJavaObjectInspectorFromTypeInfo(rowTypeInfo);
 }

@Override
 public ObjectInspector getObjectInspector() throws SerDeException {
   return rowOI;
 }
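
To make the inputs of initialize() more concrete, here is a minimal sketch that uses only the JDK, no Hive classes. It assumes that Constants.LIST_COLUMNS and Constants.LIST_COLUMN_TYPES resolve to the keys "columns" and "columns.types", and that Hive passes the names comma-separated and the types colon-separated; the class name, method names, and sample values are hypothetical. The naive split on the type string only works for primitive types: a type such as struct<a:int,b:string> contains both delimiters itself, which is a good reason to leave the parsing to TypeInfoUtils as the code above does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class TablePropertiesSketch {
    // Assumed literal keys behind Constants.LIST_COLUMNS / Constants.LIST_COLUMN_TYPES.
    static final String LIST_COLUMNS = "columns";
    static final String LIST_COLUMN_TYPES = "columns.types";

    // Split a delimited property value; a missing or empty value yields an empty list.
    static List<String> splitProperty(Properties tbl, String key, String delimiter) {
        String value = tbl.getProperty(key);
        if (value == null || value.isEmpty()) {
            return new ArrayList<String>();
        }
        return Arrays.asList(value.split(delimiter));
    }

    // Column names, assumed to be comma-separated.
    static List<String> columnNames(Properties tbl) {
        return splitProperty(tbl, LIST_COLUMNS, ",");
    }

    // Column types, assumed colon-separated. Valid for primitive types only.
    static List<String> primitiveColumnTypes(Properties tbl) {
        return splitProperty(tbl, LIST_COLUMN_TYPES, ":");
    }

    public static void main(String[] args) {
        // Hypothetical table properties for a four-column table.
        Properties tbl = new Properties();
        tbl.setProperty(LIST_COLUMNS, "time,userid,host,path");
        tbl.setProperty(LIST_COLUMN_TYPES, "bigint:int:string:string");

        List<String> names = columnNames(tbl);
        List<String> types = primitiveColumnTypes(tbl);
        // The two lists line up index by index.
        for (int i = 0; i < names.size(); i++) {
            System.out.println(names.get(i) + " -> " + types.get(i));
        }
    }
}
```

The point of the sketch is only that the two lists are positional: the i-th type belongs to the i-th column name, which is what lets a SerDe pair them up into a struct.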


Methods 2 and 3, from the blogs of two expert developers:

http://blog.csdn.net/dajuezhao/article/details/5753791

http://www.coder4.com/archives/4031

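(1) Define a List<ObjectInspector> first, then in a static initializer block use the ObjectInspectorFactory factory method to build, via reflection, the ObjectInspector that corresponds to each primitive type: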

   private static List<String> FieldNames = new ArrayList<String>();
   private static List<ObjectInspector> FieldNamesObjectInspectors =  new ArrayList<ObjectInspector>();
   static {
     FieldNames.add("time");
     FieldNamesObjectInspectors.add(ObjectInspectorFactory
          .getReflectionObjectInspector(Long.class,
               ObjectInspectorOptions.JAVA));
     FieldNames.add("userid");
     FieldNamesObjectInspectors.add(ObjectInspectorFactory
          .getReflectionObjectInspector(Integer.class,
               ObjectInspectorOptions.JAVA));
     FieldNames.add("host");
     FieldNamesObjectInspectors.add(ObjectInspectorFactory
          .getReflectionObjectInspector(String.class,
               ObjectInspectorOptions.JAVA));

     FieldNames.add("path");
     FieldNamesObjectInspectors.add(ObjectInspectorFactory
          .getReflectionObjectInspector(String.class,
               ObjectInspectorOptions.JAVA));

   }
Here the author hard-codes the field names in the SerDe class, so the CREATE TABLE statement declares no columns:

create table serde_table row format serde 'hive.connect.TestDeserializer';
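(2) (Code trimmed for clarity.) In initialize(), read the column names and column types through the static constants on Constants, and have TypeInfoUtils build a List<TypeInfo>. Then walk that list, adding the ObjectInspector for each entry to a List<ObjectInspector>, and finally build the row-level ObjectInspector with ObjectInspectorFactory: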

@Override
public void initialize(Configuration conf, Properties tbl) throws SerDeException {
    // Read Column Names
    String columnNameProp = tbl.getProperty(Constants.LIST_COLUMNS);
    columnNames = Arrays.asList(columnNameProp.split(","));

    // Read Column Types
    String columnTypeProp = tbl.getProperty(Constants.LIST_COLUMN_TYPES);
    columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProp);

    // Create ObjectInspectors from the type information for each column
    List<ObjectInspector> columnOIs = new ArrayList<ObjectInspector>();
    ObjectInspector oi;
    for (int c = 0; c < columnNames.size(); c++) {
        oi = TypeInfoUtils.getStandardJavaObjectInspectorFromTypeInfo(columnTypes.get(c));
        columnOIs.add(oi);
    }
    objectInspector = ObjectInspectorFactory
            .getStandardStructObjectInspector(columnNames, columnOIs);
}
