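I can't follow the Hive source code yet, so this is a summary of the ways of creating an ObjectInspector that I have come across in different places. There may well be others.
Approach 1, from the Cloudera website: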
http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
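In initialize(), the column names and column types are read from the table properties as strings, using the static keys defined in Constants. TypeInfoUtils parses the type string into a List<TypeInfo>; the TypeInfoFactory factory method then combines the List<String> of column names and the List<TypeInfo> of column types into a single TypeInfo for the whole row; finally, TypeInfoUtils turns that TypeInfo into an ObjectInspector.
getObjectInspector() simply returns the ObjectInspector that initialize() built.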
@Override
public void initialize(Configuration conf, Properties tbl)
        throws SerDeException {
    // Get a list of the table's column names.
    String colNamesStr = tbl.getProperty(Constants.LIST_COLUMNS);
    colNames = Arrays.asList(colNamesStr.split(","));
    // Get a list of TypeInfos for the columns. This list lines up with
    // the list of column names.
    String colTypesStr = tbl.getProperty(Constants.LIST_COLUMN_TYPES);
    List<TypeInfo> colTypes = TypeInfoUtils.getTypeInfosFromTypeString(colTypesStr);
    // Combine names and types into one struct type describing a whole row,
    // then derive the row's ObjectInspector from it.
    rowTypeInfo = (StructTypeInfo) TypeInfoFactory.getStructTypeInfo(colNames, colTypes);
    rowOI = TypeInfoUtils.getStandardJavaObjectInspectorFromTypeInfo(rowTypeInfo);
}

@Override
public ObjectInspector getObjectInspector() throws SerDeException {
    return rowOI;
}
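To make the first two steps more concrete, here is a minimal sketch using only the JDK, with no Hive classes. It assumes that Constants.LIST_COLUMNS and Constants.LIST_COLUMN_TYPES resolve to the property keys "columns" and "columns.types", and that the types arrive as a colon-separated string; the class ColumnPropsSketch and its method pairColumns are hypothetical names made up for this illustration. It only handles primitive types, since a plain split would break on nested types such as map<string,int>, which is exactly the case TypeInfoUtils exists for.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Hive.
class ColumnPropsSketch {

    // Pairs each column name with its type string, preserving column order.
    // Assumption: primitive types only, separated by ':' or ','.
    static Map<String, String> pairColumns(Properties tbl) {
        List<String> names = Arrays.asList(tbl.getProperty("columns").split(","));
        List<String> types = Arrays.asList(tbl.getProperty("columns.types").split("[:,]"));
        if (names.size() != types.size()) {
            throw new IllegalArgumentException(
                "column names and types do not line up: " + names + " vs " + types);
        }
        Map<String, String> schema = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            schema.put(names.get(i), types.get(i));
        }
        return schema;
    }

    public static void main(String[] args) {
        // Stand-in for the Properties object passed to initialize().
        Properties tbl = new Properties();
        tbl.setProperty("columns", "time,userid,host,path");
        tbl.setProperty("columns.types", "bigint:int:string:string");
        System.out.println(pairColumns(tbl));
    }
}
```

The point of the sketch is only that the two lists are positional: the i-th type belongs to the i-th name, which is why the SerDe code can pass them side by side to getStructTypeInfo.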
Approaches 2 and 3, from the blogs of two experts:
http://blog.csdn.net/dajuezhao/article/details/5753791
http://www.coder4.com/archives/4031
(1) First define a List<ObjectInspector>. In a static initializer block, the ObjectInspectorFactory factory method uses reflection to build the matching ObjectInspector for each basic Java type.
private static List<String> FieldNames = new ArrayList<String>();
private static List<ObjectInspector> FieldNamesObjectInspectors = new ArrayList<ObjectInspector>();
static {
    FieldNames.add("time");
    FieldNamesObjectInspectors.add(ObjectInspectorFactory
        .getReflectionObjectInspector(Long.class, ObjectInspectorOptions.JAVA));
    FieldNames.add("userid");
    FieldNamesObjectInspectors.add(ObjectInspectorFactory
        .getReflectionObjectInspector(Integer.class, ObjectInspectorOptions.JAVA));
    FieldNames.add("host");
    FieldNamesObjectInspectors.add(ObjectInspectorFactory
        .getReflectionObjectInspector(String.class, ObjectInspectorOptions.JAVA));
    FieldNames.add("path");
    FieldNamesObjectInspectors.add(ObjectInspectorFactory
        .getReflectionObjectInspector(String.class, ObjectInspectorOptions.JAVA));
}
Here the author hard-codes the field names in the SerDe class, so the CREATE TABLE statement does not define any columns:
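create table serde_table row format serde 'hive.connect.TestDeserializer';
(2) (The code is trimmed for clarity.) In initialize(), the column names and column types are read using the static keys in Constants, and TypeInfoUtils builds the List<TypeInfo>. The code then loops over the columns, converts each TypeInfo into an ObjectInspector and adds it to a List<ObjectInspector>, and finally ObjectInspectorFactory builds the struct ObjectInspector for the row from the names and the per-column inspectors.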
@Override
public void initialize(Configuration conf, Properties tbl) throws SerDeException {
    // Read Column Names
    String columnNameProp = tbl.getProperty(Constants.LIST_COLUMNS);
    columnNames = Arrays.asList(columnNameProp.split(","));
    // Read Column Types
    String columnTypeProp = tbl.getProperty(Constants.LIST_COLUMN_TYPES);
    columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProp);
    // Create ObjectInspectors from the type information for each column
    List<ObjectInspector> columnOIs = new ArrayList<ObjectInspector>();
    ObjectInspector oi;
    for (int c = 0; c < columnNames.size(); c++) {
        oi = TypeInfoUtils.getStandardJavaObjectInspectorFromTypeInfo(columnTypes.get(c));
        columnOIs.add(oi);
    }
    objectInspector = ObjectInspectorFactory
        .getStandardStructObjectInspector(columnNames, columnOIs);
}
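As a rough mental model of what the resulting struct ObjectInspector is for, the toy class below bundles field names with per-field types and uses them to pull a named field out of a row stored as a List<Object>. This is my own simplified stand-in written with plain JDK classes: ToyStructInspector and getFieldData here are hypothetical and are not Hive's StandardStructObjectInspector, which works with per-column ObjectInspectors rather than Java Class objects.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a struct ObjectInspector; hypothetical, not Hive's class.
class ToyStructInspector {
    private final List<String> fieldNames;
    private final List<Class<?>> fieldTypes;

    ToyStructInspector(List<String> fieldNames, List<Class<?>> fieldTypes) {
        if (fieldNames.size() != fieldTypes.size()) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        this.fieldNames = fieldNames;
        this.fieldTypes = fieldTypes;
    }

    // Looks up one field of a row by name and checks it against the declared type.
    Object getFieldData(List<Object> row, String name) {
        int i = fieldNames.indexOf(name);
        if (i < 0) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        Object value = row.get(i);
        if (value != null && !fieldTypes.get(i).isInstance(value)) {
            throw new ClassCastException(
                name + " should be " + fieldTypes.get(i).getSimpleName());
        }
        return value;
    }

    public static void main(String[] args) {
        // Same four fields as the hard-coded example in approach (1).
        ToyStructInspector oi = new ToyStructInspector(
            Arrays.asList("time", "userid", "host", "path"),
            Arrays.<Class<?>>asList(Long.class, Integer.class, String.class, String.class));
        List<Object> row = Arrays.<Object>asList(1355270400L, 42, "example.com", "/index.html");
        System.out.println(oi.getFieldData(row, "host"));
    }
}
```

Seen this way, the three approaches differ only in where the names and types come from (table properties parsed into TypeInfo, or values hard-coded in the class); the end product in each case is one object that describes the layout of a row.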