Hive UDF Development in Java

hxtog

Article link: https://blog.csdn.net/hxtog/article/details/107722584

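This article describes how to develop user-defined functions (UDFs) in Hive. It covers: 1) extending GenericUDF and implementing the initialize, evaluate, and getDisplayString methods; 2) what each method does: initialize validates the arguments and declares the output type, evaluate turns each input row into an output value, and getDisplayString returns the text Hive shows for the function call, for example in EXPLAIN output or error messages; 3) documenting the function with the Description annotation; 4) a code example that walks through UDF development; 5) wrapping GenericUDF in ExtendsGenericUDF so that part of this work is done automatically.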

1. Hive UDF Programming

Extend the GenericUDF class
Implement three methods (initialize, evaluate, and getDisplayString)
Add the Description annotation

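2. The Three Methods to Implement

initialize

Purpose: validates the types of the input arguments and declares the type of the result.

Invocation: called once per UDF instance within a task on a node, before any call to evaluate.

evaluate

Purpose: processes the input values and produces the output value.

Invocation: called many times within a task on a node, once for every row of data.

getDisplayString

Purpose: returns the string that identifies the function call when Hive has to display it.

Invocation: not part of per-row processing. Hive calls it when it needs to print the expression, for example when building an EXPLAIN plan or an error message, so it is not limited to the failure case.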
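The call pattern described above can be sketched without any Hive dependency. The following is only an illustration under stated assumptions: SketchUdf, UpperSketchUdf, and LifecycleDemo are hypothetical stand-ins invented here, not Hive classes, and plain type names replace Hive's ObjectInspector arguments. It shows initialize running once before the rows are processed and evaluate running once per row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the three-method contract; this is NOT Hive's GenericUDF.
abstract class SketchUdf {
    // Validates argument types and returns the name of the output type.
    abstract String initialize(String[] argumentTypes);
    // Produces one output value for one input row.
    abstract Object evaluate(Object[] arguments);
    // Returns the text used to display the function call.
    abstract String getDisplayString(String[] children);
}

// Example implementation: upper-cases its single string argument.
class UpperSketchUdf extends SketchUdf {
    int initializeCalls = 0;
    int evaluateCalls = 0;

    @Override
    String initialize(String[] argumentTypes) {
        initializeCalls++;
        if (argumentTypes.length != 1 || !"string".equals(argumentTypes[0])) {
            throw new IllegalArgumentException("upper_sketch expects exactly one string argument");
        }
        return "string";
    }

    @Override
    Object evaluate(Object[] arguments) {
        evaluateCalls++;
        Object value = arguments[0];
        // NULL input yields NULL output, as SQL functions commonly do.
        return value == null ? null : value.toString().toUpperCase(Locale.ROOT);
    }

    @Override
    String getDisplayString(String[] children) {
        return "upper_sketch(" + String.join(", ", children) + ")";
    }
}

class LifecycleDemo {
    // Mimics one task: initialize once, then evaluate once per row.
    static List<Object> runTask(SketchUdf udf, List<String> rows) {
        udf.initialize(new String[] {"string"});
        List<Object> results = new ArrayList<>();
        for (String row : rows) {
            results.add(udf.evaluate(new Object[] {row}));
        }
        return results;
    }

    public static void main(String[] args) {
        UpperSketchUdf udf = new UpperSketchUdf();
        System.out.println(runTask(udf, Arrays.asList("a,b", null, "hive")));
        System.out.println("initialize calls: " + udf.initializeCalls);
        System.out.println("evaluate calls: " + udf.evaluateCalls);
        System.out.println(udf.getDisplayString(new String[] {"col1"}));
    }
}
```

In a real UDF the same three responsibilities apply, but the arguments arrive as ObjectInspector and DeferredObject values rather than the simplified types used in this sketch.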

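3. The Description Annotation

name: the name of the function
value: a short usage description; the placeholder _FUNC_ stands for the function name
extended: a longer description or usage examples, shown by DESCRIBE FUNCTION EXTENDED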

@Description(
        name = "udf_name_example",
        value = "_FUNC_ Description Example",
        extended = "#> SELECT _FUNC_(...) AS result;"
)
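To make the role of the three attributes and the _FUNC_ placeholder concrete, here is a minimal sketch that reads such an annotation through reflection. It rests on assumptions: SketchDescription, AnnotatedSketchUdf, and DescriptionDemo are hypothetical names made up for this illustration, and the substitution logic only imitates the way a describe command could replace _FUNC_ with the function name; it is not Hive's own code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in with the same three attributes as the annotation above.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SketchDescription {
    String name();
    String value();
    String extended() default "";
}

@SketchDescription(
        name = "udf_name_example",
        value = "_FUNC_ Description Example",
        extended = "#> SELECT _FUNC_(...) AS result;"
)
class AnnotatedSketchUdf { }

class DescriptionDemo {
    // Builds the text a describe command might print for the given function name.
    static String describe(Class<?> udfClass, String functionName, boolean extended) {
        SketchDescription description = udfClass.getAnnotation(SketchDescription.class);
        if (description == null) {
            return "No description available for " + functionName;
        }
        String text = description.value();
        if (extended) {
            text = text + "\n" + description.extended();
        }
        // Every _FUNC_ placeholder is replaced by the function name.
        return text.replace("_FUNC_", functionName);
    }

    public static void main(String[] args) {
        System.out.println(describe(AnnotatedSketchUdf.class, "udf_name_example", false));
        System.out.println(describe(AnnotatedSketchUdf.class, "udf_name_example", true));
    }
}
```

Writing _FUNC_ instead of a hard-coded name keeps the help text correct even if the function is registered under a different name.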

4. Code Example

// package ..;
// import ExtendsGenericUDF;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.MapredContext;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.jetbrains.annotations.NotNull;

import java.util.ArrayList;
import java.util.Arrays;

/*
 * Run: desc function extended udf_split_example;
 * to view the description (the Description annotation) of the UDF
 * */
@Description(
        name = "udf_split_example",
        value = "_FUNC_(targetString, breakChar): split targetString by breakChar.\n"
                + "Require: String targetString And breakChar.\n"
                + "Return