Hive/Impala UDF Development Workflow
1. Extend the UDF class and implement an evaluate() method containing the function logic (the UDF base class does not declare evaluate(); you define it yourself, and Hive locates it by reflection).
Required dependencies: hadoop-common and hive-exec:
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hive/hive-exec -->
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>1.1.0</version>
</dependency>
The MyFunc custom function class:
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class MyFunc extends UDF {
    // Prefix the argument with "Hello,"
    public Text evaluate(Text txt) {
        if (txt == null) {
            return null;  // pass NULL through instead of throwing NPE
        }
        return new Text("Hello," + txt.toString());
    }
}
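Before packaging, the string logic can be sanity-checked locally. Below is a minimal sketch that mirrors evaluate() using plain String instead of Text, so it runs without the Hadoop/Hive jars on the classpath; the class and method names (MyFuncCheck, greet) are hypothetical and are not part of the UDF itself:

```java
// Local stand-in for MyFunc.evaluate(), using plain String so no
// Hadoop/Hive dependency is needed to run it.
public class MyFuncCheck {
    static String greet(String txt) {
        if (txt == null) {
            return null;            // a NULL input row stays NULL
        }
        return "Hello," + txt;      // same concatenation as the UDF
    }

    public static void main(String[] args) {
        System.out.println(greet("NoFrill")); // Hello,NoFrill
    }
}
```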
2. Package the project as a jar (a thin jar is enough), copy it to the Linux host, and upload it to HDFS:
hdfs dfs -mkdir /func
hdfs dfs -put /opt/shops/myhive-1.0-SNAPSHOT.jar /func
Hive
Create the function in Hive: choose a function name and point it at the class that extends UDF.
Creating a temporary function
Add the jar with ADD JAR; it is valid only in the current Hive session, so after exiting you must ADD JAR again.
add jar hdfs://192.168.56.110:9000/func/myhive-1.0-SNAPSHOT.jar;
create temporary function mytest as "com.bdqn.hive.MyFunc";
Creating a permanent function
create function funcName as 'fully.qualified.ClassName' using jar 'hdfs_path_to_jar';
create function mytest as "com.bdqn.hive.MyFunc" using jar "hdfs:/func/myhive-1.0-SNAPSHOT.jar";
Use the function in Hive: select funcName(column) ...
hive> select * from store_details;
OK
1 NoFrill 10
2 Lablaws 23
3 FoodMart 18
4 FoodLovers 26
5 Walmart 30
hive> select mytest(store_name) from store_details;
OK
Hello,NoFrill
Hello,Lablaws
Hello,FoodMart
Hello,FoodLovers
Hello,Walmart
Impala
Syntax for creating a temporary function
create function [if not exists] [db_name.]function_name(param_type) returns result_type location 'hdfs_path_to_jar' symbol='class_name'
Example:
create function if not exists mytest (string) returns string location "/func/myhive-1.0-SNAPSHOT.jar" symbol="com.bdqn.hive.MyFunc";
Syntax for creating a persistent function
create function [if not exists] [db_name.]function_name location 'hdfs_path_to_jar' symbol='class_name'
create function if not exists default.mytest location "/func/myhive-1.0-SNAPSHOT.jar" symbol="com.bdqn.hive.MyFunc"
Listing functions
show functions;
Dropping a UDF (Impala requires the argument signature when the function was created with one)
drop function default.mytest(string);