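Writing the UDF
- In older Hive versions, extend org.apache.hadoop.hive.ql.exec.UDF;
- In newer Hive versions, extend org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
- This article uses Hive 2.1.1 and implements the function with GenericUDF;
- Once the implementation is done, package it as a jar and upload it to the machine running Hive for later use.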
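The article does not show the body of the AESEncrypt class, so the following is only a minimal sketch of the kind of helper that a GenericUDF's evaluate method might delegate to. Everything in it is an assumption made for illustration: the class name AesHexSketch, the method names, the AES/ECB/PKCS5Padding transformation, and the way the key is derived from the second argument (the first 16 bytes of its SHA-256 digest). It uses only the JDK, and it is not expected to reproduce the exact ciphertext shown in the test output.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; the real AESEncrypt UDF may work differently.
public class AesHexSketch {

    // Assumption: the 128-bit AES key is the first 16 bytes of SHA-256(passphrase).
    private static SecretKeySpec deriveKey(String passphrase) throws GeneralSecurityException {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(passphrase.getBytes(StandardCharsets.UTF_8));
        return new SecretKeySpec(Arrays.copyOf(digest, 16), "AES");
    }

    // Encrypts plainText and returns the ciphertext as uppercase hex.
    // Returns null for null input, mirroring how SQL functions usually treat NULL.
    public static String encryptToHex(String plainText, String passphrase) {
        if (plainText == null || passphrase == null) {
            return null;
        }
        try {
            // ECB is used only to keep the sketch short; it is a poor choice for real data.
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(passphrase));
            byte[] encrypted = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(encrypted.length * 2);
            for (byte b : encrypted) {
                hex.append(String.format("%02X", b & 0xFF));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES encryption failed", e);
        }
    }

    // Reverses encryptToHex; handy for checking the round trip.
    public static String decryptFromHex(String hex, String passphrase) {
        if (hex == null || passphrase == null) {
            return null;
        }
        try {
            byte[] data = new byte[hex.length() / 2];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
            }
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(passphrase));
            return new String(cipher.doFinal(data), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES decryption failed", e);
        }
    }

    public static void main(String[] args) {
        String cipherText = encryptToHex("xiaoming", "123");
        System.out.println(cipherText);                        // 32 uppercase hex characters
        System.out.println(decryptFromHex(cipherText, "123")); // prints "xiaoming"
    }
}
```

In an actual GenericUDF, initialize would typically check that two string arguments were passed and return a string ObjectInspector, and evaluate would read both arguments and call a helper such as encryptToHex. Keeping the encryption in a plain class like this makes it testable without a Hive installation.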
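Testing
Temporary function
A temporary function lives only as long as the session. Once the session is closed the function is gone, and it has to be registered again before it can be used in a new session.
The steps are as follows: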
zhds@apache250:/opt/hive-2.1.1$ bin/beeline
Beeline version 2.1.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/default
#...
# Register the UDF as a temporary function
0: jdbc:hive2://localhost:10000> add jar /home/zhds/hive-udf-1.0-SNAPSHOT.jar;
No rows affected (0.091 seconds)
0: jdbc:hive2://localhost:10000> CREATE TEMPORARY FUNCTION aesencrypt AS 'indi.yolo.sample.hive.udf.generic.AESEncrypt';
No rows affected (0.163 seconds)
0: jdbc:hive2://localhost:10000> select id,aesencrypt(name,'123') as name,hobby,address from test001;
+-----+-----------------------------------+--------+---------------+--+
| id  |               name                | hobby  |    address    |
+-----+-----------------------------------+--------+---------------+--+
| 1   | D9DCF5C07F6FC7EF72F8BC45060719CE  | book   | beijing       |
| 2   | 6B514FACA01B803F95F49392C5373FFA  | tv     | nanjing       |
| 3   | 91047E888B5DE4F65A3701D33AA7FE44  | music  | heilongjiang  |
+-----+-----------------------------------+--------+---------------+--+
3 rows selected (0.307 seconds)
0: jdbc:hive2://localhost:10000> select * from test001;
+-------------+---------------+----------------+------------------+--+
| test001.id  | test001.name  | test001.hobby  | test001.address  |
+-------------+---------------+----------------+------------------+--+
| 1           | xiaoming      | book           | beijing          |
| 2           | lilei         | tv             | nanjing          |
| 3           | lihua         | music          | heilongjiang     |
+-------------+---------------+----------------+------------------+--+
3 rows selected (0.308 seconds)
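Permanent function
Once registered, a permanent function stays available everywhere, for example from the hive cli or from remote connections to hiveserver2.
The only difference from the temporary case is how the function is registered; everything else is the same as above.
- Upload the jar to HDFS: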
hdfs dfs -put /home/zhds/hive-udf-1.0-SNAPSHOT.jar /input
- Register the function:
CREATE FUNCTION aesencrypt AS 'indi.yolo.sample.hive.udf.generic.AESEncrypt' USING JAR 'hdfs://hostname:9000/input/hive-udf-1.0-SNAPSHOT.jar';
The function remains usable after closing and reopening the hive cli, and also from other machines that connect over JDBC.