1. Referenced packages:
(1) Add all JAR files under $HIVE_HOME/lib to the build path;
(2) Add $HADOOP_HOME/hadoop-core-xx.xx.jar.
2. Source code
(1) Example 1
package org.robby.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Converts a string to lower case; Hive looks up the method named "evaluate".
public class RobLower extends UDF {
    public Text evaluate(final Text s) {
        if (s == null) {
            return null; // NULL in, NULL out
        }
        return new Text(s.toString().toLowerCase());
    }
}
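One detail of Example 1 worth a second look: String.toLowerCase() without arguments uses the JVM's default locale, so the result can differ between machines (under a Turkish locale, for instance, "I" becomes the dotless "ı"). The following is a minimal sketch of a locale-independent variant, written in plain Java so it runs without the Hive JARs. The class name LowerLogic is hypothetical and not part of the original code; whether this matters depends on the locale of your cluster nodes.

```java
import java.util.Locale;

// Hypothetical helper (not in the original post): the lower-casing logic of
// RobLower, isolated from Hive and pinned to Locale.ROOT.
final class LowerLogic {

    // Returns null for null input, otherwise the locale-independent lower case.
    static String lower(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(lower("ROBBY"));
        // For comparison: the same call under a Turkish locale.
        System.out.println("TITLE".toLowerCase(Locale.forLanguageTag("tr")));
    }
}
```

Inside the UDF this would amount to calling s.toString().toLowerCase(Locale.ROOT) instead of the no-argument form.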
(2) Example 2
package org.robby.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text; // Hadoop's Text, not org.w3c.dom.Text

// Returns true when the argument, parsed as an integer, is greater than 3.
public class RobBigger3 extends UDF {
    public Boolean evaluate(Text s) {
        if (s == null) {
            return null; // NULL in, NULL out
        }
        int t = Integer.parseInt(s.toString());
        return t > 3;
    }
}
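Integer.parseInt throws NumberFormatException on non-numeric input, which in Example 2 would surface as a failed query. As a rough sketch of one way to tolerate such values, the comparison can be wrapped so that bad input yields null instead. The class name Bigger3Logic is hypothetical, and returning null for unparseable values is an assumed design choice rather than something the original code does. The sketch is plain Java so it runs without the Hive JARs.

```java
// Hypothetical helper (not in the original post): the comparison logic of
// RobBigger3, isolated from Hive and made tolerant of non-numeric input.
final class Bigger3Logic {

    // Returns null for null or non-numeric input; otherwise whether value > 3.
    static Boolean isBiggerThan3(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Integer.parseInt(s.trim()) > 3;
        } catch (NumberFormatException e) {
            return null; // assumed policy: treat unparseable values as NULL
        }
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1", "4", "abc"}) {
            System.out.println(v + " -> " + isBiggerThan3(v));
        }
    }
}
```

In the UDF itself, evaluate(Text s) could delegate to such a helper with s.toString().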
3. Package the classes into JAR files:
/home/conkeyn/jar/rob_lower.jar
/home/conkeyn/jar/rob_bigger3.jar
4. In the Hive command line, add the JARs and create the custom functions:
hive> add jar /home/conkeyn/jar/rob_lower.jar;
hive> create temporary function my_lower as 'org.robby.hive.udf.RobLower';
hive> add jar /home/conkeyn/jar/rob_bigger3.jar;
hive> create temporary function my_bigger3 as 'org.robby.hive.udf.RobBigger3';
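5. Prepare the test data (one record per line, fields separated by a tab) and save it as /home/conkeyn/jar/tab_test1.txt:
BOB	1
AMY	2
ROBBY	3
STEVEN	4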
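Because the table is declared with tab-delimited fields, the data file must contain real tab characters, which editors sometimes replace with spaces. The sketch below generates the four test records programmatically to avoid that. The class name TabTestData and the default output file name in the current directory are assumptions for illustration; pass the target path as the first argument to write elsewhere.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not in the original post): writes the test records
// as tab-separated lines.
final class TabTestData {

    // Builds one "name<TAB>number" line per test record.
    static List<String> rows() {
        String[][] data = {{"BOB", "1"}, {"AMY", "2"}, {"ROBBY", "3"}, {"STEVEN", "4"}};
        List<String> lines = new ArrayList<>();
        for (String[] record : data) {
            lines.add(record[0] + "\t" + record[1]);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: tab_test1.txt in the current directory.
        Path out = Paths.get(args.length > 0 ? args[0] : "tab_test1.txt");
        Files.write(out, rows(), StandardCharsets.UTF_8);
    }
}
```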
6. Create the table and load the data:
drop table if exists tab_test;
create table tab_test(a string,b int) row format delimited fields terminated by '\t';
load data local inpath '/home/conkeyn/jar/tab_test1.txt' overwrite into table tab_test;
7. Test the custom functions:
select * from tab_test where my_bigger3(b);
select my_lower(a) from tab_test;
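8. Note: if creating a function fails, or a query reports that a field type does not match, exit the Hive environment and enter it again. When re-entering, start Hive with DEBUG logging sent to the console: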
[conkeyn@hadoop bin]$ hive -hiveconf hive.root.logger=DEBUG,console