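User-defined functions (UDFs) are a powerful feature that lets users extend HiveQL.

To list the available functions and look at the documentation of one of them:

    show functions;
    describe function concat;
    describe function extended concat;

Hive has three kinds of functions:

Standard function (UDF): takes one or more columns from a single row and returns a single value.
Aggregate function (UDAF): takes zero or more columns from zero or more rows and returns a single value.
Table-generating function (UDTF): takes zero or more inputs and produces multiple columns or multiple rows of output.


A UDF that computes the zodiac sign from a date

    add jar /home/zkpk/Desktop/xingzuo.jar;
    create temporary function zodiac as 'day1008.UDFZodiacSign';
    select zodiac(bday) from littlebigdata;

Code:

    package day1008;

    import java.text.SimpleDateFormat;
    import java.util.Date;

    import org.apache.hadoop.hive.ql.exec.UDF;

    public class UDFZodiacSign extends UDF {

        private SimpleDateFormat df;

        public UDFZodiacSign() {
            df = new SimpleDateFormat("MM-dd-yyyy");
        }

        public String evaluate(Date bday) {
            if (bday == null) {
                return null;
            }
            // getMonth() is zero-based, so add 1.
            // getDate() is the day of the month; getDay() would be the day of the week.
            return this.evaluate(bday.getMonth() + 1, bday.getDate());
        }

        public String evaluate(String bday) {
            Date date = null;
            try {
                // df.parse returns a java.util.Date.
                date = df.parse(bday);
            } catch (Exception ex) {
                // Unparseable (or null) input yields NULL in Hive.
                return null;
            }
            return this.evaluate(date.getMonth() + 1, date.getDate());
        }

        public String evaluate(Integer month, Integer day) {
            if (month == 1) {
                if (day < 20) {
                    return "Capricorn";
                } else {
                    return "Aquarius";
                }
            }
            if (month == 2) {
                if (day < 19) {
                    return "Aquarius";
                } else {
                    return "Pisces";
                }
            }
            /* ...other months here */
            return null;
        }
    }


Requirement 1: remove all spaces from keyword

Approach:
1. Write the "remove spaces" program.
2. add jar /home/zkpk/Desktop/QCKG.jar;
3. create temporary function zodiac as 'day1008.QuChuKongGe';
4. select zodiac(keyword) from sogou_20111230 limit 100;

The function name zodiac is simply reused from the previous example; it now points at the new class.

Code:

    package day1008;

    import org.apache.hadoop.hive.ql.exec.UDF;

    public class QuChuKongGe extends UDF {

        // Returns true if text contains s.
        public static boolean isHave(String text, String s) {
            return text.indexOf(s) != -1;
        }

        public String evaluate(String i) {
            // Hive passes SQL NULL as a Java null.
            if (i == null) {
                return null;
            }
            if (isHave(i, " ")) {
                i = i.replace(" ", "");
            }
            // A keyword without spaces is returned unchanged.
            return i;
        }
    }

Xiao Ma's suggestion: replaceAll(regex, replacement) can be used for the same job.

------------------ Alternative implementation ----------------

    import org.apache.hadoop.hive.ql.exec.UDF;

    public class KeywordTo extends UDF {

        public KeywordTo() {
        }

        public String evaluate(String keyword) {
            if (keyword == null) {
                return null;
            }
            // Split on single spaces and glue the pieces back together.
            String[] arr = keyword.trim().split(" ");
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < arr.length; i++) {
                sb.append(arr[i]);
            }
            return sb.toString();
        }
    }

---------------------------------------------------------------

Requirement 2: compute the overall position from rank and order

The position is (order - 1) * 10 + rank. For example, order=5 and rank=5 means the 45th result.

1. Write the "compute position" program.
2. add jar /home/zkpk/Desktop/JSPWjar.jar;
3. create temporary function zodiac as 'day1008.JiSuanPaiWei';
4. select zodiac(rank, `order`) from sogou_20111230 limit 100;

order is a reserved word in HiveQL, so the column name is quoted with backticks.

Code:

    package day1008;

    import org.apache.hadoop.hive.ql.exec.UDF;

    public class JiSuanPaiWei extends UDF {

        public int evaluate(int rank, int order) {
            if (order > 0) {
                // (order - 1) * 10 + rank, e.g. order=5, rank=5 gives 45.
                return (order - 1) * 10 + rank;
            }
            // For order <= 0, fall back to rank alone.
            return rank;
        }
    }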
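
The logic of both requirements can be tried out in plain Java before packaging a jar and registering it in Hive. The following is only a minimal sketch under some assumptions: the class name UdfLogicSketch and its method names are hypothetical and not part of the notes above, the class deliberately does not extend UDF so that it compiles without the Hive libraries, and the regex \s+ (following the replaceAll suggestion) removes tabs and newlines as well as spaces, which is broader than Requirement 1 strictly asks for.

```java
// Hypothetical helper class for trying the UDF logic outside Hive.
// It has no Hive dependency; the method bodies could later be moved
// into the evaluate(...) methods of real UDF classes.
final class UdfLogicSketch {

    // Requirement 1 via String.replaceAll(regex, replacement).
    // Assumption: all whitespace (not only ' ') should be removed.
    static String removeWhitespace(String keyword) {
        if (keyword == null) {
            return null; // mirrors how a UDF would pass SQL NULL through
        }
        return keyword.replaceAll("\\s+", "");
    }

    // Requirement 2: (order - 1) * 10 + rank, falling back to rank
    // when order is not positive, as in JiSuanPaiWei.
    static int position(int rank, int order) {
        return order > 0 ? (order - 1) * 10 + rank : rank;
    }

    public static void main(String[] args) {
        System.out.println(removeWhitespace(" hive  udf demo ")); // hiveudfdemo
        System.out.println(position(5, 5));                       // 45
    }
}
```

Keeping the logic in static methods like these makes it easy to check edge cases (NULL input, keywords without spaces, order <= 0) locally; whether to treat tabs as spaces is a choice to confirm against the actual data.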