hive系列(2)---自定义函数UDF

一、自定义UDF

在hql中可以满足大部分的需求,但是可能还有部分特定的业务需要自己自定义函数来实现,下面来模拟实现:
1、表的建立:
t_p

0: jdbc:hive2://mini01:10000> desc t_p;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | string     |          |
| name      | string     |          |
+-----------+------------+----------+--+

t_flow

+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| phone     | int        |          |
| flow      | int        |          |
+-----------+------------+----------+--+

hql表的建立就省略了,看看数据:

0: jdbc:hive2://mini01:10000> select * from t_p;
+---------+-----------+--+
| t_p.id  | t_p.name  |
+---------+-----------+--+
| 14      | TIANJUN   |
| 1       | ni        |
| 2       | hao       |
| 3       | wo        |
| 4       | shi       |
| 5       | zhong     |
| 6       | guo       |
| 7       | ren       |
| 8       | ha        |
| 9       | wa        |
| 10      | ni        |
| 12      | wo        |
| 13      | ta        |
+---------+-----------+--+
0: jdbc:hive2://mini01:10000> select * from t_flow;
+---------------+--------------+--+
| t_flow.phone  | t_flow.flow  |
+---------------+--------------+--+
| 136214        | 10           |
| 137123        | 14           |
| 137123        | 14           |
| 136123        | 14           |
| 137123        | 14           |
| 138123        | 14           |
| 137123        | 14           |
| 136127        | 14           |
+---------------+--------------+--+

2、自定义函数的实现:
可以通过重载实现函数的多样性,函数必须是public

package hive.lower;

import org.apache.hadoop.hive.ql.exec.UDF;

import java.util.HashMap;
import java.util.Map;

/**
 * hive自定义函数
 * Created by tianjun on 2017/3/21.
 */
public class ToLowerCase extends UDF {

    public Map<String,String> province = new HashMap<>();


    //hive使用,必须是public
    public String evaluate(String field){

        String result = field.toLowerCase();
        return result;

    }

    public String evaluate(int number){

        province.put("136","beigjing");
        province.put("137","shanghai");
        province.put("138","wuhan");

        String pnb = String.valueOf(number);
        return province.get(pnb.substring(0,3))==null?"unknow":province.get(pnb.substring(0,3));
    }

    public static void main(String[] args) {
        System.out.println(new ToLowerCase().evaluate(137214));
    }
}

二、步骤

1、打成jar包上传到服务器
2、将jar包添加到hive的classpath

hive>add JAR /root/original-mr.demo-1.0-SNAPSHOT.jar;

3、创建临时函数与开发好的java class关联

Hive>create temporary function province as 'hive.lower.ToLowerCase';

t_p的查询:

0: jdbc:hive2://mini01:10000> select id, province(name) lowerName from t_p;
+-----+------------+--+
| id  | lowername  |
+-----+------------+--+
| 14  | tianjun    |
| 1   | ni         |
| 2   | hao        |
| 3   | wo         |
| 4   | shi        |
| 5   | zhong      |
| 6   | guo        |
| 7   | ren        |
| 8   | ha         |
| 9   | wa         |
| 10  | ni         |
| 12  | wo         |
| 13  | ta         |
+-----+------------+--+

t_flow的查询

0: jdbc:hive2://mini01:10000> select phone,province(phone) province ,flow from t_flow;
+---------+-----------+-------+--+
|  phone  | province  | flow  |
+---------+-----------+-------+--+
| 136214  | beigjing  | 10    |
| 137123  | shanghai  | 14    |
| 137123  | shanghai  | 14    |
| 136123  | beigjing  | 14    |
| 137123  | shanghai  | 14    |
| 138123  | wuhan     | 14    |
| 137123  | shanghai  | 14    |
| 136127  | beigjing  | 14    |
+---------+-----------+-------+--+
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值