Implementing a string-decoding function as a Hive UDF

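A Hive UDF is actually fairly easy to implement: extend UDF and implement its evaluate() method, overloading it once for each argument list the function should accept. The code is as follows.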

import java.net.URLDecoder;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;

@Description(name = "decoder_url", value = "_FUNC_(url [,code][,count]) - decodes a URL string count times using code as the encoding scheme", extended = ""
    + "If count is not given, the URL is decoded 2 times; "
    + "if code is not given, GBK is used.")
public class UDFDecoderUrl extends UDF {
  // The defaults are constants: one UDF instance is reused across rows, so
  // per-call arguments must never overwrite them.
  private static final int DEFAULT_TIMES = 2;
  private static final String DEFAULT_CODE = "GBK";

  public UDFDecoderUrl() {
  }

  // Decodes urlStr count times using srcCode (GBK when srcCode is null).
  public String evaluate(String urlStr, String srcCode, int count) {
    if (urlStr == null) {
      return null;
    }
    if (count <= 0) {
      return urlStr;
    }
    String code = (srcCode != null) ? srcCode : DEFAULT_CODE;
    String url = urlStr;
    // Stop early once a pass fails and decoder() has returned null.
    for (int i = 0; i < count && url != null; i++) {
      url = decoder(url, code);
    }
    return url;
  }

  // Decodes the default number of times using srcCode.
  public String evaluate(String urlStr, String srcCode) {
    return evaluate(urlStr, srcCode, DEFAULT_TIMES);
  }

  // Decodes count times using the default encoding.
  public String evaluate(String urlStr, int count) {
    return evaluate(urlStr, DEFAULT_CODE, count);
  }

  // Decodes the default number of times using the default encoding.
  public String evaluate(String urlStr) {
    return evaluate(urlStr, DEFAULT_CODE, DEFAULT_TIMES);
  }

  // Performs one decoding pass; returns null if the input is malformed
  // or the encoding is not supported.
  private String decoder(String urlStr, String code) {
    if (urlStr == null || code == null) {
      return null;
    }
    try {
      return URLDecoder.decode(urlStr, code);
    } catch (Exception e) {
      return null;
    }
  }
}
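The repeated-decoding loop can be tried without a Hive installation. The following is only a minimal sketch of that loop: the class name DecodeDemo and the method decodeTimes are hypothetical names introduced here for illustration, and UTF-8 is used in the demo instead of the UDF's GBK default purely so the example runs on any JRE.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical stand-alone demo; not part of the Hive UDF itself.
public class DecodeDemo {

  // Applies URLDecoder.decode `count` times; returns null on malformed
  // input or an unsupported charset, mirroring the UDF's decoder().
  static String decodeTimes(String url, String charset, int count) {
    if (url == null || count <= 0) {
      return url;
    }
    String result = url;
    try {
      for (int i = 0; i < count; i++) {
        result = URLDecoder.decode(result, charset);
      }
    } catch (UnsupportedEncodingException | IllegalArgumentException e) {
      return null;
    }
    return result;
  }

  public static void main(String[] args) throws Exception {
    // Encode twice to simulate a doubly encoded URL parameter.
    String once = URLEncoder.encode("hive udf", "UTF-8");   // hive+udf
    String twice = URLEncoder.encode(once, "UTF-8");        // hive%2Budf
    System.out.println(decodeTimes(twice, "UTF-8", 1));     // hive+udf
    System.out.println(decodeTimes(twice, "UTF-8", 2));     // hive udf
  }
}
```

A single pass over a doubly encoded value still leaves an encoded string behind, which is presumably why the UDF decodes twice by default.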

Add the following line to the class org.apache.hadoop.hive.ql.exec.FunctionRegistry:

registerUDF("decoder_url", UDFDecoderUrl.class, false);

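Then recompile Hive. Alternatively, have Hive read its function registrations from a configuration file, so that any function added later only needs an entry in that file and Hive never has to be rebuilt again.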

The UDFDecoderUrl class above must be packaged into a jar and loaded into Hive. To load the jar, add the following to hive-site.xml:

<property>
<name>hive.aux.jars.path</name>
<value>file:///opt/hive/sohu/hive-udf-0.0.1.jar</value>
<description>These JAR files are available to all users for all jobs</description>
</property>

