Write phoenix UDF

How to write custom UDF

You can follow these steps to write your own UDF:


  • Create a new class derived from org.apache.phoenix.expression.function.ScalarFunction.
  • Implement the getDataType method, which determines the return type of the function.
  • Implement the evaluate method, which is called to calculate the result for each row. The method is passed an org.apache.phoenix.schema.tuple.Tuple that holds the current state of the row and an org.apache.hadoop.hbase.io.ImmutableBytesWritable that must be filled in to point to the result of the function execution. The method returns false if not enough information was available to calculate the result (usually because one of its arguments is unknown) and true otherwise.
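The steps above can be sketched as a minimal, hypothetical UDF. The package, class name, and function name (MY_UPPER) are invented for illustration, and the sketch assumes Phoenix 4.x-era APIs; treat it as a starting point rather than a definitive implementation.

```java
package com.example.udf; // hypothetical package

import java.util.List;

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.phoenix.expression.Expression;
import org.apache.phoenix.expression.function.ScalarFunction;
import org.apache.phoenix.schema.tuple.Tuple;
import org.apache.phoenix.schema.types.PDataType;
import org.apache.phoenix.schema.types.PVarchar;

public class UpperCaseFunction extends ScalarFunction {
    public static final String NAME = "MY_UPPER"; // invented function name

    public UpperCaseFunction() {
    }

    public UpperCaseFunction(List<Expression> children) {
        super(children);
    }

    @Override
    public String getName() {
        return NAME;
    }

    // Determines the return type of the function: VARCHAR here.
    @Override
    public PDataType getDataType() {
        return PVarchar.INSTANCE;
    }

    // Called for each row; fills in ptr to point to the result bytes.
    @Override
    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
        Expression arg = getChildren().get(0);
        // If the argument's value is not yet known, signal that the
        // result cannot be calculated for this row.
        if (!arg.evaluate(tuple, ptr)) {
            return false;
        }
        // A zero-length value represents SQL NULL; leave ptr as-is.
        if (ptr.getLength() == 0) {
            return true;
        }
        String value = (String) PVarchar.INSTANCE.toObject(
                ptr.get(), ptr.getOffset(), ptr.getLength(),
                PVarchar.INSTANCE, arg.getSortOrder());
        ptr.set(PVarchar.INSTANCE.toBytes(value.toUpperCase()));
        return true;
    }
}
```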


Below are additional steps for optimization:


  • To be able to contribute to the start/stop key of a scan, a custom function needs to override the following two methods from ScalarFunction:

    /**
     * Determines whether or not a function may be used to form
     * the start/stop key of a scan
     * @return the zero-based position of the argument to traverse
     *  into to look for a primary key column reference, or
     *  {@value #NO_TRAVERSAL} if the function cannot be used to
     *  form the scan key.
     */
    public int getKeyFormationTraversalIndex() {
        return NO_TRAVERSAL;
    }

    /**
     * Manufactures a KeyPart used to construct the KeyRange given
     * a constant and a comparison operator.
     * @param childPart the KeyPart formulated for the child expression
     *  at the {@link #getKeyFormationTraversalIndex()} position.
     * @return the KeyPart for constructing the KeyRange for this
     *  function.
     */
    public KeyPart newKeyPart(KeyPart childPart) {
        return null;
    }

  • Additionally, to enable an ORDER BY to be optimized out or a GROUP BY to be done in place, override preservesOrder:

    /**
     * Determines whether or not the result of the function invocation
     * will be ordered in the same way as the input to the function.
     * Returning YES enables an optimization to occur when a
     * GROUP BY contains function invocations using the leading PK
     * column(s).
     * @return YES if the function invocation will always preserve order for
     * the inputs versus the outputs and false otherwise, YES_IF_LAST if the
     * function preserves order, but any further column reference would not
     * continue to preserve order, and NO if the function does not preserve
     * order.
     */
    public OrderPreserving preservesOrder() {
        return OrderPreserving.NO;
    }
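Putting these hooks together, here is a hedged sketch of a hypothetical pass-through function whose output trivially preserves input order. The class and function names are invented, the APIs assumed are Phoenix 4.x-era, and returning the child's KeyPart unchanged is only plausible because the function is an identity; a real function would likely need a custom KeyPart.

```java
import java.util.List;

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.phoenix.compile.KeyPart;
import org.apache.phoenix.expression.Expression;
import org.apache.phoenix.expression.function.ScalarFunction;
import org.apache.phoenix.schema.tuple.Tuple;
import org.apache.phoenix.schema.types.PDataType;
import org.apache.phoenix.schema.types.PVarchar;

// Hypothetical identity-like function used only to illustrate the
// optimization hooks: it forwards its first argument unchanged, so its
// output trivially sorts the same way as its input.
public class PassThroughFunction extends ScalarFunction {

    public PassThroughFunction(List<Expression> children) {
        super(children);
    }

    @Override
    public String getName() {
        return "PASS_THROUGH"; // invented name
    }

    @Override
    public PDataType getDataType() {
        return PVarchar.INSTANCE;
    }

    @Override
    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
        // Simply forward the child's bytes as the result.
        return getChildren().get(0).evaluate(tuple, ptr);
    }

    @Override
    public int getKeyFormationTraversalIndex() {
        // Traverse into argument 0 to look for a PK column reference,
        // letting the function contribute to the scan's start/stop key.
        return 0;
    }

    @Override
    public KeyPart newKeyPart(KeyPart childPart) {
        // For a pure pass-through, the child's KeyPart already describes
        // how to turn a constant + comparison operator into a KeyRange.
        return childPart;
    }

    @Override
    public OrderPreserving preservesOrder() {
        // Output order matches input order, enabling the ORDER BY /
        // GROUP BY optimizations described above.
        return OrderPreserving.YES;
    }
}
```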

Limitations

  • The jar containing the UDFs must be manually added to/deleted from HDFS; new SQL statements for adding and removing jars are planned (PHOENIX-1890).
  • The dynamic class loader copies the UDF jars to {hbase.local.dir}/jars on the Phoenix client/region server when a UDF is used in a query. These jars must be deleted manually once a function is dropped.
  • Functional indexes must be rebuilt manually if the function implementation changes (PHOENIX-1907).
  • Once loaded, a jar will not be unloaded, so you'll need to put modified implementations into a differently named jar to avoid having to bounce your cluster (PHOENIX-1907).
  • To list the functions, you need to query the SYSTEM."FUNCTION" table (PHOENIX-1921).
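With those limitations in mind, registering and using a UDF looks roughly like the following. The jar path, function, class, and table names are illustrative, and the jar must already have been copied to HDFS by hand:

```sql
-- Register the function from a jar previously uploaded to HDFS,
-- e.g. with: hadoop fs -copyFromLocal myudfs.jar /hbase/lib/
CREATE FUNCTION my_upper(varchar) RETURNS varchar
    AS 'com.example.udf.UpperCaseFunction'
    USING JAR 'hdfs:/hbase/lib/myudfs.jar';

-- Use it in a query.
SELECT my_upper(name) FROM my_table;

-- List registered functions.
SELECT * FROM SYSTEM."FUNCTION";

-- Drop it (remember to clean up the copied jars manually).
DROP FUNCTION my_upper;
```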