UDF: getting all the keys of a JSON string


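Hive's built-in functions get_json_object(…) and json_tuple(…) can only return values from a JSON string; neither returns any information about the keys.

The UDF below returns all top-level keys of a JSON string, joined with commas.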

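package com.zjs.udf;

import net.sf.json.JSONObject;
import org.apache.hadoop.hive.ql.exec.UDF;

import java.util.Iterator;

/**
 * Hive UDF that returns the top-level keys of a JSON object string,
 * joined with commas.
 * Created by Administrator on 2017/9/18.
 */
public class GetAllKeys extends UDF {

    public String evaluate(String json_str) {

        // Hive passes null for NULL column values; return NULL instead of
        // failing with a NullPointerException.
        if (json_str == null) {
            return null;
        }

        // Treat an empty string as an empty JSON object.
        if (json_str.length() == 0) {
            json_str = "{}";
        }

        JSONObject json = JSONObject.fromObject(json_str);

        Iterator<?> it = json.keys();

        StringBuilder s = new StringBuilder();

        while (it.hasNext()) {
            s.append(',').append(it.next());
        }

        // Drop the leading comma; an object with no keys yields "".
        return s.length() == 0 ? "" : s.substring(1);
    }
}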
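
The comma-joining step of the UDF can be tried on its own, without Hive or json-lib on the classpath. The following is a minimal sketch under stated assumptions: the class name KeyJoinSketch and the method joinKeys are hypothetical and not part of the original jar, and a plain list stands in for the iterator that json.keys() returns.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of the original UDF: it isolates the
// comma-joining loop so it can be run without Hive or json-lib.
public class KeyJoinSketch {

    // Joins whatever the iterator yields with commas.
    // An empty iterator produces an empty string.
    public static String joinKeys(Iterator<?> keys) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        while (keys.hasNext()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(keys.next());
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the keys of one JSON object.
        List<String> keys = Arrays.asList("hm", "dianjia", "jdzgb");
        System.out.println(joinKeys(keys.iterator())); // prints "hm,dianjia,jdzgb"
    }
}
```

Tracking a "first" flag, rather than stripping a leading comma afterwards, keeps the result correct even when a key is the empty string.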

Add the required jars, then create the UDF function:

add jar /home/inf/zhangjishuai/udf/json_get_keys.jar;
add jar /home/inf/zhangjishuai/udf/json-lib-2.3-jdk15.jar;
add jar /home/inf/zhangjishuai/udf/ezmorph-1.0.6.jar;
create temporary function get_json_keys as 'com.zjs.udf.GetAllKeys';

Test:

hive> select get_json_keys(detail) from tmp.zjs_0918 limit 5;
Query ID = inf_20170918152828_6f807105-b2d7-4bc3-bad8-de585a612c80
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1502851803550_199792, Tracking URL = http://namenode01:8088/proxy/application_1502851803550_199792/
Kill Command = /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/hadoop/bin/hadoop job  -kill job_1502851803550_199792
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-09-18 15:28:51,570 Stage-1 map = 0%,  reduce = 0%
2017-09-18 15:28:57,721 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.77 sec
MapReduce Total cumulative CPU time: 2 seconds 770 msec
Ended Job = job_1502851803550_199792
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.77 sec   HDFS Read: 4999 HDFS Write: 32 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 770 msec
OK
jdzgb
hm
hm
hm
hm,dianjia,jdzgb
Time taken: 11.43 seconds, Fetched: 5 row(s)

Note:
The net.sf.json-lib jars must be added (one way is to run add jar in the Hive command line); otherwise the query fails with the following error:

Caused by: java.lang.ClassNotFoundException: net.sf.json.JSONObject

Query ID = inf_20170918153333_2526d1b8-03c5-4bb8-bb8d-9ee9159825eb
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1502851803550_199796, Tracking URL = http://namenode01:8088/proxy/application_1502851803550_199796/
Kill Command = /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/hadoop/bin/hadoop job  -kill job_1502851803550_199796
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-09-18 15:33:54,432 Stage-1 map = 0%,  reduce = 0%
2017-09-18 15:34:11,947 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_1502851803550_199796 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1502851803550_199796_m_000000 (and more) from job job_1502851803550_199796

Task with the most failures(4): 
-----
Task ID:
  task_1502851803550_199796_m_000000

URL:
  http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1502851803550_199796&tipid=task_1502851803550_199796_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"detail":"{\"jdzgb\":\"1\"}"}
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"detail":"{\"jdzgb\":\"1\"}"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public java.lang.String com.zjs.udf.GetAllKeys.evaluate(java.lang.String)  on object com.zjs.udf.GetAllKeys@7a639ec5 of class com.zjs.udf.GetAllKeys with arguments {{"jdzgb":"1"}:java.lang.String} of size 1
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:978)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:182)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:186)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
    ... 9 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:954)
    ... 18 more
Caused by: java.lang.NoClassDefFoundError: net/sf/json/JSONObject
    at com.zjs.udf.GetAllKeys.evaluate(GetAllKeys.java:19)
    ... 23 more
Caused by: java.lang.ClassNotFoundException: net.sf.json.JSONObject
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 24 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
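
When chasing a ClassNotFoundException like the one above, it can help to check whether a given class is visible to a particular JVM. The following is a minimal sketch, assuming a hypothetical helper class named ClasspathCheck; it only inspects the classpath of the JVM it runs in, so it does not by itself prove what the Hive map tasks can see.

```java
// Hypothetical diagnostic helper, not part of the original post:
// reports whether a class can be loaded from the current classpath.
public class ClasspathCheck {

    // Returns true if the named class can be found by this class's loader.
    // The class is located but not initialized (second argument is false).
    public static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "net.sf.json.JSONObject";
        System.out.println(name + " available: " + isClassAvailable(name));
    }
}
```

Running it with the json-lib jar on the classpath (for example via java -cp) should report true; without the jar it reports false, which matches the failure shown in the log.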