Hive: Parsing a Custom Table Format with AbstractSerDe

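This article shows how to process unstructured log files and convert them into Hive table data. Following AbstractSerDe in the Hive source code, it implements custom parsing logic for log content in key-value format, and it also gives the Hive SQL definition and how to run it.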
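  1. Unstructured lines in the log file must be formatted into table-structured data. Sample lines are shown below; the key and value of each field need to be parsed out.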
    2019-10-03 00:53:03.624 INFO [resin-port-9001-42][ContentOperationController.java:367] - Collection events:eventsType=operationPage;mac=88CC4525E50C;sn=1208219040576088CC4525E50C;userId=12725254;userType=vod;parentColumnId=38;columnId=541;nowSpm=38.PAGE_DSJ011008.541.0.1570035183613;afterSpm=38.PAGE_DSJ01100605.537.0.1570035181037;pos=POS_LIST;posName=列表;createTime=2019-10-03 00:53:03:END
    2019-10-03 00:54:20.394 INFO [resin-port-9003-50][CommonAuthService.java:162] - Collection events:eventsType=auth_product;mac=C88F26CBDC57;sn=12022161905760C88F26CBDC57;userId=12500868;userType=VOD;contentId=11755;contentType=1;parentColumnId=38;code=S100000;message=鉴通过;operateType=auth_product;createTime=2019-10-03 00:54:20:END
    2019-10-03 00:55:28.791 INFO [resin-port-9002-41][ResumePointController.java:161] - Collection events:eventsType=operateResumePoint;mac=AC4AFE820E40;sn=120535005053F0AC4AFE820E40;userId=12106198;userType=vod;parentColumnId=38;columnId=0;contentId=9151;contentType=1;operateType=get;createTime=2019-10-03 00:55:28:END
    2019-10-03 00:58:46.958 INFO [resin-port-9001-43][ContentOperationController.java:609] - Collection events:eventsType=operationDetails;mac=AC4AFE820E40;sn=120535005053F0AC4AFE820E40;userId=12106198;userType=vod;parentColumnId=38;columnId=4461;contentId=10769;contentType=1;nowSpm=38.PAGE_ALBUM_DETAILS.4461.10769.1570035526958;afterSpm=38.PAGE_DY01100622.4461.10769.1570035516087;pos=PAGE_ALBUM_DETAILS;posName=专辑详情;createTime=2019-10-03 00:58:46:END
    
  2. The custom parser can be implemented by following the Hive source code. Source reference path: HIVE source SER
    package com.ppfuns;
    
    import com.google.common.base.Splitter;
    import com.google.common.collect.Lists;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hive.common.type.*;
    import org.apache.hadoop.hive.serde2.AbstractSerDe;
    import org.apache.hadoop.hive.serde2.SerDeException;
    import org.apache.hadoop.hive.serde2.SerDeSpec;
    import org.apache.hadoop.hive.serde2.SerDeStats;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.AbstractPrimitiveJavaObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
    i
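The key-value extraction described in step 1 can be sketched on its own, independent of Hive. The following is a minimal sketch, not the author's SerDe: the class name `EventLineParser` and its `parse` method are hypothetical, and it assumes that every event line carries its payload between the literal `Collection events:` marker and a trailing `:END`, with fields separated by `;` and keys separated from values by the first `=`, as in the sample lines above. It uses only the JDK instead of the Guava `Splitter` imported in the listing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hive or of the original post).
final class EventLineParser {
    // Markers taken from the sample log lines; assumed to be constant.
    private static final String START_MARKER = "Collection events:";
    private static final String END_MARKER = ":END";

    // Returns the fields of one log line in their original order,
    // or an empty map when the line does not contain the start marker.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (line == null) {
            return fields;
        }
        int start = line.indexOf(START_MARKER);
        if (start < 0) {
            return fields;
        }
        String payload = line.substring(start + START_MARKER.length()).trim();
        // Strip the trailing ":END" so it does not leak into the last value.
        if (payload.endsWith(END_MARKER)) {
            payload = payload.substring(0, payload.length() - END_MARKER.length());
        }
        for (String pair : payload.split(";")) {
            // Split on the first '=' only; values may contain ':' or '.'.
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip fragments without a key
            }
            fields.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "2019-10-03 00:55:28.791 INFO [resin-port-9002-41]"
                + "[ResumePointController.java:161] - Collection events:"
                + "eventsType=operateResumePoint;mac=AC4AFE820E40;userId=12106198;"
                + "operateType=get;createTime=2019-10-03 00:55:28:END";
        // Print each extracted key with its value.
        parse(sample).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

In a SerDe built on `AbstractSerDe`, a map like this would typically be produced inside `deserialize` and then projected onto the table's declared columns, with columns missing from a given line left as NULL. That wiring is not shown here.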