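1. What is a UA?
The User-Agent (UA) is a request header that a client sends when it accesses a server. It carries basic information about the client, such as the browser, operating system, and device.
In the access log, the UA field looks like this: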
Mozilla/5.0 (iPhone; CPU iPhone OS 13_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148
Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36
Mozilla/5.0 (iPhone; CPU iPhone OS 13_1_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 main%2F1.0 baiduboxapp/11.15.5.16 (Baidu; P2 13.1.2 )
Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36
Mozilla/5.0 (iPhone; CPU iPhone OS 13_1_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 main%2F1.0 baiduboxapp/11.15.5.16 (Baidu; P2 13.1.2 )
Mozilla/5.0 (Linux; Android 9; COL-AL10 Build/HUAWEICOL-AL10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.155 Mobile Safari/537.36 Zeus
Mozilla/5.0 (iPhone; CPU iPhone OS 13_1_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 main%2F1.0 baiduboxapp/11.15.5.16 (Baidu; P2 13.1.2 )
Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36
Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36
Mozilla/5.0 (Linux; Android 9; COL-AL10 Build/HUAWEICOL-AL10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.155 Mobile Safari/537.36 Zeus
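2. Approach
Register a custom UDF in Hive to do the parsing. A ready-made jar already exists, so the only questions are how to download it and how to package it.
The steps are:
1. Create a Maven project in your IDE (IntelliJ IDEA).
2. Add the Maven dependency so that Maven downloads the jar.
Add this dependency: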
<dependency>
    <groupId>nl.basjes.parse.useragent</groupId>
    <artifactId>yauaa-hive</artifactId>
    <classifier>udf</classifier>
    <version>5.8</version>
</dependency>
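As of 2020-01-06, the following versions are available:
5.8, 5.6, 5.5
Packaging: note that no separate packaging step is used here. The jar that Maven downloaded is uploaded directly to the Hive lib directory: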
/opt/hive/apache-hive-2.1.1-bin/lib/
Once the jar is uploaded, run the following commands in Hive:
add jar /opt/hive/apache-hive-2.1.1-bin/lib/yauaa-hive-5.8-udf.jar;
CREATE FUNCTION ParseUserAgent AS 'nl.basjes.parse.useragent.hive.ParseUserAgent';
DESCRIBE FUNCTION EXTENDED ParseUserAgent;
SELECT ParseUserAgent("Mozilla 5.0 Windows NT 10.0 WOW64 AppleWebKit 537.36 KHTML like Gecko Chrome 63.0.3239.132 Safari 537.36").OperatingsystemNameVersion as a;
OK
Windows NT 10.0 WOW64
The last line of the output above is the parsed result.
The UDF is described in detail here:
https://github.com/nielsbasjes/yauaa/blob/29cf3abbd2207e0634c9a7f2917c999bb0fd71c7/src/main/docs/UDF-ApacheHive.md
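In practice the function is usually applied to a log table rather than to a literal string. The following is only a minimal sketch: the table `access_log` and its column `ua` are hypothetical names chosen for illustration, and it assumes that `ParseUserAgent` has already been registered as shown earlier. The struct fields `DeviceClass`, `OperatingSystemNameVersion`, and `AgentNameVersion` are fields that yauaa exposes. The actual values returned depend on the yauaa version, so no output is shown.

```sql
-- Hypothetical table: one raw User-Agent string per request.
CREATE TABLE IF NOT EXISTS access_log (ua STRING);

-- Project the struct returned by the UDF into plain columns in a subquery,
-- then count requests per operating system, device class, and browser.
SELECT p.os,
       p.device_class,
       p.browser,
       COUNT(*) AS hits
FROM (
    SELECT ParseUserAgent(ua).OperatingSystemNameVersion AS os,
           ParseUserAgent(ua).DeviceClass                AS device_class,
           ParseUserAgent(ua).AgentNameVersion           AS browser
    FROM access_log
) p
GROUP BY p.os, p.device_class, p.browser
ORDER BY hits DESC;
```

Projecting the fields in the subquery keeps the outer GROUP BY on ordinary string columns, which avoids grouping on a struct expression.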