hive中的gis函数
目前,Hive表中存在需要把经纬度转化为距离,和一些其他相关的计算函数,但是在hive中并没有集成这些函数。当然我们可以自定义UDF函数,但是现在提供一个更加简单的方法,通过源码编译的方式,集成GIS函数。
环境:
2 hadoop-2.7 3 4下载源码:https://github.com/Esri/spatial-framework-for-hadoop
源码地址
在github上下载源码在本地idea进行编译,修改最外层的pom.xml,修改hadoop、hive、java版本为生产环境中版本
idea中添加项目
打包:
打完包后 会出现如下俩个jar包 :
1.spatial-sdk-json-2.1.1-SNAPSHOT.jar
2.spatial-sdk-hive-2.1.1-SNAPSHOT.jar
然后下载最新的esri-geometry-java中的esri-geometry-api-2.2.1.jar,上传三个jar到linux系统(jar权限设置成最高),在hive的shell控制台输入添加jar和创建函数语句。
创建函数:
如果jar包在本地则创建临时函数,只在当前session有效,我们可以把jar包上传到HDFS上创建永久函数。
1。add jar /iDatalight/jars/esri-geometry-api-2.2.1.jar;
2。add jar /iDatalight/jars/spatial-sdk-json-2.1.0.jar;
3。add jar /iDatalight/jars/spatial-sdk-hive-2.1.0.jar;
自定义函数使用可以参考官方文档UDF文档,创建自定义函数可以参考spatial-framework-for-hadoop项目的hive下function-ddl.sql,但创建永久函数需要去掉temporary。
CREATE FUNCTION STPoint as ‘com.esri.hadoop.hive.ST_Point’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_AsBinary as ‘com.esri.hadoop.hive.ST_AsBinary’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_AsGeoJSON as ‘com.esri.hadoop.hive.ST_AsGeoJson’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_AsJSON as ‘com.esri.hadoop.hive.ST_AsJson’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_AsShape as ‘com.esri.hadoop.hive.ST_AsShape’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_AsText as ‘com.esri.hadoop.hive.ST_AsText’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_GeomFromJSON as ‘com.esri.hadoop.hive.ST_GeomFromJson’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_GeomFromGeoJSON as ‘com.esri.hadoop.hive.ST_GeomFromGeoJson’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_GeomFromShape as ‘com.esri.hadoop.hive.ST_GeomFromShape’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_GeomFromText as ‘com.esri.hadoop.hive.ST_GeomFromText’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_GeomFromWKB as ‘com.esri.hadoop.hive.ST_GeomFromWKB’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;
create function ST_PointFromWKB as ‘com.esri.hadoop.hive.ST_PointFromWKB’ using jar ‘hdfs:///iDatalight/jars/esri-geometry-api-2.2.2.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-hive-2.1.1.jar’,jar ‘hdfs:///iDatalight/jars/spatial-sdk-json-2.1.1.jar’;