Hive基于ES的外部表使用

作用:

1.将Hive数据导入ES
2.Hive直接使用ES的数据

步骤

一、配置依赖jar

1.临时生效

启动HIVE CLI后,ADD JAR /path/elasticsearch-hadoop-xxx.jar;
或
bin/hive --auxpath=/path/elasticsearch-hadoop-xxx.jar
或
bin/hive -hiveconf hive.aux.jars.path=/path/elasticsearch-hadoop-xxx.jar

2.永久生效
hive-site.xml 添加配置

<property>
  <name>hive.aux.jars.path</name>
  <value>/path/elasticsearch-hadoop-xxx.jar</value>
  <description>A comma separated list (with no spaces) of the jar files</description>
</property>

CDH版可直接配置,将jar放在配置目录,重启HiveServer2即可。参考:https://blog.csdn.net/qq_23146763/article/details/88897243

二、在Hive创建外部表

CREATE EXTERNAL TABLE tmp.artists (
id string,
user_name string,
age string) 
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' 
TBLPROPERTIES(
'es.resource' = 'radio/artists',
'es.index.auto.create' = 'true',
"es.mapping.id" = "id",
'es.mapping.names' = 'user_name:user_name_es, age:age_es',
'es.nodes' = 'IP1:9200,IP2:9200,IP3:9200',
"es.net.http.auth.user"="XXX",
"es.net.http.auth.pass"="XXX"
);

es.mapping.names为hive字段和es字段映射,hive只支持小写名,es字段大小写都支持
es.net.http.auth.user和es.net.http.auth.pass是xpack权限控制账号密码,没开启可以不配置。开启了不配置会报错如下:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'

三、导入数据

用一张有数据的表导入

INSERT OVERWRITE TABLE tmp.artists
SELECT personal_user_id as id, personal_user_name as user_name ,personal_age as age FROM xxx limit 100;

四、查看数据

通过hive查看

在这里插入图片描述

查看es数据
在这里插入图片描述

官网:https://www.elastic.co/guide/en/elasticsearch/hadoop/5.5/hive.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值