Writing from Hive to Elasticsearch: Practice and Performance

忆山

Article link: https://blog.csdn.net/qq_34936033/article/details/104263702


I. Mapping a Hive table for writing to ES

add jar hdfs://****/user/hive/jar/elasticsearch-hadoop-6.6.1.jar; -- Note the jar version: use the elasticsearch-hadoop jar that matches your Elasticsearch cluster version
-- Disable Hive speculative execution (speculative task attempts could write the same data to ES twice)
SET hive.mapred.reduce.tasks.speculative.execution = false;
SET mapreduce.map.speculative = false;
SET mapreduce.reduce.speculative = false;
SET hive.execution.engine=mr;

-- Example table definition
CREATE EXTERNAL TABLE IF NOT EXISTS test.es_write_test (
id string,
skuid string,
scop string,
type int,
`date` timestamp, -- date is a reserved keyword in recent Hive versions, so it is quoted with backticks
value string
) COMMENT 'test'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
'es.nodes' = '****:9200,****:9200', -- ES nodes and HTTP port
'es.index.auto.create' = 'true', -- create the index automatically (default true); can be omitted if the index already exists in ES
'es.resource' = 'test_index/write_test', -- target index/type
'es.mapping.id' = 'id', -- field used as the document ID; omit this setting to let ES generate IDs
'es.mapping.names' = 'id:id,skuid:skuid,scop:scop,type:type,date:date,value:value', -- Hive column to ES field mapping, for when the fields were already created in ES
'es.net.http.auth.user' = 'userName', -- user name
'es.net.http.auth.pass' = '**********' -- password
);
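Besides the connection and mapping settings above, the same TBLPROPERTIES block also accepts the bulk-write settings of elasticsearch-hadoop, which control how much each task sends to ES per request. The following is only a sketch: the table name, the columns, and all values are hypothetical and chosen for illustration, and the right values depend on the cluster.

```sql
-- Hypothetical table name and columns, for illustration only.
CREATE EXTERNAL TABLE IF NOT EXISTS test.es_write_test_small_batches (
  id string,
  value string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.nodes' = '****:9200',
  'es.resource' = 'test_index/write_test',
  'es.mapping.id' = 'id',
  'es.batch.size.entries' = '500',     -- max documents per bulk request, per task (illustrative; documented default is 1000)
  'es.batch.size.bytes' = '1mb',       -- max bulk request size, per task (documented default)
  'es.batch.write.retry.count' = '5',  -- number of retries when ES rejects a bulk request (illustrative)
  'es.batch.write.retry.wait' = '30s'  -- wait time between retries (illustrative)
);
```

Both batch limits apply per task instance, so the total load on ES also depends on how many tasks write concurrently.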

II. Limiting the write rate

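To keep the ES service stable, the rate at which Hive writes to ES has to be limited. We do this by limiting the number of write tasks. In MapReduce, the number of map tasks is determined only by the number of input splits, whereas the number of reduce tasks can be capped explicitly, so we rewrite the SQL so that the write runs on the reduce side. Specifically, we append the following clause to the query. Because every row gets the same distribution key, all rows are sent to a single reducer:

distribute by 1; -- limit the number of write tasks to 1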

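INSERT OVERWRITE TABLE test.es_write_test
SELECT
    id,
    skuid,
    scop,
    type,
    `date`,
    value
FROM test.test -- assumes test.test has columns matching the target table
DISTRIBUTE BY 1; -- limit the number of reduce write tasks to 1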
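The same idea can be extended to more than one writer, such as the 4-task run in the test results. The article does not show how that run was configured, so the following is a sketch under assumptions: it fixes the reducer count with the standard `mapreduce.job.reduces` setting and distributes rows by `id`, which is one plausible way to do it and not necessarily the author's.

```sql
-- Assumption: fix the job at 4 reduce tasks (standard MapReduce setting).
SET mapreduce.job.reduces = 4;

INSERT OVERWRITE TABLE test.es_write_test
SELECT id, skuid, scop, type, `date`, value
FROM test.test        -- assumes test.test exposes these six columns
DISTRIBUTE BY id;     -- hash rows by id across the 4 reducers
```

Distributing by a high-cardinality column such as the document ID keeps the reducers roughly balanced, whereas a constant key sends every row to a single reducer.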

III. Write performance test report

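1. Test data

The test data has 23 fields, covering the int, date, string, and decimal types.

2. Comparison of test results

No limit on the number of write tasks (a little over 200 concurrent writers), about 5 million rows: write time about 2 minutes

1 write task, 5 million rows: write time about 5 minutes

4 write tasks, 170 million rows: first run took 1 hour 30 minutes, second run took 1 hour 50 minutes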
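To compare these runs on a common scale, the durations can be converted to rows per second. The query below is a small sketch that uses the rounded figures from the list above, so its results are rough estimates; the labels and column names are made up for illustration.

```sql
-- Rough throughput per run, from the approximate row counts and durations above.
SELECT run_label,
       total_rows,
       seconds,
       round(total_rows / seconds) AS rows_per_sec
FROM (
  SELECT 'unlimited (200+ tasks)' AS run_label, 5000000 AS total_rows, 120 AS seconds
  UNION ALL
  SELECT '1 task' AS run_label, 5000000 AS total_rows, 300 AS seconds
  UNION ALL
  SELECT '4 tasks, run 1' AS run_label, 170000000 AS total_rows, 5400 AS seconds
  UNION ALL
  SELECT '4 tasks, run 2' AS run_label, 170000000 AS total_rows, 6600 AS seconds
) runs;
```

This works out to roughly 41,700 rows per second without a limit, 16,700 with 1 task, and 31,500 and 25,800 for the two 4-task runs. The runs used different data volumes, so the figures are indicative only.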

Reference:

Official documentation for the Hive integration with ES: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/hive.html
