Bugs when Hive interacts with Elasticsearch (ES)
1. After Hive writes data to ES, querying the table from Hive fails (apparently it cannot be queried)
Bad status for request TFetchResultsReq(fetchType=0, operationHandle=TOperationHandle(hasResultSet=True,
modifiedRowCount=None, operationType=0, operationId=THandleIdentifier(secret='\x8d#e\x89\x0bhBg\xb9\xdb\xc7L\xe7lb\xb0',
guid="X\xee.\x81\xd8'Hy\x983\xb7\x00\xcb\x85\x84\x91")),
orientation=4, maxRows=100): TFetchResultsResp(status=TStatus(errorCode=None,
errorMessage='java.lang.NoClassDefFoundError: Could not initialize class org.elasticsearch.hadoop.util.Version',
sqlState=None, infoMessages=['*java.lang.RuntimeException:java.lang.NoClassDefFoundError:
Could not initialize class org.elasticsearch.hadoop.util.Version:19:18',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:83',
'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36',
'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63',
'java.security.AccessController:doPrivileged:AccessController.java:-2',
'javax.security.auth.Subject:doAs:Subject.java:415',
'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1783',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59', 'com.sun.proxy.$Proxy27:fetchResults::-1',
'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:440',
'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:686',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553',
'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538',
'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286',
'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1145',
'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:615',
'java.lang.Thread:run:Thread.java:745', '*java.lang.NoClassDefFoundError:
Could not initialize class org.elasticsearch.hadoop.util.Version:35:16',
'org.elasticsearch.hadoop.rest.RestService:findPartitions:RestService.java:225',
'org.elasticsearch.hadoop.mr.EsInputFormat:getSplits:EsInputFormat.java:457',
'org.elasticsearch.hadoop.hive.EsHiveInputFormat:getSplits:EsHiveInputFormat.java:111',
'org.elasticsearch.hadoop.hive.EsHiveInputFormat:getSplits:EsHiveInputFormat.java:50',
'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextSplits:FetchOperator.java:363',
'org.apache.hadoop.hive.ql.exec.FetchOperator:getRecordReader:FetchOperator.java:295',
'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:446',
'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:415',
'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:138',
'org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:1987',
'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:361',
'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:277',
'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:753',
'sun.reflect.GeneratedMethodAccessor12:invoke::-1',
'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43',
'java.lang.reflect.Method:invoke:Method.java:606',
'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78'], statusCode=3), results=None, hasMoreRows=None)
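At first I thought this was a version problem.
More likely, a table that Hive has written to ES cannot then be queried from Hive; only a table mapped onto data that already existed in ES can be queried.
That said, "NoClassDefFoundError: Could not initialize class" means the static initializer of org.elasticsearch.hadoop.util.Version failed earlier, so the es-hadoop jar on the Hive classpath is also worth checking.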
The official documentation says:
As one can note, currently the reading and writing are treated separately but we're working on unifying the two and automatically translating HiveQL to Elasticsearch queries.
2. Cannot detect ES version
The error message: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'.
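Is a parameter not set? As for versions, I am running ES 2.4.4, es-hadoop-2.4.4.jar and Hadoop 2.6.0 (CDH). On the official site I could only find which es-hadoop version goes with which ES version; nothing says which Hadoop version goes with which ES version, so presumably there is no requirement?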
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
{"_col0":63818992,"_col1":"陶悦","_col2":"18716402326","_col3":"201710260063961961","_ccol29":"F","_col30":null8l}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing
row {"_col0":63818992,"_col1":"陶悦","_col2":"18716402326","_col3":"201710260063961961","_col6":"id"}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: Unexpected exception:
Unexpected exception: org.apache.hadoop.hive.ql.metadata.HiveException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException:
Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud
instance without the proper setting 'es.nodes.wan.only' at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) ... 9 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: Unexpected exception: org.apache.hadoop.hive.ql.metadata.HiveException:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException:
Cannot detect ES version - typically this happens if the network/Elasticsearch
cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:638)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:748)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:306) ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.hive.ql.metadata.HiveException:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException:
Cannot detect ES version - typically this happens if the network/Elasticsearch
cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:638)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:748)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:306) ... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException:
Cannot detect ES version - typically this happens if the network/Elasticsearch
cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:525)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:623)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:638)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:748)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:306) ... 23 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException:
Cannot detect ES version - typically this happens if the network/Elasticsearch
cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:570)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:514) ... 31 more
Cause: the following parameters were not specified when the table was created:
'es.nodes' = '192.68.20.10:9201,12.168.00.110:02,192.18.200.12:923',
'es.index.auto.create' = 'true', -- create the ES index automatically
'es.resource' = 'es_bigtable/bigtable_list', -- index name and type
'es.nodes.wan.only' = 'true', -- see the note below
'es.mapping.names' = '' -- field mapping
es.nodes.wan.only: whether the connector is used against an Elasticsearch instance in a cloud/restricted environment over the WAN, such as Amazon Web Services. In this mode the connector disables discovery and connects only through the declared es.nodes during all operations, including reads and writes. Note that performance is highly affected in this mode.
For the parameters, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html
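Because the "Cannot detect ES version" message points at reachability, one quick check is to request the root endpoint of one of the es.nodes addresses from the machine where the Hive tasks run; the root response of an ES node carries a version.number field. The class below is only a sketch of that check: the class and method names are made up for illustration, the host:port is whatever you put in es.nodes, and the version is pulled out with a simple regex rather than a JSON parser.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of es-hadoop.
final class EsVersionProbe {
    // Matches the "number" field of the root response, e.g. "number" : "2.4.4"
    private static final Pattern NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    /** Returns the version string found in the response body, or null if none is present. */
    static String extractVersion(String body) {
        Matcher m = NUMBER.matcher(body);
        return m.find() ? m.group(1) : null;
    }

    /** Fetches http://hostAndPort/ and returns the version the node reports. */
    static String probe(String hostAndPort) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL("http://" + hostAndPort + "/").openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return extractVersion(new String(out.toByteArray(), StandardCharsets.UTF_8));
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: EsVersionProbe host:port");
            return;
        }
        System.out.println(probe(args[0]));
    }
}
```

If this call fails or times out from a worker node, the connector will not be able to detect the version from there either, whatever the table properties say.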
3. Mismatch between Hadoop and ES write speeds?
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) ... 9 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception:
Unexpected exception: Could not write all entries [94/1047616] (maybe ES was overloaded?). Bailing out...
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:638)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:748)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:306) ... 13 more
Source code:
public void flush() {
    BulkResponse bulk = tryFlush();
    if (!bulk.getLeftovers().isEmpty()) {
        String header = String.format("Could not write all entries [%s/%s] (Maybe ES was overloaded?). Error sample (first [%s] error messages):\n", bulk.getLeftovers().cardinality(), bulk.getTotalWrites(), bulk.getErrorExamples().size());
        StringBuilder message = new StringBuilder(header);
        for (String errors : bulk.getErrorExamples()) {
            message.append("\t").append(errors).append("\n");
        }
        message.append("Bailing out...");
        throw new EsHadoopException(message.toString());
    }
}
This thread answers the question better: https://discuss.elastic.co/t/pushback-to-hadoop-from-es-on-bulk-load/1535/5
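The flush() above gives up as soon as some entries are still unaccepted after the bulk request. The es-hadoop configuration page lists es.batch.size.entries, es.batch.size.bytes, es.batch.write.retry.count and es.batch.write.retry.wait as the settings that control batch size and retries. To illustrate the idea of retrying only the rejected entries with a pause in between, here is a toy sketch; BulkSink, flushWithRetry and limitedSink are hypothetical names, and this is not the connector's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a bounded retry loop over rejected bulk entries.
final class BulkRetrySketch {
    /** Hypothetical sink: accepts a batch and returns the entries it rejected. */
    interface BulkSink {
        List<String> send(List<String> batch);
    }

    /**
     * Sends the batch, then re-sends only the rejected entries up to retryCount
     * more times, sleeping waitMillis before each retry.
     * Returns whatever is still rejected at the end.
     */
    static List<String> flushWithRetry(BulkSink sink, List<String> batch,
                                       int retryCount, long waitMillis) {
        List<String> leftovers = sink.send(batch);
        for (int attempt = 0; attempt < retryCount && !leftovers.isEmpty(); attempt++) {
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop retrying
                break;
            }
            leftovers = sink.send(leftovers);
        }
        return leftovers;
    }

    /** Toy sink that accepts at most `capacity` entries per call, imitating an overloaded node. */
    static BulkSink limitedSink(final int capacity) {
        return new BulkSink() {
            @Override
            public List<String> send(List<String> batch) {
                int accepted = Math.min(capacity, batch.size());
                return new ArrayList<String>(batch.subList(accepted, batch.size()));
            }
        };
    }

    public static void main(String[] args) {
        List<String> batch = new ArrayList<String>();
        for (int i = 0; i < 10; i++) {
            batch.add("doc-" + i);
        }
        List<String> left = flushWithRetry(limitedSink(3), batch, 3, 10L);
        System.out.println("still rejected: " + left.size());
    }
}
```

Smaller batches and more (or longer) retries give ES more room to catch up, at the cost of a slower job; the right values depend on the cluster.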
4. Index names must be lowercase
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest:
Found unrecoverable error [192.168.200.100:9201] returned Bad Request(400) - Invalid index name [Lots_scenic], must be lowercase;
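Since ES rejects index names containing uppercase letters, the index part of es.resource can be normalized before it goes into the table definition. A minimal sketch, assuming the "index/type" form used earlier; the class name and the type name scenic_list in the usage line are made up for illustration:

```java
import java.util.Locale;

// Hypothetical helper for preparing an es.resource value.
final class EsResourceNames {
    /** Lowercases the index part of an "index/type" resource, leaving the type untouched. */
    static String lowercaseIndex(String resource) {
        int slash = resource.indexOf('/');
        if (slash < 0) {
            return resource.toLowerCase(Locale.ROOT);
        }
        return resource.substring(0, slash).toLowerCase(Locale.ROOT) + resource.substring(slash);
    }

    /** True when the index part is already all lowercase. */
    static boolean hasLowercaseIndex(String resource) {
        return resource.equals(lowercaseIndex(resource));
    }

    public static void main(String[] args) {
        System.out.println(lowercaseIndex("Lots_scenic/scenic_list"));
    }
}
```

This covers only the lowercase rule from the error above; any other naming restrictions ES applies are not checked here.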