Problems querying ES when the _id contains special characters

氪州刺史

Article link: https://blog.csdn.net/wmx3ng/article/details/117425475

Problem

Fetching a document by its ID fails with an error:

GET card_record/_doc/2_900002151162=I1B8PIUB1-66M04493WI71zDLXZliUSgmR9S9eVMLh2/NK3FcIhRi4yf8VU=

Note: the id is produced by processing a string.

Error message

{
  "error": "no handler found for uri [/card_record/_doc/2_900002151162=I1B8PIUB1-66M04493WI71zDLXZliUSgmR9S9eVMLh2/NK3FcIhRi4yf8VU=?pretty] and method [GET]"
}
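
One plausible reading of this error, offered as an assumption rather than something the message itself states: the `/` inside the id is treated as a path separator, so the URI has one more segment than a `/{index}/_doc/{id}` route expects. The sketch below counts path segments before and after percent-encoding the id. `SlashInIdSketch`, `segmentCount` and `encodeSegment` are hypothetical names, and only JDK classes are used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SlashInIdSketch {

    // Count the non-empty segments of a request path such as /index/_doc/id.
    static int segmentCount(String path) {
        int count = 0;
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Percent-encode a value so that it fits into a single path segment.
    // URLEncoder targets form data and writes a space as '+', so that is rewritten to %20.
    static String encodeSegment(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String id = "2_900002151162=I1B8PIUB1-66M04493WI71zDLXZliUSgmR9S9eVMLh2/NK3FcIhRi4yf8VU=";
        String rawPath = "/card_record/_doc/" + id;
        String encodedPath = "/card_record/_doc/" + encodeSegment(id);
        System.out.println(segmentCount(rawPath));     // 4: the slash splits the id in two
        System.out.println(segmentCount(encodedPath)); // 3: index, _doc, id
        System.out.println(encodedPath);
    }
}
```

With the slash written as %2F the id stays in one segment, which is the form a hand-written request in Kibana would need.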

Reproducing the problem

Index a document with the index API

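// client is assumed to be an already-initialized RestHighLevelClient
IndexRequest request = new IndexRequest();
request.index("index-test2");
// the id contains '+', which has a special meaning in URLs
request.id("1+1");
Map<String, Object> doc = new HashMap<>();
doc.put("doc", 1);
doc.put("type", "index"); // marks the document as written through the index API
request.source(doc);
try {
    // the index API puts the id into the request path
    IndexResponse response = client.index(request, RequestOptions.DEFAULT);
    System.out.println("response = " + JSONObject.toJSONString(response));
} catch (IOException e) {
    e.printStackTrace();
}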

Index a document with bulk

// same index and same id as before, but sent through the bulk API
IndexRequest request = new IndexRequest();
request.index("index-test2");
request.id("1+1");
Map<String, Object> doc = new HashMap<>();
doc.put("doc", 2);
doc.put("type", "bulk"); // marks the document as written through bulk
request.source(doc);

BulkRequest bulkRequest = new BulkRequest();
bulkRequest.add(request);
try {
    // bulk sends the id inside the request body, not in the path
    BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);
    System.out.println("response = " + JSONObject.toJSONString(response));
} catch (IOException e) {
    e.printStackTrace();
}

Check the result in Kibana

POST index-test2/_search

Response:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index-test2",
        "_type" : "_doc",
        "_id" : "1 1",
        "_score" : 1.0,
        "_source" : {
          "doc" : 1,
          "type" : "index"
        }
      },
      {
        "_index" : "index-test2",
        "_type" : "_doc",
        "_id" : "1+1",
        "_score" : 1.0,
        "_source" : {
          "doc" : 2,
          "type" : "bulk"
        }
      }
    ]
  }
}

The document written through the index API is stored with _id "1 1" (the + has become a space), while the one written through bulk keeps _id "1+1".

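Cause analysis

As with a browser request, the ES server URL-decodes the request URI when it receives it, so a + in the path is read as a space; this is ordinary URL handling rather than anything specific to ES. The bulk API carries the _id in the request body instead of the URI, which is why the bulk-indexed document keeps its "1+1".
The obvious fix is to call URLEncode on the _id before calling client methods such as index, get and exists.
Reading the client source shows that this does not work: the client builds the request path internally with the endpoint method, and that method encodes differently from URLEncode.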

static String endpoint(String index, String type, String id, String endpoint) {
    return new EndpointBuilder().addPathPart(index, type, id).addPathPartAsIs(endpoint).build();
}
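
To make the difference from URLEncode concrete, here is a minimal sketch. It assumes that the client's addPathPart percent-encodes each part the way the multi-argument constructors of java.net.URI do and then escapes any remaining `/`; that is an approximation for illustration, not the client's actual code. `PathPartSketch`, `encodePart` and `urlEncode` are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

final class PathPartSketch {

    // Assumed approximation of path-part encoding: let java.net.URI quote the
    // characters that are illegal in a path, then escape '/' by hand.
    static String encodePart(String part) {
        try {
            // an empty (not null) authority keeps a leading '/' in the part inside the path
            URI uri = new URI(null, "", "/" + part, null, null);
            return uri.getRawPath().substring(1).replace("/", "%2F");
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot encode [" + part + "]", e);
        }
    }

    // Form-style encoding as done by java.net.URLEncoder.
    static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // '+' is legal in a path, so path encoding leaves it alone
        System.out.println(encodePart("1+1") + " vs " + urlEncode("1+1"));     // 1+1 vs 1%2B1
        // a space becomes %20 in a path but '+' in form encoding
        System.out.println(encodePart("1 1") + " vs " + urlEncode("1 1"));     // 1%201 vs 1+1
        // '%' is always escaped again, so a pre-encoded id gets double-encoded
        System.out.println(encodePart("1%2B1"));                               // 1%252B1
    }
}
```

Under this assumption a `+` reaches the server unescaped, and an id that was already URL-encoded gets its `%` escaped a second time.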

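Trying it out:
- After URLEncode the id may contain %, and the endpoint method escapes that % again, so "1+1" ends up as "1%252B1";
- On receiving the request, the server operates on the document whose _id is "1%2B1":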

{
  "_index" : "index-test2",
  "_type" : "_doc",
  "_id" : "1%2B1",
  "_score" : 1.0,
  "_source" : {
    "doc" : 1,
    "type" : "index"
  }
}
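
The server side of these results can be imitated with a small sketch. It uses java.net.URLDecoder as a stand-in for the server's decoding step, on the assumption that the server unescapes %XX sequences once and reads + as a space, which is consistent with the _id values observed in this article. `ServerDecodeSketch` and `decodeOnce` are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class ServerDecodeSketch {

    // Stand-in for the server-side decoding of one path part (assumption:
    // a single pass of percent-decoding in which '+' is read as a space).
    static String decodeOnce(String rawPathPart) {
        try {
            return URLDecoder.decode(rawPathPart, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // id passed as-is: the path carries 1+1 and the server sees "1 1"
        System.out.println(decodeOnce("1+1"));
        // id URL-encoded first: the path carries 1%252B1 and the server sees "1%2B1"
        System.out.println(decodeOnce("1%252B1"));
        // only a path that carries 1%2B1 would decode back to "1+1"
        System.out.println(decodeOnce("1%2B1"));
    }
}
```

In neither of the two cases tried does the server end up with the original "1+1": passing the id as-is loses the +, and URL-encoding it first leaves one level of escaping in the stored _id.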