Hard-to-find bugs (updated from time to time)

✿ﾟ卡笨卡

Last modified 2024-05-22 17:00:25

Categories: java, Notes, ES. Tag: bug

First published 2023-07-13 14:37:03

Article link: https://blog.csdn.net/tian__c/article/details/130126873

1. Possible problems when deleting from and inserting into the same table in one operation

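When one transaction deletes from and then inserts into the same table, the statements do not necessarily reach the database in the order the code issues them, because the persistence layer buffers pending changes until it flushes. Calling .flush() right after the delete pushes the buffered statements out, so the DELETE is executed first and the inserts follow.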

@Transactional(rollbackFor = Exception.class)

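// Remove the old rows for this file first
stdSystemMatchDao.deleteByDataId(fileId);
// Flush so the DELETE is sent to the database before any INSERT
stdSystemMatchDao.flush();
// Now insert the new rows
stdSystemMatchDao.saveAll(stdSystemMatcheList);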
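
Why the intermediate flush matters can be illustrated with a toy model. This is only a sketch and not how any real persistence provider is implemented: `ToySession`, the `file-1` key and the "inserts before deletes" flush order are assumptions made for illustration (Hibernate, for instance, is documented to run queued inserts before queued deletes, but the exact behaviour depends on the provider and on how `deleteByDataId` is defined).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FlushOrderDemo {
    /** Toy write-behind session: changes are queued and only reach the "table" on flush. */
    static class ToySession {
        private final Set<String> table = new HashSet<>(); // rows identified by a unique key
        private final List<String> pendingInserts = new ArrayList<>();
        private final List<String> pendingDeletes = new ArrayList<>();

        ToySession(String... existingKeys) {
            for (String key : existingKeys) {
                table.add(key);
            }
        }

        void insert(String key) { pendingInserts.add(key); }

        void delete(String key) { pendingDeletes.add(key); }

        /** Assumed ordering: queued inserts run before queued deletes, whatever the call order. */
        void flush() {
            for (String key : pendingInserts) {
                if (!table.add(key)) {
                    throw new IllegalStateException("duplicate key: " + key);
                }
            }
            pendingInserts.clear();
            for (String key : pendingDeletes) {
                table.remove(key);
            }
            pendingDeletes.clear();
        }

        boolean contains(String key) { return table.contains(key); }
    }

    /** Delete and insert are queued together: the insert runs first and hits the old row. */
    static boolean replaceWithoutFlush() {
        ToySession session = new ToySession("file-1");
        session.delete("file-1");
        session.insert("file-1");
        try {
            session.flush();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Flushing in between forces the delete to be applied before the insert is queued. */
    static boolean replaceWithFlush() {
        ToySession session = new ToySession("file-1");
        session.delete("file-1");
        session.flush();
        session.insert("file-1");
        session.flush();
        return session.contains("file-1");
    }

    public static void main(String[] args) {
        System.out.println("without flush succeeded: " + replaceWithoutFlush());
        System.out.println("with flush succeeded: " + replaceWithFlush());
    }
}
```

In this model the replace only succeeds when the delete is flushed before the insert is queued, which mirrors the effect of the flush() call in the snippet above.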

2. nginx 504 Gateway Time-out

The request to the backend timed out; nginx's default proxy timeouts are 60s.

client_max_body_size 300m;
proxy_connect_timeout    600;	# default 60s
proxy_read_timeout       600;	# default 60s
proxy_send_timeout       600;	# default 60s

3. A pitfall with the default ES analyzer

Suppose we have a document like this:

POST /test/doc/1
{
    "message": "13-Aug-2019 22:20:57.025 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in 97207 ms"
}

We write a query to search for it:

GET /test/_search
{
    "query": {
        "match" : {
            "message" : "apache"
        }
    }
}

---result---

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
    }
}

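Nothing is found.
Reason: the default ES analyzer does not treat a period . between letters as a token boundary, so org.apache.catalina.startup.Catalina.start is indexed as a single term, and a query for apache cannot match it.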
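
The effect can be reproduced outside ES with a small sketch. The two regular expressions below are only rough stand-ins and not the real analyzer: one keeps a period between letters or digits inside a token, the other also breaks on it. `DotTokenDemo` and `tokenize` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotTokenDemo {
    // Approximation of the default behaviour: a '.' between two runs of
    // letters/digits does not end the token.
    private static final Pattern KEEP_DOTS =
            Pattern.compile("[\\p{L}\\p{N}]+(?:\\.[\\p{L}\\p{N}]+)*");
    // Alternative that treats '.' like any other separator.
    private static final Pattern SPLIT_DOTS = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Returns the lowercased tokens of the text under one of the two rules. */
    static List<String> tokenize(String text, boolean splitOnDot) {
        Matcher matcher = (splitOnDot ? SPLIT_DOTS : KEEP_DOTS).matcher(text);
        List<String> tokens = new ArrayList<>();
        while (matcher.find()) {
            tokens.add(matcher.group().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String message = "13-Aug-2019 22:20:57.025 INFO [main] "
                + "org.apache.catalina.startup.Catalina.start Server startup in 97207 ms";
        // With dots kept, "apache" is buried inside one long token.
        System.out.println(tokenize(message, false).contains("apache"));
        // With dots treated as separators, "apache" becomes its own token.
        System.out.println(tokenize(message, true).contains("apache"));
    }
}
```

A match query for apache can only hit the document when apache exists as a term of its own in the index, which is the case only under the second rule.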

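Solution: when creating the index, configure the field with an analyzer that also splits on periods. The ik analyzer is recommended here (see other tutorials for installing it). It offers two modes, followed by the mapping configuration: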

ik_max_word: splits the text at the finest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国,中华人民,中华,华人,人民共和国,人民,人,民,共和国,共和,和,国国,国歌”, exhausting all possible combinations. Suitable for Term Query.

ik_smart: splits the text at the coarsest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国,国歌”. Suitable for Phrase queries.

  PUT /test
  {
      "mappings": {
          "doc": {
              "properties": {
                  "message": {
                      "type": "text",
                      "analyzer": "ik_max_word"
                  }
              }
          }
      }
  }

4. Whole-field (exact) matching with an ES must query

Bug: querying by name with a whole-field match returns no results.

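A keyword field is simply a field that is not analyzed. The main problem here is that `name` and `name.keyword` are two completely different fields: the former is text, the latter is keyword. A term query on name is matched against a text field, and a text field has already been split into tokens, so the complete value cannot match.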

(Figure: the mapping of name)
An ordinary must query:

GET power_engin/_search
{
    "from": 0,
    "size": 10,
    "query": {
        "bool": {
            "must": [
              {
                "term": {
                    "name": {
                        "value": "尼泊尔Bheri-1水电站（勘测）",
                        "boost": 1
                    }
                }
              }
            ]
        }
    }
}

No data is found, because the name field has been split into tokens.

To match the whole field, query name.keyword instead:

GET power_engin/_search
{
    "from": 0,
    "size": 10,
    "query": {
        "bool": {
            "must": [
              {
                "term": {
                    "name.keyword": {
                        "value": "尼泊尔Bheri-1水电站（勘测）",
                        "boost": 1
                    }
                }
              }
            ]
        }
    }
}
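
The difference between the two queries above can be modelled with a small sketch. It is a toy model and not the ES implementation: `indexAsText`, `indexAsKeyword` and `termQuery` are hypothetical helpers, the tokenization is a crude approximation, and the name value is an ASCII stand-in for the one used in the queries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TermVsKeywordDemo {
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Toy stand-in for a text field: the value is analyzed into lowercase tokens. */
    static List<String> indexAsText(String value) {
        List<String> terms = new ArrayList<>();
        Matcher matcher = WORD.matcher(value);
        while (matcher.find()) {
            terms.add(matcher.group().toLowerCase(Locale.ROOT));
        }
        return terms;
    }

    /** Toy stand-in for a keyword field: the whole value is stored as a single term. */
    static List<String> indexAsKeyword(String value) {
        return Collections.singletonList(value);
    }

    /** A term query does not analyze its input; it looks for exactly this term. */
    static boolean termQuery(List<String> indexedTerms, String queryValue) {
        return indexedTerms.contains(queryValue);
    }

    public static void main(String[] args) {
        String name = "Bheri-1 Hydropower Station"; // hypothetical stand-in value
        // Against the analyzed field the full value is not among the indexed terms.
        System.out.println(termQuery(indexAsText(name), name));
        // Against the keyword field the full value is the one indexed term.
        System.out.println(termQuery(indexAsKeyword(name), name));
    }
}
```

In this model a term query on the text field would only match individual tokens such as bheri, while the keyword field matches the complete, unmodified value.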