Elasticsearch Best Practices

canfengli

Article link: https://blog.csdn.net/canfengli/article/details/101222124

ES Best Practices (6.x)

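Create one index per day, to support ingesting on the order of a billion log entries per day.

Logs are written to one index per day. A query first works out which indices cover the requested time range, for example log_20190801 and log_20190802, and then searches only those indices in ES.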
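The index lookup step can be sketched in Python as follows. This is a minimal sketch, not the author's implementation: the helper name `daily_indices` and its `prefix` parameter are assumptions made for illustration, and the naming pattern simply follows the log_20190801 example above.

```python
from datetime import date, timedelta


def daily_indices(start: date, end: date, prefix: str = "log_") -> list[str]:
    """Return the daily index names covering start..end, both inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    days = (end - start).days + 1
    return [
        f"{prefix}{(start + timedelta(days=i)).strftime('%Y%m%d')}"
        for i in range(days)
    ]


# Indices to search for a query spanning 1-2 August 2019.
indices = daily_indices(date(2019, 8, 1), date(2019, 8, 2))
print(",".join(indices))  # log_20190801,log_20190802
```

The comma-separated names can be used as the index part of a search request, for example GET log_20190801,log_20190802/_search, so that only the relevant daily indices are touched.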

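Use dynamic templates together with explicitly mapped fields, and prefer explicit mappings. Map string fields as keyword so that they are not analyzed.

For exact-match queries: use term/terms inside filter, which matches the complete field value and lets ES cache the result.
For fuzzy matching: use wildcard + term/terms + filter.
Specify the time format as "format": "yyyyMMdd HH:mm:ss".
Leaving fields unanalyzed reduces log storage, and no separate sub-field is needed to hold the complete value.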
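The query style described above might look like the following request body, built here in Python. It is a sketch under assumptions: the function name `build_log_query` and the sample values "order-service" and "*timeout*" are hypothetical, while the field names Time, ServiceName and ResulrMsg are taken from the template in this article.

```python
import json
from datetime import datetime

# Python equivalent of the mapping format "yyyyMMdd HH:mm:ss".
TIME_FORMAT = "%Y%m%d %H:%M:%S"


def build_log_query(service_name: str, msg_pattern: str,
                    start: datetime, end: datetime) -> dict:
    """Build a bool query whose clauses all run in filter context (no scoring)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Exact match on the whole keyword value.
                    {"term": {"ServiceName": service_name}},
                    # Pattern match on a keyword field; * matches any character sequence.
                    {"wildcard": {"ResulrMsg": msg_pattern}},
                    # Time bounds written in the format declared in the mapping.
                    {"range": {"Time": {
                        "gte": start.strftime(TIME_FORMAT),
                        "lte": end.strftime(TIME_FORMAT),
                    }}},
                ]
            }
        }
    }


body = build_log_query("order-service", "*timeout*",
                       datetime(2019, 8, 1), datetime(2019, 8, 1, 23, 59, 59))
print(json.dumps(body, indent=2))
```

Patterns that begin with * make ES scan many terms, so they are noticeably slower than patterns with a fixed prefix; narrowing the search with term and range clauses first keeps the cost down.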

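Keep the number of replica shards as low as possible. If high availability is required, raise it moderately; otherwise set the replica count to 0 or 1.

The main reason is the sheer log volume: with too many shard copies, the cluster comes under pressure and runs out of capacity. "number_of_replicas": "0" sets the number of replica copies of each primary shard to 0.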

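Lengthen the refresh interval to raise indexing throughput. Frequent refreshes make newly indexed documents searchable sooner, but they lower throughput.

"refresh_interval": "5s" sets the refresh interval to 5s, compared with the default of 1s.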
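The two settings above can be assembled as a small settings body. This is a minimal sketch: the helper name `log_index_settings` is an assumption for illustration, and the defaults simply mirror the values quoted above.

```python
import json


def log_index_settings(replicas: int = 0, refresh_interval: str = "5s") -> dict:
    """Build the replica and refresh settings discussed above for a log index."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    return {
        "index": {
            "number_of_replicas": str(replicas),
            "refresh_interval": refresh_interval,
        }
    }


print(json.dumps(log_index_settings(), indent=2))
```

Both are dynamic index settings, so the same body can be placed in an index template or sent to an existing index with PUT <index>/_settings.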

Reference ES dynamic template

Declare as many fields of known type as possible under properties.

A working dynamic template

PUT _template/log

{
  "order": 0,
  "index_patterns": [
    "log_*"
  ],
  "settings": {
    "index": {
      "analysis.analyzer.default.type": "ik_max_word",
      "number_of_shards": "32",
      "number_of_replicas": "1",
      "refresh_interval": "60s"
    }
  },
  "mappings": {
    "doc": {
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "index": true,
              "type": "keyword"
            }
          }
        }
      ],
      "properties": {
        "Time": {
          "type": "date",
          "format": "yyyyMMdd HH:mm:ss"
        },
        "ServiceName": {
          "type": "keyword"
        },
        "ResulrMsg": {
          "type": "keyword"
        },
        "Exfield": {
          "type": "text",
          "analyzer": "ik_max_word",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

Test case

POST log_20190903/doc

{
  "Time": "20190924 18:00:00",
  "ServiceName": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首",
  "ResulrMsg":"哈哈hh哈哈哈哈，nishi你是yige你是一个hello",
  "Integer":100,
  "Float":100.0,
  "StrTest":"展示次数超过1000，100,10000的ip",
  "Exfield":"这是一条默认分词的字段，类型为text，keyword是一个子字段属性，默认类型是keyword，不分词，可设置ignore_above:256"
}

The string values are kept in Chinese as sample data. The Exfield value reads: "This is a field that is analyzed by default, of type text; keyword is a sub-field whose type is keyword and which is not analyzed; ignore_above: 256 can be set on it."

Dynamic template explained

{
  "aion": {
    "order": 0,
    "index_patterns": [
      "aion_*"
    ],
    "settings": {
      "index": {
        "analysis": {...},                // custom analyzers
        "number_of_shards": "32",         // number of primary shards
        "number_of_replicas": "0",        // number of replica copies of each primary shard
        "refresh_interval": "5s"          // refresh interval
      }
    },
    "mappings": {
      "doc": {
        "dynamic_templates": [
          {
            "string_fields": {                                  // name of the dynamic template, conventionally "<type>_fields"
              "match": "*",                                     // match every field name
              "match_mapping_type": "string",                   // apply only to values detected as string
              "mapping": {
                "fielddata": { "format": "disabled" },          // disable fielddata, which is enabled by default on analyzed fields (legacy pre-5.x syntax)
                "analyzer": "only_words_analyzer",              // analyzer used by the field; the default is the standard analyzer
                "index": "not_analyzed",                        // index the field without analyzing it; the default is analyzed (legacy pre-5.x syntax, 6.x expects true or false)
                "omit_norms": false,                            // true would drop norms (length normalization and index-time boost); default for analyzed fields is false (legacy syntax)
                "type": "keyword"                               // map the field as keyword
              }
            }
          }
        ],
        "properties": {
          "Time": {
            "type": "date",
            "format": "yyyyMMdd HH:mm:ss"
          },
          "ServiceName": {
            "type": "keyword"
          }
        }
      }
    },
    "aliases": {}
  }
}
