Elasticsearch-28. Part 2 Summary and Quiz

飘然渡沧海

Article link: https://blog.csdn.net/zhougubei/article/details/124130679

Elasticsearch

Part 2 Summary and Quiz

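Review: Search and Scoring

Structured vs. unstructured search
- Term queries look up the exact indexed term without analyzing the input, whereas full-text Match queries analyze the input first
- Fields that need exact matching or aggregation should be mapped as Keyword
Query Context vs. Filter Context
- Filter Context skips scoring and can make use of caching
- In a Bool query, both Filter and Must Not clauses run in Filter Context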
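
As a rough illustration of the two contexts, the sketch below builds a Bool query body as a plain Python dictionary. The field names (`title`, `status`, `category`) and their values are hypothetical; only the `bool`, `must`, `filter`, and `must_not` keys come from the query DSL.

```python
import json


def build_bool_query(text, status, excluded_category):
    """Build a bool query body: `must` is scored, `filter` and `must_not` are not."""
    return {
        "query": {
            "bool": {
                # Query context: this clause contributes to _score.
                "must": [{"match": {"title": text}}],
                # Filter context: yes/no matching only, eligible for caching.
                "filter": [{"term": {"status": status}}],
                # Also filter context: excludes documents without scoring them.
                "must_not": [{"term": {"category": excluded_category}}],
            }
        }
    }


body = build_bool_query("elasticsearch", "published", "draft")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request; moving a clause from `must` to `filter` keeps the same matching documents but removes that clause's influence on ranking.
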
Search scoring
- TF-IDF / field Boosting
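
To make the scoring idea concrete, here is a minimal textbook-style TF-IDF sketch with a boost multiplier. It is not Lucene's exact formula (and Elasticsearch has used BM25 as its default similarity since 5.0); the tiny corpus and the `tf_idf_score` helper are made up for illustration.

```python
import math


def tf_idf_score(term, doc_tokens, corpus, boost=1.0):
    """Textbook TF-IDF: term frequency times inverse document frequency, times boost."""
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for tokens in corpus if term in tokens)
    if df == 0:
        return 0.0
    # Terms that appear in fewer documents get a larger IDF.
    idf = math.log(len(corpus) / df) + 1.0
    return boost * tf * idf


# Hypothetical, already-tokenized documents.
corpus = [
    ["elasticsearch", "is", "a", "search", "engine"],
    ["search", "engines", "rank", "documents"],
    ["elasticsearch", "stores", "documents", "in", "shards"],
]

rare = tf_idf_score("shards", corpus[2], corpus)
common = tf_idf_score("documents", corpus[2], corpus)
boosted = tf_idf_score("documents", corpus[2], corpus, boost=2.0)
print(rare, common, boosted)
```

Both terms occur once in the third document, but "shards" appears in fewer documents and therefore scores higher than "documents"; the boost simply scales the score of the clause it is applied to.
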
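Single-string, multi-field queries: multi_match
- best_fields / most_fields / cross_fields
Improving search relevance
- Multiple languages: use sub-fields with different analyzers to improve search results
- Search Template: decouple application code from the search DSL
- Test often; monitor and analyze users' search queries and the quality of the results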

Review: Aggregations / Pagination

Aggregations
- Bucket / Metric / Pipeline
Pagination
- From & Size / Search After / Scroll API
- Avoid deep pagination; for operations such as data export, the Scroll API can be used
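
The cost of deep pagination can be sketched with a small simulation. It assumes a simplified model in which every hit is just a unique numeric sort value and each shard is a plain list; `from_size_page` and `search_after_page` are hypothetical helpers, not Elasticsearch APIs.

```python
import heapq


def from_size_page(shards, from_, size):
    """From & Size: each shard must return its top (from_ + size) hits."""
    per_shard = [heapq.nlargest(from_ + size, shard) for shard in shards]
    collected = sum(len(hits) for hits in per_shard)
    merged = sorted((s for hits in per_shard for s in hits), reverse=True)
    return merged[from_:from_ + size], collected


def search_after_page(shards, after, size):
    """Search After: each shard returns only `size` hits below the last sort value."""
    per_shard = [
        heapq.nlargest(size, (s for s in shard if after is None or s < after))
        for shard in shards
    ]
    collected = sum(len(hits) for hits in per_shard)
    merged = sorted((s for hits in per_shard for s in hits), reverse=True)
    return merged[:size], collected


# Three hypothetical shards holding 1000 unique sort values each.
shards = [list(range(i, 3000, 3)) for i in range(3)]

deep_page, deep_cost = from_size_page(shards, from_=990, size=10)
same_page, cheap_cost = search_after_page(shards, after=2010, size=10)
print(deep_cost, cheap_cost)
```

Both calls return the same ten hits, but From & Size makes the coordinating node collect `number_of_shards * (from + size)` entries, while the Search After variant collects only `number_of_shards * size`. This growth is why `from + size` is capped by `index.max_result_window` (10,000 by default).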

Review: Elasticsearch's Distributed Model

Distributed document storage
- A hash algorithm routes each document to the shard where it is stored
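
The routing step can be sketched as `shard = hash(_routing) % number_of_primary_shards`, where `_routing` defaults to the document `_id`. This is a simplified form of the rule, and the example below uses `zlib.crc32` as a stand-in hash purely because it ships with Python; Elasticsearch itself uses Murmur3, so the shard numbers printed here will not match a real cluster.

```python
import zlib


def route_to_shard(routing, number_of_primary_shards):
    """Pick a shard from the routing value (simplified; crc32 is a stand-in hash)."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = ["1", "2", "3", "4", "5", "6"]

# Placement with 3 primary shards versus 4 primary shards.
with_three = {doc_id: route_to_shard(doc_id, 3) for doc_id in doc_ids}
with_four = {doc_id: route_to_shard(doc_id, 4) for doc_id in doc_ids}

print(with_three)
print(with_four)
```

Because the shard number depends on the primary shard count, changing that count would send existing IDs to different shards, which is one way to see why the number of primary shards is fixed when an index is created.
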
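How shards work internally
- Segment / Transaction Log / Refresh / Merge
Internals of distributed search and aggregation
- Query Then Fetch; IDF is computed per shard rather than globally, so scoring can be inaccurate when there is little data
- Increasing "shard_size" can improve the accuracy of Terms aggregations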
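
The effect of "shard_size" can be reproduced with a toy model: each shard reports only its top `shard_size` terms, and the coordinating node sums what it receives. The shard contents and the `terms_agg` helper below are invented for illustration and ignore details such as tie-breaking and error bounds.

```python
from collections import Counter


def terms_agg(shards, size, shard_size):
    """Merge each shard's top `shard_size` term counts, then keep the top `size`."""
    merged = Counter()
    for shard in shards:
        for term, count in Counter(shard).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


# Hypothetical per-shard term counts: "b" is never first locally, but is first overall.
shards = [
    {"a": 6, "b": 5},
    {"c": 6, "b": 5},
    {"d": 6, "b": 5},
]

narrow = terms_agg(shards, size=1, shard_size=1)
wide = terms_agg(shards, size=1, shard_size=2)
print(narrow, wide)
```

With `shard_size=1` the term "b" is dropped on every shard and never reaches the coordinating node, even though its true total (15) is the highest. Asking each shard for more candidates fixes the result at the cost of more data transferred; this is why the default `shard_size` is larger than `size` (`size * 1.5 + 10`).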

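Review: Data Modeling and Why It Matters

Data modeling
- How ES handles relationships between documents / common data-modeling steps / modeling best practices
Modeling-related tools
- Index Template / Dynamic Template / Ingest Node / Update By Query / Reindex / Index Alias
Best practices
- Avoid too many fields / avoid wildcard queries / define fields appropriately in the Mapping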

Quiz

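# Start from a clean index (returns 404 if "test" does not exist yet)
DELETE test
# Index one document; with default dynamic mapping, "content" becomes a text field with a "content.keyword" sub-field
PUT test/_doc/1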
{
  "content":"Hello World"
}

POST test/_search
{
  "profile": "true",
  "query": {
    "match": {
      "content": "Hello World"
    }
  }
}

POST test/_search
{
  "profile": "true",
  "query": {
    "match": {
      "content": "hello world"
    }
  }
}

POST test/_search
{
  "profile": "true",
  "query": {
    "match": {
      "content.keyword": "Hello World"
    }
  }
}

POST test/_search
{
  "profile": "true",
  "query": {
    "match": {
      "content.keyword": "hello world"
    }
  }
}

POST test/_search
{
  "profile": "true",
  "query": {
    "term": {
      "content": "Hello World"
    }
  }
}

POST test/_search
{
  "profile": "true",
  "query": {
    "term": {
      "content": "hello world"
    }
  }
}

POST test/_search
{
  "profile": "true",
  "query": {
    "term": {
      "content.keyword": "Hello World"
    }
  }
}
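
One way to reason about which of the seven requests above return the document is to model the index by hand. The sketch below assumes the default dynamic mapping (a text field analyzed by the standard analyzer, plus a `content.keyword` sub-field) and approximates the standard analyzer as "lowercase and split on whitespace", which is enough for this input but not a faithful reimplementation.

```python
def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


doc = "Hello World"
index = {
    "content": set(analyze(doc)),   # text field: terms "hello" and "world"
    "content.keyword": {doc},       # keyword sub-field: one untouched term
}


def term_query(field, value):
    """Term query: the input is not analyzed; it must equal an indexed term."""
    return value in index[field]


def match_query(field, value):
    """Match query: the input goes through the field's analyzer (none for keyword)."""
    terms = [value] if field.endswith(".keyword") else analyze(value)
    return any(term in index[field] for term in terms)


# The seven quiz requests, in the order they appear above.
requests = [
    ("match", "content", "Hello World"),
    ("match", "content", "hello world"),
    ("match", "content.keyword", "Hello World"),
    ("match", "content.keyword", "hello world"),
    ("term", "content", "Hello World"),
    ("term", "content", "hello world"),
    ("term", "content.keyword", "Hello World"),
]

results = []
for kind, field, value in requests:
    run = match_query if kind == "match" else term_query
    results.append(run(field, value))
    print(kind, field, repr(value), "->", "hit" if results[-1] else "no hit")
```

Under this model, both Term queries on `content` miss because the indexed terms are `hello` and `world`, neither of which equals the whole input string, while the lowercase input against `content.keyword` misses because that field stores the original casing. The `profile` output of the real requests can be used to confirm the rewritten Lucene query in each case.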