ElasticSearch

Latest recommended article published on 2023-09-06 14:36:50

代码还是烂到家

Category: java. Tags: elasticsearch, search engine, java

Article link: https://blog.csdn.net/sxj159753/article/details/106966443

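Introduction to ElasticSearch

Elasticsearch is a near-real-time (NRT), distributed engine for storage, search, and analytics.

What is it used for?

Think of searching on the web, on e-commerce sites, or on job boards: each of them has a search box, and all of these can be implemented with Elasticsearch (ES).
Elasticsearch is only one member of a larger ecosystem, the Elastic Stack.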

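How do I download it?

Go to the official website and download the version you need.
Download page:
https://www.elastic.co/cn/downloads/elasticsearch

What does the directory structure look like?
[Figure: Elasticsearch installation directory structure]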

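Inverted index

Forward index and inverted index

A forward index maps each document ID to the document's content; an inverted index maps each term to the documents that contain it.

Core components of an inverted index

There are two parts:

1. Term dictionary: records every term that appears in the documents, along with the mapping from each term to its posting list.

The term dictionary is usually large. It can be implemented with a B+ tree or a hash table with chaining to keep insertion and lookup fast.

2. Posting list: records the set of documents that contain a term. It is made up of postings.

Posting

Document ID
Term frequency (TF): the number of times the term occurs in the document, used for relevance scoring
Position: the position of the term among the document's tokens, used for phrase search
Offset: the start and end character offsets of the term, used for highlighting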
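
The structure described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the sample documents, the Posting class, and build_inverted_index are made up for this note, the term dictionary is a plain dict rather than a B+ tree, and tokenization is a simple regex split, not what Lucene actually does.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Posting:
    """One entry of a posting list: where a term occurs in one document."""
    doc_id: int
    term_frequency: int = 0
    positions: list = field(default_factory=list)   # token index within the document
    offsets: list = field(default_factory=list)     # (start, end) character offsets


def build_inverted_index(docs):
    """Map each lowercased term to its posting list, ordered by document ID."""
    index = defaultdict(dict)   # term -> {doc_id: Posting}
    for doc_id in sorted(docs):
        for position, match in enumerate(re.finditer(r"\w+", docs[doc_id])):
            term = match.group().lower()
            posting = index[term].setdefault(doc_id, Posting(doc_id))
            posting.term_frequency += 1
            posting.positions.append(position)
            posting.offsets.append((match.start(), match.end()))
    return {term: list(by_doc.values()) for term, by_doc in index.items()}


# Hypothetical sample documents, not taken from the article.
docs = {
    1: "Mastering Elasticsearch",
    2: "Elasticsearch Server",
    3: "Elasticsearch essentials and Elasticsearch basics",
}
index = build_inverted_index(docs)
for posting in index["elasticsearch"]:
    print(posting)
```

Looking up a term is then a dictionary access followed by a walk over its postings; the positions support phrase matching and the offsets support highlighting.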


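Characteristics:

Every field in a JSON document has its own inverted index.

You can choose not to index certain fields.

Advantage: saves storage space
Disadvantage: the field cannot be searched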

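Analysis and Analyzer

Analysis is the process of converting full text into a sequence of terms (tokens); it is also called tokenization.

Analysis is carried out by an analyzer.

You can use the analyzers built into Elasticsearch or define custom ones as needed.

Terms are produced not only when data is written: at query time the query string must also be analyzed with the same analyzer so that it matches the indexed terms.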

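Components of an analyzer

An analyzer is the component dedicated to tokenization. It consists of three parts:

Character filters (process the raw text, for example stripping HTML), a tokenizer (splits the text into terms according to rules), and token filters (post-process the terms: lowercase them, remove stopwords, add synonyms)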
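
To make the three stages concrete, here is a rough Python imitation of the pipeline. It is not how Elasticsearch implements analyzers; the function names, the regexes, and the tiny stopword list are all assumptions chosen for the example.

```python
import re

STOPWORDS = {"a", "an", "the", "is", "in"}   # tiny illustrative list


def strip_html(text):
    """Character filter: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def word_tokenizer(text):
    """Tokenizer: split the text into runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]


def remove_stopwords(tokens):
    """Token filter: drop tokens found in STOPWORDS."""
    return [t for t in tokens if t not in STOPWORDS]


def analyze(text, char_filters, tokenizer, token_filters):
    """Run the three stages in order: character filters, tokenizer, token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


tokens = analyze(
    "<p>The Quick fox is in <b>Elasticsearch</b></p>",
    char_filters=[strip_html],
    tokenizer=word_tokenizer,
    token_filters=[lowercase, remove_stopwords],
)
print(tokens)
```

The order matters: stopword removal runs after lowercasing, otherwise "The" would not be found in the lowercase stopword list.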

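Built-in analyzers in Elasticsearch

Standard Analyzer: the default analyzer; splits on word boundaries and lowercases
Simple Analyzer: splits on non-letter characters (symbols are discarded) and lowercases
Stop Analyzer: lowercases and removes stopwords
Whitespace Analyzer: splits on whitespace; does not lowercase
Keyword Analyzer: no tokenization; the input is emitted as a single term
Pattern Analyzer: splits by regular expression; the default is \W+ (split on non-word characters)
Language analyzers: analyzers for more than 30 common languages
Custom Analyzer: a user-defined analyzer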

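Using the _analyze API

It can be used in three ways:

Test a named analyzer directly
Test the analyzer of a field in an index
Test a custom combination of tokenizer and filters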
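
The three ways differ only in the request path and body. The Python sketch below just builds those paths and bodies and prints them; it does not send anything. The sample text, the index name books, and the field name title are hypothetical, and the helper functions are made up for this note.

```python
import json

SAMPLE_TEXT = "Mastering Elasticsearch"   # hypothetical sample text


def analyze_with_analyzer(analyzer, text):
    """1. Name an analyzer directly: GET /_analyze"""
    return "/_analyze", {"analyzer": analyzer, "text": text}


def analyze_with_field(index, field_name, text):
    """2. Use the analyzer configured for a field: POST /<index>/_analyze"""
    return f"/{index}/_analyze", {"field": field_name, "text": text}


def analyze_with_custom(tokenizer, token_filters, text):
    """3. Combine a tokenizer and token filters ad hoc: POST /_analyze"""
    return "/_analyze", {"tokenizer": tokenizer, "filter": token_filters, "text": text}


requests = [
    analyze_with_analyzer("standard", SAMPLE_TEXT),
    analyze_with_field("books", "title", SAMPLE_TEXT),      # hypothetical index and field
    analyze_with_custom("standard", ["lowercase"], SAMPLE_TEXT),
]
for path, body in requests:
    print(path)
    print(json.dumps(body, indent=2))
```

Each printed body can be pasted into a console request against the printed path.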

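Standard Analyzer (the default)

The default analyzer; splits on word boundaries and lowercases.

Demo

[Figure: output of the Standard Analyzer]

Simple Analyzer: splits on non-letter characters (symbols are discarded) and lowercases

The 2 is dropped, brown-foxes is split at the hyphen into brown and foxes (non-letter characters are discarded), and Quick becomes quick.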
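
The Simple Analyzer's behavior is easy to approximate with a regex. The sample sentence below is an assumption (the original screenshot is not available); it was picked because it contains the 2, brown-foxes, and Quick mentioned above. The function is an approximation, not the real analyzer.

```python
import re


def simple_analyze(text):
    """Approximate the Simple Analyzer: keep runs of letters, lowercase them."""
    # [^\W\d_]+ matches one or more letters (word characters minus digits and underscore).
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


# Assumed sample sentence, consistent with the observations above.
text = "2 running Quick brown-foxes leap over lazy dogs in the summer evening."
print(simple_analyze(text))
```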

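Whitespace Analyzer: splits on whitespace only; does not lowercase

Stop Analyzer: lowercases and removes stopwords

Compared with the Simple Analyzer it adds a stop token filter,
which removes function words such as the, a, and is.

Keyword Analyzer: no tokenization; the input is emitted as a single term

Pattern Analyzer: splits by regular expression

Tokens are produced by splitting on a regular expression.
The default pattern is \W+ (split on non-word characters).

Language Analyzer

Elasticsearch provides analyzers tailored to specific languages.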

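Chinese word segmentation

A Chinese sentence has to be split into words (not into individual characters).
In English, words are naturally separated by spaces.
The same run of Chinese characters can be segmented differently depending on context:
他说的确实在理 (here 确实 and 在理 are the words)
这事的确定不下来 (here 的确 is a word)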
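
A naive dictionary-based segmenter shows why this is hard. The sketch below uses forward maximum matching, a technique chosen here purely for illustration and not something the article or Elasticsearch prescribes; the toy dictionary is hypothetical. It greedily takes 的确 in both sentences, which is right for the second one but breaks the first, where the intended words are 确实 and 在理.

```python
def forward_max_match(sentence, dictionary, max_len=2):
    """Greedy segmentation: at each position take the longest dictionary word."""
    words, i = [], 0
    while i < len(sentence):
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            # Fall back to a single character when nothing in the dictionary matches.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words


# Hypothetical toy dictionary, just large enough for the two sentences.
dictionary = {"的确", "确实", "实在", "在理", "确定", "下来"}

print(forward_max_match("他说的确实在理", dictionary))
print(forward_max_match("这事的确定不下来", dictionary))
```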

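ICU Analyzer

It requires installing a plugin:
elasticsearch-plugin install analysis-icu
It adds Unicode support and handles Asian languages better.

For 他说的确实在理 it produces 他, 说的, 确实, 在, 理.
That is a fairly good segmentation, but 在理 is not kept together as one word.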

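Search API

There are two kinds:

URI Search: pass query parameters in the URL
Request Body Search: use the more complete JSON-based Query Domain Specific Language (DSL) that Elasticsearch provides

Specifying the indices to search:

[Figure: specifying the target indices in the request path]

URI search

Use the q parameter to specify the query string.
The query string uses the query string syntax, written as key:value pairs.

Request body

Both POST and GET are supported.

Response to a request body search

[Figure: structure of the search response]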

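Search relevance

Suppose we search for "apple". What the user cares about is how relevant the results are:

Can all of the relevant content be found?
How many irrelevant results are returned?
Are the documents scored reasonably?
Is the ranking balanced against business needs?

For example, when we search on Baidu, the results are first sorted, and related recommendations are then shown alongside them based on their scores.

PageRank algorithm

It considers not only the content
but, more importantly, how trustworthy the content is

In e-commerce, search plays more of a sales role:
first, it improves the shopping experience
it increases the site's sales
it helps clear inventory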

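Measuring relevance

In computer science this belongs to the field of
Information Retrieval

Precision: return as few irrelevant documents as possible
Recall: return as many of the relevant documents as possible
Ranking: whether results are ordered by relevance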
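
Precision and recall are simple ratios, and a small Python sketch makes the difference visible. The document IDs below are hypothetical and the function name is made up for this note.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of returned and a set of relevant doc IDs."""
    returned, relevant = set(returned), set(relevant)
    true_positives = len(returned & relevant)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 3 of them relevant, 6 relevant overall.
precision, recall = precision_recall(returned=[1, 2, 3, 4], relevant=[1, 2, 3, 7, 8, 9])
print(precision, recall)
```

Returning more documents tends to raise recall and lower precision, which is why the two are reported together.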

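Elasticsearch versions

5.x

Based on Lucene 6.x; the default scoring algorithm changed from TF-IDF to BM25
Added ingest nodes, Painless scripting, the completion suggester, and a native Java REST client
Types were marked as deprecated; the keyword type was added
Performance improvements
- The internal engine removed the contention lock used to prevent concurrent updates to the same document, giving a 15%-20% improvement
- Instant aggregations: aggregation results can be cached at the shard level
- Added the Profile API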

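GET _cat/health?v

green: all primary and replica shards are allocated
yellow: all primary shards are allocated, but at least one replica is not
red: at least one primary shard is not allocated

With a single node the status is yellow. By default in 5.x each index gets 5 primary shards, each with 1 replica, and a replica cannot be placed on the same node as its primary, so the replicas stay unassigned.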

Create

If the index does not exist yet, it is created automatically.

PUT /index/type/id
PUT /ecommerce/product/1
{
  "name":"gaolujie yagao",
  "desc":"gaolujie meibai",
  "price":30,
  "producer":"gaolujie producer",
  "tags":["meibai","fangzhu"]
}

Retrieve

GET /index/type/id

GET /ecommerce/product/1

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "found": true, // whether the document was found
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

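Replace (full update)

PUT

With PUT you must send the complete document with all of its fields; the stored document is replaced as a whole, and the version number increases.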

PUT ecommerce/product/1
{
  "name":"gaolujie yagao",
  "desc":"gaoxiao meibai",
  "price": 30,
  "producer": "gaolujie producer",
  "tags": ["meibai","fangzhu"]
}


{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 2,  // the version number has changed
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": false
}

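Partial update with POST

You do not have to send the whole document, only the fields being changed. Wrap them in a doc field and append _update to the URL.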

POST /ecommerce/product/1/_update
{
  "doc": {
    "name":"jiaqinagban gaolujie yagao"
  }
}

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 3,
  "found": true,
  "_source": { 
    "name": "jiaqinagban gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}
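
The difference between the two kinds of update can be modeled with a tiny in-memory store. This is a sketch of the semantics only, with made-up names (TinyDocStore, put, update); it does a shallow merge and always bumps the version, so it ignores details such as nested-object merging and no-op detection.

```python
class TinyDocStore:
    """In-memory stand-in for one index; every write bumps the version."""

    def __init__(self):
        self.docs = {}   # doc_id -> {"_version": int, "_source": dict}

    def put(self, doc_id, source):
        """Full replace: the stored _source becomes exactly `source`."""
        version = self.docs.get(doc_id, {"_version": 0})["_version"] + 1
        self.docs[doc_id] = {"_version": version, "_source": dict(source)}
        return version

    def update(self, doc_id, doc):
        """Partial update: merge `doc` into the existing _source."""
        entry = self.docs[doc_id]
        entry["_source"].update(doc)
        entry["_version"] += 1
        return entry["_version"]


store = TinyDocStore()
v1 = store.put("1", {"name": "gaolujie yagao", "price": 30})
v2 = store.put("1", {"name": "gaolujie yagao", "price": 30})     # whole document resent
v3 = store.update("1", {"name": "jiaqinagban gaolujie yagao"})   # price is kept
after_update = dict(store.docs["1"]["_source"])
print(v3, after_update)

v4 = store.put("1", {"name": "gaolujie yagao"})                  # price is lost
print(v4, store.docs["1"]["_source"])
```

The last call shows the pitfall of PUT: any field left out of the request body disappears from the stored document.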

Delete

DELETE ecommerce/product/1

{
  "found": true,
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 4,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}

A subsequent GET shows that the document is gone.
GET  ecommerce/product/1

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "found": false  // no longer found
}

Ways to search

There are six:

1. Query string search

GET  ecommerce/product/_search

Append _search to the path.

{
  "took": 9,   // time taken, in milliseconds
  "timed_out": false, // whether the request timed out
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3, // number of matching documents
    "max_score": 1, // highest relevance score among the hits; the better a document matches the search, the higher its score
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "jiajieshi meibai",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhognhua yagao",
          "desc": "zhognhua meibai",
          "price": 30,
          "producer": "zhognhua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}

To search for products whose name contains yagao and sort them by price in descending order, add parameters:

GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

This form is not used very often.
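
If such a URL has to be built in code, the standard library is enough. The helper name uri_search is made up for this sketch; it only assembles the path and query string and sends nothing.

```python
from urllib.parse import urlencode


def uri_search(index, doc_type, **params):
    """Build the path of a URI search request for the given index and type."""
    # safe=":" keeps field:value readable instead of percent-encoding the colon.
    return f"/{index}/{doc_type}/_search?{urlencode(params, safe=':')}"


url = uri_search("ecommerce", "product", q="name:yagao", sort="price:desc")
print(url)
```

Anything beyond a couple of parameters quickly becomes hard to read in this form, which is one reason the request body DSL is preferred.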

Query DSL (domain-specific language)

Query all products

GET /ecommerce/product/_search
{
  "query": {
    "match_all": {}  // match every document
  }
}

Query products whose name contains yagao, sorted by price in descending order.

Match and sort:

GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "sort": [
    {
        "price": "desc"
    }
  ]
}

Query everything, with pagination:

GET ecommerce/product/_search
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      }
    ]
  }
}
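
from is a zero-based offset into the result list, not a page number, which is why from 1 with size 1 returns the second hit. A minimal Python sketch of that behavior follows; the function names are made up, and the defaults mirror the from 0 and size 10 that Elasticsearch uses when the parameters are omitted.

```python
def paginate(hits, from_=0, size=10):
    """Return the slice of `hits` that from/size selects; `from_` is a zero-based offset."""
    return hits[from_:from_ + size]


def page_to_from(page, size):
    """Convert a 1-based page number into the matching `from` offset."""
    return (page - 1) * size


# Document IDs in the order the earlier match_all query returned them.
hit_ids = ["2", "1", "3"]
print(paginate(hit_ids, from_=1, size=1))
print(page_to_from(page=3, size=10))
```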

Return only name and price

The response contains only the price and name fields:
GET /ecommerce/product/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["name","price"]
}


{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "price": 25,  // only price and name are included
          "name": "jiajieshi yagao"
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "price": 30,  // only price and name are included
          "name": "zhognhua yagao"
        }
      }
    ]
  }
}

Query filter

Find products that match yagao and whose price is greater than 30

GET /ecommerce/product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "yagao"
          }
        }
      ],
      "filter": {
        "range": {
          "price": {
            "gt": 30
          }
        }
      }
    }
  }
}


Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhognhua yagao",
          "desc": "zhognhua meibai",
          "price": 40,
          "producer": "zhognhua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}
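
The logic of this bool query can be imitated over a small list in Python. This is only a rough emulation: the product list uses the prices as they appear in this section's results, the function name is made up, and scoring is left out entirely (in the real query, must contributes to the score while filter does not).

```python
products = [
    {"_id": "1", "name": "gaolujie yagao", "price": 30},
    {"_id": "2", "name": "jiajieshi yagao", "price": 25},
    {"_id": "3", "name": "zhognhua yagao", "price": 40},
]


def bool_search(docs, must_term, price_gt):
    """Emulate bool: `must` decides matching, `filter` only includes or excludes."""
    hits = []
    for doc in docs:
        matches_must = must_term in doc["name"].split()     # match on the name field
        passes_filter = doc["price"] > price_gt             # range filter, gt is exclusive
        if matches_must and passes_filter:
            hits.append(doc["_id"])
    return hits


print(bool_search(products, must_term="yagao", price_gt=30))
```

The product priced at exactly 30 is excluded because gt is a strict comparison; gte would include it.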

Full-text search

GET  ecommerce/product/_search
{
  "query": {
    "match": {
      "producer": "yagao producer"
    }
  }
}

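In a full-text search for yagao producer, the query text is first split into the terms yagao and producer.

A document that contains only one of the terms matches.

A document that contains both yagao and producer also matches.

The relevance scores differ, though, and so does max_score.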

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.70293105,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 0.70293105,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaolujie meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhognhua yagao",
          "desc": "zhognhua meibai",
          "price": 40,
          "producer": "zhognhua producer",
          "tags": [
            "qingxin"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.1805489,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "jiajieshi meibai",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      }
    ]
  }
}

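Phrase search

match_phrase

This is the counterpart of full-text search. Full-text search splits the query into terms, and a document is returned as soon as it matches any one of them.

Phrase search requires the field to contain the whole phrase, with the terms adjacent and in the same order.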

GET  ecommerce/product/_search
{
  "query": {
    "match_phrase": {
      "producer": "yagao producer"
    }
  }
}

Result:
{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.70293105,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 0.70293105,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      }
    ]
  }
}
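
The contrast between the two queries can be reproduced with plain string handling. The sketch below is an approximation with made-up function names: it splits on whitespace instead of running a real analyzer and ignores scoring. The producer values are the ones shown in the results above.

```python
producers = {
    "1": "gaolujie producer",
    "2": "jiajieshi producer",
    "3": "zhognhua producer",
    "4": "special yagao producer",
}


def match(field_text, query):
    """match: true if the field shares at least one term with the query."""
    return bool(set(field_text.split()) & set(query.split()))


def match_phrase(field_text, query):
    """match_phrase: true if the query terms occur consecutively and in order."""
    terms, wanted = field_text.split(), query.split()
    return any(
        terms[i:i + len(wanted)] == wanted
        for i in range(len(terms) - len(wanted) + 1)
    )


query = "yagao producer"
match_ids = [doc_id for doc_id, text in producers.items() if match(text, query)]
phrase_ids = [doc_id for doc_id, text in producers.items() if match_phrase(text, query)]
print(match_ids)
print(phrase_ids)
```

All four documents satisfy match because each contains producer, while only document 4 contains the two terms next to each other.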

Highlighting

Wherever the search term occurs in the text, it is highlighted.

GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "producer": "producer"
    }
  },
  "highlight": {
    "fields": {
      "producer": {
        
      }
    }
  }
}

Result:
{
  "took": 29,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaolujie meibai",
          "price": 30,
          "producer": "gaolujie producer", // the field being highlighted
          "tags": [
            "meibai",
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "gaolujie <em>producer</em>" // highlighted
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhognhua yagao",
          "desc": "zhognhua meibai",
          "price": 40,
          "producer": "zhognhua producer",
          "tags": [
            "qingxin"
          ]
        },
        "highlight": {
          "producer": [
            "zhognhua <em>producer</em>" // highlighted
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.1805489,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "jiajieshi meibai",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        },
        "highlight": {
          "producer": [
            "jiajieshi <em>producer</em>"  // highlighted
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 0.14638957,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        },
        "highlight": {
          "producer": [
            "special yagao <em>producer</em>" // highlighted
          ]
        }
      }
    ]
  }
}
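
What the highlight section of the response contains can be imitated with a regex substitution. This is a simplified sketch, not the actual highlighter: it works on raw text rather than on stored offsets, and the function and parameter names are made up. The default tags match the em tags seen in the response above.

```python
import re


def highlight(field_text, query, pre_tag="<em>", post_tag="</em>"):
    """Wrap every whole-word occurrence of a query term in the given tags."""
    terms = [re.escape(term) for term in query.split()]
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre_tag}{m.group(1)}{post_tag}", field_text)


print(highlight("special yagao producer", "producer"))
```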

Aggregations: group by, avg, sort

Count the number of products under each tag:

GET /ecommerce/product/_search
{
  "aggs": {   // run an aggregation
    "group_by_tags": { // an arbitrary name for this aggregation
      "terms": {    // group by the given field
        "field": "tags"
      }
    }
  }
}

Running it returns an error:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        // fielddata must be set to true on the field
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "ecommerce",
        "node": "iVnWx48_S9mfShtln1rhig",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
    }
  },
  "status": 400
}

fielddata has to be set to true on the tags field.

Run the following:

PUT /ecommerce/_mapping/product
{
  "properties": {
    "tags":{
      "type": "text",
      "fielddata": true
    }
  }
}

Response:
{
  "acknowledged": true
}

Run the aggregation again:


GET /ecommerce/product/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}

Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "jiajieshi meibai",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 1,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaolujie meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhognhua yagao",
          "desc": "zhognhua meibai",
          "price": 40,
          "producer": "zhognhua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  },
  
  // the part that matters:
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2   // number of documents with this tag
        },
        {
          "key": "meibai",
          "doc_count": 2 // number of documents with this tag
        },
        {
          "key": "qingxin",
          "doc_count": 1 // number of documents with this tag
        }
      ]
    }
  }
}

If you do not want the hits above in the response, set size to 0:

GET /ecommerce/product/_search
{
   "size":0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}

Result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2
        },
        {
          "key": "meibai",
          "doc_count": 2
        },
        {
          "key": "qingxin",
          "doc_count": 1
        }
      ]
    }
  }
}

For products whose name contains yagao, count the number of products under each tag.

Just add a query to the search request:

GET /ecommerce/product/_search
{
  "size": 0,
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}

Nested aggregation:

Group first, then compute the average within each group: the average price of the products under each tag.

GET  ecommerce/product/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result:

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "jiajieshi meibai",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 1,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaolujie meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhognhua yagao",
          "desc": "zhognhua meibai",
          "price": 40,
          "producer": "zhognhua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2,
          "avg_price": {
            "value": 27.5
          }
        },
        {
          "key": "meibai",
          "doc_count": 2,
          "avg_price": {
            "value": 40
          }
        },
        {
          "key": "qingxin",
          "doc_count": 1,
          "avg_price": {
            "value": 40
          }
        }
      ]
    }
  }
}
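
The same grouping can be reproduced in plain Python to check the numbers. This is a sketch of the idea, not of how Elasticsearch computes aggregations: the function name is made up, the product list copies the prices and tags from the hits above, and buckets are sorted by key for simplicity.

```python
from collections import defaultdict

products = [
    {"_id": "1", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"_id": "2", "price": 25, "tags": ["fangzhu"]},
    {"_id": "3", "price": 40, "tags": ["qingxin"]},
    {"_id": "4", "price": 50, "tags": ["meibai"]},
]


def terms_with_avg(docs, group_field, value_field):
    """Bucket documents by each value of `group_field`, then average `value_field` per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        for key in doc[group_field]:     # a document with two tags lands in two buckets
            buckets[key].append(doc[value_field])
    return {
        key: {"doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in sorted(buckets.items())
    }


tag_buckets = terms_with_avg(products, "tags", "price")
for key, bucket in tag_buckets.items():
    print(key, bucket)
```

Because document 1 carries two tags, the doc_count values add up to 5 even though there are only 4 products.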

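Group by the specified price ranges, then group by tag within each range, and finally compute the average price of each group.

Price ranges: range (note that in this request the inner terms aggregation on tags is also named group_by_price)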


GET /ecommerce/product/_search
{
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 20
          },
          {
            "from": 20,
            "to": 40
          },
          {
            "from": 40,
            "to": 60
          }
        ]
      },
      "aggs": {
        "group_by_price": {
          "terms": {
            "field": "tags"
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "jiajieshi meibai",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "4",
        "_score": 1,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "special yagao producer",
          "tags": [
            "meibai"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaolujie meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhognhua yagao",
          "desc": "zhognhua meibai",
          "price": 40,
          "producer": "zhognhua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_price": {
      "buckets": [
        {
          "key": "0.0-20.0",
          "from": 0,
          "to": 20,
          "doc_count": 0,
          "group_by_price": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": []
          }
        },
        {
          "key": "20.0-40.0",
          "from": 20,
          "to": 40,
          "doc_count": 2,
          "group_by_price": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "fangzhu",
                "doc_count": 2,
                "avg_price": {
                  "value": 27.5
                }
              },
              {
                "key": "meibai",
                "doc_count": 1,
                "avg_price": {
                  "value": 30
                }
              }
            ]
          }
        },
        {
          "key": "40.0-60.0",
          "from": 40,
          "to": 60,
          "doc_count": 2,
          "group_by_price": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "meibai",
                "doc_count": 1,
                "avg_price": {
                  "value": 50
                }
              },
              {
                "key": "qingxin",
                "doc_count": 1,
                "avg_price": {
                  "value": 40
                }
              }
            ]
          }
        }
      ]
    }
  }
}
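
The bucket boundaries are worth a closer look: from is inclusive and to is exclusive, so the product priced at 40 falls into 40-60, not 20-40. The Python sketch below reproduces the response's numbers under that rule; the function name is made up and the product list copies the prices and tags from the hits above.

```python
products = [
    {"_id": "1", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"_id": "2", "price": 25, "tags": ["fangzhu"]},
    {"_id": "3", "price": 40, "tags": ["qingxin"]},
    {"_id": "4", "price": 50, "tags": ["meibai"]},
]


def range_then_terms(docs, ranges):
    """Bucket by price range, then by tag inside each range, averaging the price per tag."""
    result = {}
    for low, high in ranges:
        in_range = [d for d in docs if low <= d["price"] < high]   # from inclusive, to exclusive
        by_tag = {}
        for doc in in_range:
            for tag in doc["tags"]:
                by_tag.setdefault(tag, []).append(doc["price"])
        result[f"{low:.1f}-{high:.1f}"] = {
            "doc_count": len(in_range),
            "tags": {tag: sum(p) / len(p) for tag, p in sorted(by_tag.items())},
        }
    return result


price_buckets = range_then_terms(products, [(0, 20), (20, 40), (40, 60)])
for key, bucket in price_buckets.items():
    print(key, bucket)
```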

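Elasticsearch internals

Scaling: horizontal scaling and vertical scaling.

Suppose there are 6 servers with 1 TB each, 6 TB in total, and capacity has to grow to 8 TB.

Vertical:

Buy 2 new servers with 2 TB each and use them to replace 2 of the original servers: 4 x 1 TB + 2 x 2 TB = 8 TB.

Horizontal:

Buy 2 more servers with 1 TB each and add them to the cluster: 8 x 1 TB = 8 TB.

Servers of about 1 TB each are the usual choice.

Adding a node: the cluster rebalances automatically.

A distributed architecture in which all nodes are peers.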
