elasticsearch

Latest recommended article published on 2024-05-15 17:25:07

weixin_45627802

Category: elasticsearch  Tags: elasticsearch

Article link: https://blog.csdn.net/weixin_45627802/article/details/107532141

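Elasticsearch is an open-source, real-time, distributed search and analytics engine built on Lucene. It exposes a RESTful API.
Features and advantages
1. Distributed real-time document store: es can store, query, create, update, and delete JSON documents, so it already works as a NoSQL storage system.
2. Distributed real-time analytics and search engine.
3. Distributed, supporting PB-scale data.

Use cases
Large data volumes, since the distributed nature of es allows scaling out to carry large amounts of data; flexible and changing data structures; relatively simple operations on the data; use as a NoSQL database.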

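Core concepts

NRT (near real time): there is a small delay between writing data and that data becoming searchable; searches and analytics on es can return within seconds.
Document: the smallest unit in es; one document can be, for example, one product category record…
Index: a collection of documents with a similar structure.
Type: being phased out.
shard: a single machine cannot store a large amount of data, so es can split the data of one index into multiple shards spread over several machines, which improves throughput and performance.
replica: any server can fail or go down at any time, and the shard on it may then be lost, so multiple replicas can be created for each shard.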

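shard and replica

An index contains multiple shards.
Each shard is a minimal unit of work that holds part of the data; it is a Lucene instance with the full ability to build an index and serve requests.
When nodes are added or removed, shards are rebalanced across the nodes automatically.
Each document lives in exactly one primary shard and in that primary's replica shards.
A replica shard is a copy of a primary shard; it provides fault tolerance and also serves read requests.
The number of primary shards is fixed when the index is created; the number of replica shards can be changed at any time.
Before 7.0 the default was 5 primary shards with 1 replica each, so 10 shards in total (5 primary, 5 replica); since 7.0 the default is 1 primary shard with 1 replica.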

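Horizontal scaling

Primary shards and replica shards are rebalanced automatically.
Each node holds fewer shards, so each shard gets more IO/CPU/memory and performs better.
Scaling limit: 6 shards can be spread over at most 6 machines, at which point each shard has a whole server's resources to itself.
To go beyond that limit, increase the number of replicas dynamically.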
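
The scaling limit follows from simple arithmetic: an index has primaries × (1 + replicas) shard copies, and a single shard cannot be split across nodes. A minimal Python sketch of that calculation; `total_shards` and `max_useful_nodes` are hypothetical helper names for illustration, not part of any Elasticsearch API:

```python
def total_shards(primaries: int, replicas_per_primary: int) -> int:
    """Number of shard copies in an index: each primary plus its replicas."""
    return primaries * (1 + replicas_per_primary)


def max_useful_nodes(primaries: int, replicas_per_primary: int) -> int:
    """A shard copy lives on exactly one node, so nodes beyond this count hold nothing."""
    return total_shards(primaries, replicas_per_primary)


# 3 primaries with 1 replica each give 6 shard copies, so at most 6 nodes are useful.
print(max_useful_nodes(3, 1))
# Raising the replica count to 2 gives 9 copies, so 9 nodes can each hold one shard.
print(max_useful_nodes(3, 2))
```

Since the primary count is fixed at index creation, the replica count is the only one of the two inputs that can be raised later.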

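Master election, replica fault tolerance, data recovery

When the master node goes down, a new master is elected automatically; the cluster is red.
Replica fault tolerance: the new master promotes a replica to primary shard; the cluster is yellow.
After the failed node restarts, the master copies replicas to it; the node reuses its existing shards and syncs the changes made while it was down; the cluster is green.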
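
The three colors can be read as a rule over shard states: red when some primary shard is unavailable, yellow when all primaries are active but some replica is not, green when everything is active. A minimal sketch of that rule in Python; the `(is_primary, is_assigned)` tuple layout and the function name `cluster_health` are assumptions made for this illustration:

```python
def cluster_health(shards):
    """shards: list of (is_primary, is_assigned) tuples, one per shard copy."""
    if any(is_primary and not assigned for is_primary, assigned in shards):
        return "red"      # at least one primary shard is unavailable
    if any(not assigned for _, assigned in shards):
        return "yellow"   # every primary is active, but some replica is not
    return "green"        # every primary and every replica is active


# Node holding a primary has just failed: primary missing, its replica still assigned.
print(cluster_health([(True, False), (False, True)]))
# Replica was promoted to primary; the new replica has no node yet.
print(cluster_health([(True, True), (False, False)]))
# Failed node is back and the replica has been recovered onto it.
print(cluster_health([(True, True), (False, True)]))
```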

document

{
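  "_index" : "test_index",    the index that holds the document; the name must be lowercase, must not start with an underscore, and must not contain commas
  "_type" : "_doc",            types are being removed as of 7.0
  "_id" : "1",                 unique identifier of the document; together with index and type it identifies and locates one document. Auto-generated ids are 20 characters long, URL-safe, base64-encoded GUIDs that cannot collide when generated in parallel across a distributed system.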
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

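When a document is fully replaced, es marks the old document as deleted and then adds the new one; as more and more documents are created, es removes the deleted ones in the background. (Deleting a document works the same way: it is not physically removed right away.)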

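Pessimistic and optimistic locking

Pessimistic locking is transparent to the application and needs no extra work, but concurrency is low; with optimistic locking the application compares the version number itself on every write.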
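es mainly uses optimistic concurrency control based on _version. es is multi-threaded internally, so concurrent modification requests are unordered: a later change may arrive first and an earlier change later, which can cause concurrency problems. To guard against this, use the built-in _seq_no field when writing. Note: older versions of es used version for this, but newer versions no longer accept it with internal versioning and return an error telling you to use if_seq_no and if_primary_term.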

PUT test_index/_doc/1?if_seq_no=0&if_primary_term=1
{
  "test_content":"test1"
}

With version_type=external, the write succeeds only when the supplied version is greater than the _version stored in es.

PUT test_index/_doc/1?version=3&version_type=external
{
  "test_content":"test11"
}
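
The two checks can be sketched as an in-memory compare-and-set. This is an illustration under stated assumptions, not how es is implemented: `TinyDoc` and `VersionConflict` are hypothetical names, and the sequence number is tracked per document here, whereas es assigns it per shard.

```python
class VersionConflict(Exception):
    """Raised when the caller's view of the document is stale."""


class TinyDoc:
    """In-memory stand-in for one stored document (not how es is implemented)."""

    def __init__(self):
        self.source = None
        self.version = 0        # bumped on every write
        self.seq_no = -1        # simplified: real es assigns seq_no per shard
        self.primary_term = 1

    def put(self, source, if_seq_no=None, if_primary_term=None):
        """Write only if the caller saw the latest seq_no/primary_term."""
        if if_seq_no is not None and (if_seq_no, if_primary_term) != (
            self.seq_no,
            self.primary_term,
        ):
            raise VersionConflict("stale if_seq_no/if_primary_term")
        self.source = source
        self.version += 1
        self.seq_no += 1
        return self.seq_no

    def put_external(self, source, version):
        """version_type=external: accept only a strictly greater version."""
        if version <= self.version:
            raise VersionConflict("external version is not greater")
        self.source = source
        self.version = version
        self.seq_no += 1
        return self.version


doc = TinyDoc()
doc.put({"test_content": "test0"})                                  # first write
doc.put({"test_content": "test1"}, if_seq_no=0, if_primary_term=1)  # caller is up to date
try:
    # a second writer that also read seq_no 0 is now stale
    doc.put({"test_content": "test2"}, if_seq_no=0, if_primary_term=1)
except VersionConflict as err:
    print("conflict:", err)
```

On a conflict the caller would normally re-read the document, reapply its change, and retry with the fresh values.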

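Partial update

The read, modify, and write-back all happen inside the shard, which greatly improves performance and greatly reduces conflicts. Conflicts are handled with retry_on_conflict and _version. retry_on_conflict cannot be combined with if_seq_no/if_primary_term: es rejects such a request because compare-and-write operations cannot be retried.

POST test_index/_update/1?retry_on_conflict=5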
{
  "doc":{
    "test_content":"test11"
  }
}

Batch operations

Batch get (mget)
GET test_index/_mget
{
  "docs":[
    {
      "_id":1
    },
    {
      "_id":2
    }
    ]
}

GET _mget
{
  "docs":[
    {
      "_index":"test_index",
      "_id":1
    },
    {
      "_index":"test_index1",
      "_id":1
    }
    ]
}
bulk syntax; the actions are delete, create, index, update:
{"action":{"metadata"}}
{"data"}

POST _bulk
{"create":{"_index":"test_index2","_id":1}}
{"test_content":"ddd"}
{"create":{"_index":"test_index2","_id":2}}
{"test_content":"ddd"}
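The bulk API is strict about JSON formatting: each JSON object must sit on a single line with no line breaks inside it, and consecutive JSON objects must be separated by a newline. The format is chosen to reduce memory usage.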
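
A request body in this format is easy to get wrong by hand, so it helps to generate it. A minimal sketch using only the Python standard library; `build_bulk_body` is a hypothetical helper name, and sending the body over HTTP is left out:

```python
import json


def build_bulk_body(operations):
    """operations: iterable of (action, metadata, source_or_None) tuples."""
    lines = []
    for action, metadata, source in operations:
        # json.dumps without indent never emits a line break, so each object stays on one line
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if source is not None:      # a delete action has no data line
            lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = build_bulk_body([
    ("create", {"_index": "test_index2", "_id": 1}, {"test_content": "ddd"}),
    ("delete", {"_index": "test_index2", "_id": 2}, None),
])
print(body, end="")
```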

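How document routing works

Every create, read, update, or delete of a document carries a routing value, which defaults to the document's _id. The routing value is passed to a hash function, and the resulting hash is taken modulo the number of primary shards of the index.
shard = hash(routing) % number_of_primary_shards, which is why the number of primary shards of an index cannot change.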
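
The formula can be tried out directly. In the sketch below `zlib.crc32` is only a stand-in hash chosen because it is deterministic and in the standard library; it is an assumption for illustration and not the hash function es uses. `pick_shard` is a hypothetical helper name.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """shard = hash(routing) % number_of_primary_shards, with a stand-in hash."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


ids = [str(i) for i in range(1, 9)]
with_3 = {doc_id: pick_shard(doc_id, 3) for doc_id in ids}
with_4 = {doc_id: pick_shard(doc_id, 4) for doc_id in ids}

# Documents whose shard would change if the primary count went from 3 to 4;
# a lookup by _id would then go to a shard that does not hold the document.
moved = [doc_id for doc_id in ids if with_3[doc_id] != with_4[doc_id]]
print(moved)
```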
How document CRUD works internally
coordinating node -> primary shard -> replica shard -> response

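Write consistency

consistency (a write parameter in versions before 5.0; later versions replace it with wait_for_active_shards)
one: the write can proceed as long as the primary shard is active.
all: the write can proceed only when the primary shard and all of its replica shards are active.
quorum: the default; the write can proceed only when a majority of the shard copies are active.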
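
As a worked example of "a majority", the pre-5.0 documentation gave the quorum as int((primary + number_of_replicas) / 2) + 1, counted per shard, and applied the check only when number_of_replicas is greater than 1. The sketch below encodes that reading; treat both the formula and the special case as assumptions about old versions, and `required_active_copies` as a hypothetical helper name.

```python
def required_active_copies(number_of_replicas: int) -> int:
    """Shard copies that must be active for a quorum write (pre-5.0 behavior, as assumed above)."""
    if number_of_replicas <= 1:
        return 1    # quorum is not enforced with 0 or 1 replica: the primary alone is enough
    primary = 1
    return (primary + number_of_replicas) // 2 + 1


for replicas in range(0, 5):
    print(replicas, required_active_copies(replicas))
```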

How read requests work internally

[Figure: read request flow]

search timeout mechanism

GET test_index/_search?timeout=10ms
Wildcard match on documents in indices whose names start with test_index
GET test_index*/_search?timeout=10ms
Paged search
GET test_index*/_search?timeout=10ms&size=10&from=0

The client sends a search request, and the request is sent to every primary shard (or one of its replica shards).
deep paging
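
Because a search fans out to every shard, each shard has to return its own top from + size hits and the coordinating node merges all of them before cutting out the requested page. The cost therefore grows with the page depth, not the page size. A minimal simulation of that merge; the shard contents are made-up scores and `paged_search` is a hypothetical helper name:

```python
import heapq


def paged_search(shards, from_, size):
    """shards: one list of scores per shard. Returns (page, hits_sent_to_coordinator)."""
    window = from_ + size
    # every shard must return its own top (from + size) hits
    per_shard = [heapq.nlargest(window, scores) for scores in shards]
    # the coordinating node merges them and keeps only the requested slice
    merged = heapq.nlargest(window, [s for hits in per_shard for s in hits])
    hits_sent = sum(len(hits) for hits in per_shard)
    return merged[from_:window], hits_sent


shards = [list(range(i, 1000, 5)) for i in range(5)]   # 5 shards, 200 scores each

page, cost = paged_search(shards, from_=0, size=10)
deep_page, deep_cost = paged_search(shards, from_=190, size=10)
print(cost, deep_cost)   # both pages hold 10 hits, but the deep one moves far more data
```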

query string

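Documents whose test_content field must contain a term starting with test
GET test_index*/_search?timeout=10ms&size=1&from=0&q=+test_content:test*
Documents whose test_content field must not contain a term starting with test
GET test_index*/_search?timeout=10ms&size=1&from=0&q=-test_content:test*
Search across all fields using the es _all metadata field. When a document containing several fields is indexed, es automatically concatenates the values of all fields into one string, uses it as the value of the _all field, and indexes it. The search result depends on the mapping, the analyzer, and the inverted index. (_all is deprecated in 6.0 and removed in 7.0.)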
GET test_index*/_search?timeout=10ms&size=1&from=0&q=test*

mapping

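When data is inserted directly into es, es creates the index automatically and also creates a mapping for it, which defines the data type of each field. Depending on the data type, a field is either an exact value or full text. When the inverted index is built, an exact value is indexed as a single term holding the whole value; full text goes through analysis (tokenization, normalization) before it is added to the inverted index. At search time, an exact value field is matched against the whole value, while a full text query is tokenized and normalized before it is looked up in the inverted index.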
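
The difference between the two kinds of field shows up as soon as the same values are indexed both ways. A toy sketch, assuming a deliberately simple analysis step (split on whitespace, lowercase); `analyze` and `index_field` are hypothetical names and this is not the analysis chain es actually runs:

```python
def analyze(text):
    """Toy full-text analysis: split on whitespace, then lowercase (normalization)."""
    return [token.lower() for token in text.split()]


def index_field(docs, full_text):
    """Build {term: set(doc_ids)} for one field."""
    inverted = {}
    for doc_id, value in docs.items():
        terms = analyze(value) if full_text else [value]   # exact value: one whole term
        for term in terms:
            inverted.setdefault(term, set()).add(doc_id)
    return inverted


docs = {1: "Hello World", 2: "hello es"}
text_index = index_field(docs, full_text=True)
exact_index = index_field(docs, full_text=False)

# full text: the query string is analyzed too, then looked up term by term
print(sorted(text_index.get(analyze("HELLO")[0], set())))
# exact value: a partial or differently cased value finds nothing
print(sorted(exact_index.get("hello", set())))
# exact value: only the whole original value matches
print(sorted(exact_index.get("Hello World", set())))
```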

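A mapping is the metadata of a type within an index; each type has its own mapping, which determines the data types, how the inverted index is built, and how searches behave.

Fields can be added to a mapping after it is created, but the mapping of an existing field cannot be updated.

Data types

text, string (replaced by text and keyword since 5.0), byte, short, integer, long, float, double, boolean, date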

PUT test_index3
{
  "mappings": {
    "properties": {
      "test_content":{
        "type": "text",
        "analyzer": "english"
      }
    }
  }
}
GET test_index3/_analyze
{
  "field": "test_content",
  "text": "I am Lily"
}

query DSL

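A filter only selects the data that matches the search conditions; it does not compute relevance scores or sort by them, and es automatically caches the most frequently used filters.
A query computes the relevance of each document to the search conditions and sorts by relevance, and its results cannot be cached.
In general, when you only need to select part of the data by some condition and do not care about ordering, use a filter.
Query types: match_all, match, multi_match, range query, term query, terms query.
bool clauses: must, must_not, should, filter.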

POST test_index/_search
{
  "query": {
    "match": {
      "test_content": "test22"
    }
  }
}
POST test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "test22",
            "fields": ["test_content"]
          }
        }
      ]
    }
  }
}
POST test_index/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "test_content": "test22"
          }
        }
      ]
    }
  }
}
POST test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "test_content": "test22"
          }
        }
      ],
      "filter": [
        {
          "range": {
            "age": {
              "gte": 10,
              "lte": 20
            }
          }
        }
      ]
    }
  }
}
Search a field's data as an exact value
POST test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "test_content": {
              "value": "test22"
            }
          }
        }
      ]
    }
  }
}
Search for several terms
POST test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "test_content": [
              "test22",
              "test23"
            ]
          }
        }
      ]
    }
  }
}
Check whether a query is valid
POST test_index/_validate/query?explain
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "test_content"
          }
        }
      ]
    }
  }
}
Sorting
POST test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "test_content"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "_id": {
        "order": "desc"
      }
    }
  ]
}

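Sorting on a string field that has been analyzed into several terms does not necessarily give the order you want. The fix is to index the string field twice: once analyzed, for searching, and once unanalyzed, for sorting.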

PUT /article
{
  "mappings": {
    "properties": {
      "title":{
        "type":"text",
        "fielddata": true,
        "fields": {
          "raw":{
            "type":"keyword"     //index now only accepts true or false; to keep a field unanalyzed, set its data type to keyword
          }
        }
      },
      "content":{
        "type": "text"
      },
      "post_date":{
        "type": "date"
      },
      "author_id":{
        "type": "long"
      }
    }
  }
}
POST /article/_search
{
  "query":{
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  },
  "sort": [
    {
      "title.raw": {
        "order": "desc"
      }
    }
  ]
}

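Analyzers

es ships built-in analyzers and also supports custom analyzers defined as needed.
An analyzer is made up of three components: character filter, tokenizer, token filter.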
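Built-in analyzers:
standard analyzer (splits on word boundaries, lowercases), simple analyzer (splits on non-letters, drops symbols, lowercases), stop analyzer (lowercases, removes stop words such as the, a, is), whitespace analyzer (splits on whitespace, no lowercasing), keyword analyzer (no tokenization), pattern analyzer (regular expression), language analyzers, custom analyzer (user-defined)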
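
The three components run in a fixed order: character filters rewrite the raw text, the tokenizer cuts it into tokens, and token filters transform or drop tokens. A toy pipeline in Python that mirrors this order; every function here is a hypothetical stand-in (a regular expression in place of a real tokenizer, for example), not an es component:

```python
import re


def char_filter(text):
    """Character filter stage: rewrite the raw text, here mapping & to and."""
    return text.replace("&", " and ")


def tokenizer(text):
    """Tokenizer stage: a rough word splitter standing in for a real tokenizer."""
    return re.findall(r"\w+", text)


def token_filters(tokens, stopwords=("the", "a")):
    """Token filter stage: lowercase, then drop stop words."""
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t not in stopwords]


def run_analyzer(text):
    return token_filters(tokenizer(char_filter(text)))


print(run_analyzer("The quick & brown fox"))   # ['quick', 'and', 'brown', 'fox']
```

The order matters: the stop word filter only catches "The" because lowercasing ran before it.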

GET _analyze
{
    "analyzer": "standard",
    "text" : "Mastering Elasticsearch , elasticsearch in Action"
}
POST books/_analyze
{
    "field": "title",
    "text": "Mastering Elasticsearch"
}
POST /_analyze
{
    "tokenizer": "standard", 
    "filter": ["lowercase"],
    "text": "Mastering Elasticsearch"
}

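Inverted index and forward index

A forward index is generally only usable in simple scenarios, such as sorting by document id; it cannot serve keyword search. The inverted index exists for that: it turns the mapping from document id to keywords into a mapping from keyword to document ids.
For example, when "document 1" is analyzed and 3 keywords are extracted, each keyword records its frequency and positions within the document.
The inverted index built from such documents is shown in the figure below (note: the figure does not show the positions of each term within a document):
[Figure: inverted index example]
An inverted index generally has three parts: the term dictionary, the posting lists (the list of documents a term appears in, with its positions and frequency in each document), and the inverted file (the physical file that stores the inverted index).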
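
The first two parts can be sketched with a dictionary: its keys play the role of the term dictionary and its values the posting lists. This is a minimal in-memory illustration with made-up documents and whitespace tokenization; `build_inverted_index` is a hypothetical name, and the on-disk inverted file is not modeled:

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Return {term: {doc_id: [positions]}}; the term frequency is len(positions)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


docs = {
    1: "es is a search engine",
    2: "lucene is a search library",
    3: "search search search",
}
index = build_inverted_index(docs)

print(sorted(index["search"]))     # ids of the documents that contain the term
print(index["search"][3])          # positions of the term inside document 3
print(len(index["search"][3]))     # frequency of the term inside document 3
```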

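At index time es builds the inverted index for searching, and also a forward index, the doc values, for sorting, aggregations, filtering, and similar operations.
doc values are stored on disk; if there is enough memory, the OS caches them in memory automatically.

Benefits of an immutable inverted index

No locking is needed, which improves concurrency and avoids lock problems.
Because the data does not change, it stays in the OS cache as long as the cache has enough memory.
The filter cache can stay in memory, because the data does not change.
It can be compressed, saving CPU and IO cost.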

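bouncing result

The bouncing result problem: when two documents have the same value in the sort field, different shard copies may order them differently; since requests are sent round-robin to different replica shards, the order can change between requests. The fix is to set preference to a string so that every search runs on the same replica shard.
timeout caps the search time: when the limit is reached, the results gathered so far are returned, which avoids overly long queries.
routing is document routing; setting it makes a whole group of related data land on the same shard.
search_type: setting it to dfs_query_then_fetch can improve the accuracy of relevance sorting.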
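
The effect of preference can be pictured as replacing a rotating choice of shard copy with a choice derived from the preference string. The sketch below is a simplification made for illustration: the copy names are invented, `zlib.crc32` is a stand-in hash, and the way es actually selects a copy is not reproduced here.

```python
import itertools
import zlib

copies = ["primary", "replica-1", "replica-2"]   # hypothetical copies of one shard
round_robin = itertools.cycle(copies)


def pick_copy(preference=None):
    """Choose which shard copy serves a search request."""
    if preference is None:
        return next(round_robin)    # default: rotate, so results may bounce between copies
    # same preference string, same copy, so ties are ordered the same way every time
    return copies[zlib.crc32(preference.encode("utf-8")) % len(copies)]


print([pick_copy() for _ in range(3)])
print(pick_copy("user_42") == pick_copy("user_42"))
```

A per-user or per-session string is a common choice for the preference value, since it keeps one user's results stable while still spreading different users across copies.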

scroll

POST /article/_search?scroll=1m
{
  "query":{
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ]
    }
  },
  "sort": [
    {
      "title.raw": {
        "order": "desc"
      }
    }
  ],
  "size": 1
}
POST /_search/scroll
{
  "scroll":"1m",
  "scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFHBQcnhJbmNCTWJTXzJOT2xuN0xwAAAAAAAAAKoWZzRHNlZPQndURGU4ekVKV1Q0dWY2UQ=="
}

Index management

PUT my_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "my_field":{
        "type":"text"
      }
    }
  }
}
PUT my_index/_settings
{
    "number_of_replicas": 1
}
PUT my_index1
{
  "settings": {
    "analysis": {
      "char_filter": {
        "&_to_and":{
          "type":"mapping",
          "mappings":["&=>and"]
        }
      },
      "filter": {
        "my_stopwords":{
          "type":"stop",
          "stopwords":["the","a"]
        }
      },
      "analyzer": {
        "my_analyzer":{
          "type":"custom",
          "char_filter":["html_strip","&_to_and"],
          "tokenizer":"standard",
          "filter":["lowercase","my_stopwords"]
        }
      }
    }
  } 
}
GET my_index1/_analyze
{
  "text": "The quick & brown fox",
  "analyzer": "my_analyzer"
}
The mapping JSON can include the following: properties (type, index, analyzer), metadata (_id, _source, _type), settings (analyzer).
PUT my_index3
{
  "mappings":{
    "properties": {
      "my_field":{
        "type": "text",      //data type; use keyword for a field that should not be analyzed
        "index": true,		 //whether the field is indexed (searchable)
        "analyzer": "standard"	//analyzer
      }
    }
  }
}
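Custom dynamic policy. true: an unknown field triggers dynamic mapping; false: an unknown field is ignored; strict: an unknown field raises an error. (By default es detects dates in certain formats, such as yyyy-MM-dd: if the first value seen for a field is 2017-01-01, the field is dynamically mapped as date, and later values of other types cause an error. Setting date_detection to false turns this off.)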
PUT my_index3/_mapping
{
  "dynamic":"strict" 
}
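
The three dynamic settings differ only in what happens to a field the mapping has not seen before. A toy model in Python; `apply_dynamic` is a hypothetical name, the mapping is reduced to a plain dict of field name to type name, and the type inference is far cruder than what es does:

```python
def apply_dynamic(mapping, doc, dynamic="true"):
    """Return the fields of doc that get indexed; may add new fields to mapping."""
    indexed = {}
    for field, value in doc.items():
        if field not in mapping:
            if dynamic == "strict":
                raise ValueError(f"unknown field [{field}] rejected by strict mapping")
            if dynamic == "false":
                continue                                # ignored: not added, not indexed
            mapping[field] = type(value).__name__       # toy dynamic mapping
        indexed[field] = value
    return indexed


mapping = {"title": "str"}
print(apply_dynamic(mapping, {"title": "es", "views": 10}, dynamic="true"))
print(mapping)      # the unknown field views has been added to the mapping
```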
PUT my_index1
{
  "mappings": {
    "dynamic_templates":[
      {
        "en":{
          "match":"*_en",
          "match_mapping_type":"string",
          "mapping":{
            "type":"text",
            "analyzer":"english"
          }
        }
      }
    ]
  }
}

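In Lucene, all types under one index are stored together, so es later dropped types.

Fixing an index whose mappings were created incorrectly

Use an alias: PUT my_index_old/_alias/my_index
Use scroll to read the data from my_index_old and _bulk to load it into the new index
Switch with aliases: point the alias at the new index, then delete the old index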

POST /_aliases
{
  "actions": [
    {
      "remove": {
        "index": "my_index_old",
        "alias": "my_index"
      }
    },
    {
      "add": {
        "index": "my_index_new",
        "alias": "my_index"
      }
    }
  ]
}

How documents are written

[Figure: document write flow]

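Plugins

IK Analyzer: an open-source Chinese word segmentation toolkit written in Java. It is a popular plugin for Chinese analysis in the open-source community, and it supports custom extension dictionaries.
pinyin Analyzer: pinyin analyzer.
Smart Chinese Analysis Plugin: Lucene's default Chinese analyzer.
ICU Analysis plugin: the ICU analysis that ships with Lucene. ICU is a stable, mature, powerful, easy-to-use, cross-platform library for Unicode support.
Mapper Attachments Type plugin: attachment type plugin that uses the tika library to parse many file formats into strings.
Use cases include full-text search, log analysis, operations monitoring, and security analysis.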

<?php
require 'vendor/autoload.php';
use Elasticsearch\ClientBuilder;

//Create the client instance
$client = ClientBuilder::create()->setHosts([
  [
    'host'   => '<HOST>',
    'port'   => '9200',
    'scheme' => 'http',
    'user'   => '<USER NAME>',
    'pass'   => '<PASSWORD>'
  ]
])->setConnectionPool('\Elasticsearch\ConnectionPool\SimpleConnectionPool', [])
  ->setRetries(10)->build();
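//Create the index (the mapping below uses the pre-5.0 style: _default_, _all, string)
$indexParams = [
    'index' => 'test', //index name
    'body' => [
        'settings' => [ //settings
            'number_of_shards'   => 3, //number of primary shards
            'number_of_replicas' => 1  //number of replicas per primary shard
        ],
        'mappings' => [ //mappings
            '_default_' => [ //defaults used by every type that does not override them
                '_all' => [ //disable searching across all fields
                    'enabled' => false
                ],
                '_source' => [ //store the original document
                    'enabled' => true
                ],
                'properties' => [ //field structure and types
                    'name' => [ //field 1
                        'type'  => 'string', //string, integer, float, double, boolean, date
                        'index' => 'analyzed', //analyzed (full text) or not_analyzed (exact value)
                    ],
                    'age' => [ //field 2
                        'type' => 'integer',
                    ],
                    'sex' => [ //field 3
                        'type'  => 'string',
                        'index' => 'not_analyzed',
                    ],
                ]
            ],
            'my_type' => [
                'properties' => [
                    'phone' => [
                        'type' => 'string',
                    ],
                ]
            ],
        ]
    ]
];
//index creation goes through the indices namespace; $client->index() would index a document instead
$indexResponse = $client->indices()->create($indexParams);
print_r($indexResponse);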
//Search
$searchParams = [
  'index'  => 'my_index',
  'type'   => 'my_type',
  'body'   => [
    'query' => [
      'match' => [
        'testField' => 'abc'
      ]
    ]
  ],
  'client' => [
    'timeout'         => 10,
    'connect_timeout' => 10
  ]
];
$searchResponse = $client->search($searchParams);
print_r($searchResponse);

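//View the mapping
$params = [
    'index' => 'test'
];
$res = $client->indices()->getMapping($params);

//Update the mapping. In this pre-7.0 API the type must be specified, and fields can only be added; existing properties cannot be changed.
$params = [
    'index' => 'test',
    'type' => 'my_type',
    'body' => [
        'my_type' => [
            'properties' => [
                'idcard' => [
                    'type' => 'integer'
                ]
            ]
        ]
    ]
];
$res = $client->indices()->putMapping($params);

//Delete the index (done last so the calls above still have an index to work on)
$params = [
    'index' => 'test'
];
$res = $client->indices()->delete($params);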

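By default, elasticsearch 7 no longer supports specifying an index type; the default type is _doc. To change this, pass include_type_name=true (untested here; this is from the official documentation. Whether or not it works, it is best avoided, because the parameter is no longer available from elasticsearch 8 on). Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html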

Create the index product_info
PUT /product_info?include_type_name=true
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "products": {
      "properties": {
        "productName": {"type": "text","analyzer": "ik_smart"},
        "annual_rate":{"type":"keyword"},
        "describe": {"type": "text","analyzer": "ik_smart"}
      }
    }
  }
}

PUT /product_info1
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    
      "properties": {
        "productName": {"type": "text","analyzer": "ik_smart"},
        "annual_rate":{"type":"keyword"},
        "describe": {"type": "text","analyzer": "ik_smart"}
      }
    
  }
}

Insert documents
POST /product_info/products/_bulk
{"index":{}}
{"productName":"大健康天天理财","annual_rate":"3.2200%","describe":"180天定期理财，最低20000起投，收益稳定，可以自助选择消息推送"}
{"index":{}}
{"productName":"西部通宝","annual_rate":"3.1100%","describe":"90天定投产品，最低10000起投，每天收益到账消息推送"}
{"index":{}}
{"productName":"安详畜牧产业","annual_rate":"3.3500%","describe":"270天定投产品，最低40000起投，每天收益立即到账消息推送"}
{"index":{}}
{"productName":"5G设备采购月月盈","annual_rate":"3.1200%","describe":"90天定投产品，最低12000起投，每天收益到账消息推送"}
{"index":{}}
{"productName":"新能源动力理财","annual_rate":"3.0100%","describe":"30天定投产品推荐，最低8000起投，每天收益会消息推送"}
{"index":{}}
{"productName":"微贷赚","annual_rate":"2.7500%","describe":"热门短期产品，3天短期，无须任何手续费用，最低500起投，通过短信提示获取收益消息"}

POST /product_info1/_bulk
{"index":{}}
{"productName":"大健康天天理财","annual_rate":"3.2200%","describe":"180天定期理财，最低20000起投，收益稳定，可以自助选择消息推送"}
{"index":{}}
{"productName":"西部通宝","annual_rate":"3.1100%","describe":"90天定投产品，最低10000起投，每天收益到账消息推送"}
{"index":{}}
{"productName":"安详畜牧产业","annual_rate":"3.3500%","describe":"270天定投产品，最低40000起投，每天收益立即到账消息推送"}
{"index":{}}
{"productName":"5G设备采购月月盈","annual_rate":"3.1200%","describe":"90天定投产品，最低12000起投，每天收益到账消息推送"}
{"index":{}}
{"productName":"新能源动力理财","annual_rate":"3.0100%","describe":"30天定投产品推荐，最低8000起投，每天收益会消息推送"}
{"index":{}}
{"productName":"微贷赚","annual_rate":"2.7500%","describe":"热门短期产品，3天短期，无须任何手续费用，最低500起投，通过短信提示获取收益消息"}

Search query
GET /product_info1/_search
{
  "query": {
    "range": {
      "annual_rate": {
        "gte": "3.0000%",
        "lte": "3.1300%"
      }
    }
  }
}
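
Because annual_rate is mapped as keyword, a range query on it compares the values as strings rather than as numbers. That works for the sample data only because every rate has the same width. A small Python sketch of the comparison, using rate strings taken from the sample documents plus one invented two-digit rate; `keyword_range` is a hypothetical helper and Python's string comparison stands in for term ordering:

```python
rates = ["3.2200%", "3.1100%", "3.3500%", "3.1200%", "3.0100%", "2.7500%"]


def keyword_range(values, gte, lte):
    """Range over keyword values: plain string comparison, not numeric."""
    return [v for v in values if gte <= v <= lte]


# Same bounds as the query above: works because all values have the same width.
print(keyword_range(rates, "3.0000%", "3.1300%"))

# An invented 10% rate sorts before "3..." as a string, so it slips into a 1% to 3.13% range.
print(keyword_range(rates + ["10.0000%"], "1.0000%", "3.1300%"))
```

If numeric range semantics are needed, storing the rate in a numeric field avoids the problem.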

Delete the index
DELETE /product_info

Operations: check how the cluster is running
GET /_cluster/health
Index status query
GET /_cat/indices  

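Command	Description
GET /_cat/health?v	Show cluster health. The status is one of green, yellow, or red.
GET /_cluster/health?pretty=true	Show cluster health. pretty=true formats the output. Other query parameters can be added, for example level=indices (show index status) and level=shards (show shard information).
GET /_cluster/stats	Show cluster system information, including CPU and JVM.
GET /_cluster/state	Show detailed cluster information, including nodes and shards.
GET /_cluster/pending_tasks	Show tasks queued in the cluster.
GET /_cluster/settings	Show cluster settings.
GET /_cat/master?v	Show information about the master node.
GET /_cat/nodes?v	Show the current state of every node, including CPU usage, heap memory usage, and load.
GET /_cat/nodeattrs?v	Show the custom attributes of each node.
GET /_nodes/stats?pretty=true	Show node statistics.
GET /_nodes/process	Show process information for the nodes.
GET /_nodes/hot_threads	Show the tasks being run by the most expensive threads.
GET /_nodes/<nodeip>/jvm,process,os	Show JVM, process, and operating system information for the given node.
GET _cat/plugins?v	Show the plugins installed on each node.
GET /_cat/thread_pool?v	Show thread pool statistics for each node, including pool type, active threads, and queue size.
GET /_cat/shards?v	Show details of every shard in the cluster: index name, shard number, primary or replica, current state (with the failure reason for shards that could not be allocated), doc count, disk usage. An index can be specified to show only its shards (GET _cat/shards/<index>?v).
GET /_cat/allocation?v	Show how shards are allocated on each node.
GET /_cat/recovery?v	Show the recovery progress of each shard.
GET /_cat/indices?v	Show details of all indices: health, status, shard and replica counts, document count. A single index can be specified (GET _cat/indices/<index>?v).
GET /_cat/aliases?v	Show all aliases in the cluster, including the indices they point to and their routing settings.
GET /_mapping	Show the mappings of all indices.
GET /<index>/<type>/_mapping	Show the mapping of the given index.
GET /_cat/count?v	Show the number of documents in the cluster. An index can be specified to count only its documents (GET _cat/count/<index>?v).
GET /<index>/<type>/<id>	Show the data of a document.
GET _snapshot/_all	Show all snapshots.
GET _snapshot/<snapshot_name>/_status	Show the progress of the given snapshot.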

elasticsearch	redisearch	xunsearch
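Large projects	Large projects	Small projects, under 100,000 records
Disk-based with cache options; at least 2G of memory is recommended, plus extra disk space for persistent storage; HTTP protocol	Redisearch is an efficient, full-featured, high-performance full-text search component that stores its data in memory. It suits environments with moderate data volumes and limited memory and storage. With a data synchronization mechanism it can be integrated into an existing data store to provide full-text search, autocomplete, and similar features to a product.	Small projects, under 100,000 records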