Latest Elasticsearch study notes: landing offers from Tencent, Baidu, Meituan, and NetEase within one month

2401_84159911

Published 2024-05-12 14:19:09

Category: Programmer
Tags: big data, interview, study

Article link: https://blog.csdn.net/2401_84159911/article/details/138754777

Result:

epoch      timestamp cluster        status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1684116543 02:09:03  my-application yellow          1         1     15  15    0    0        2             0                  -                 88.2%

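cluster: the cluster name
status: cluster health. green means healthy; yellow means all primary shards are allocated but at least one replica is missing, so the data is still complete; red means some primary shards are unavailable and data may already be lost.
node.total: total number of online nodes
node.data: number of online data nodes
shards (active_shards): number of active shards, primaries and replicas combined
pri (active_primary_shards): number of active primary shards. With one replica per primary and every replica allocated, shards is twice pri.
relo (relocating_shards): number of shards being relocated; normally 0
init (initializing_shards): number of shards being initialized; normally 0
unassign (unassigned_shards): number of unassigned shards; normally 0
pending_tasks: cluster-level tasks waiting to run, such as shard relocation; normally 0
max_task_wait_time: the longest time any pending task has been waiting
active_shards_percent: percentage of shards that are active; normally 100%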
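
As a rough illustration of how the columns above can be checked programmatically, the sketch below parses the two-line text output shown earlier and flags every column that deviates from its "normal" value. The function names `parse_cat_health` and `health_warnings` are made up for this example, and the text is assumed to have been fetched already (no HTTP call is made).

```python
# Sample output copied from the result above (header line + one value line).
SAMPLE = """
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1684116543 02:09:03 my-application yellow 1 1 15 15 0 0 2 0 - 88.2%
"""


def parse_cat_health(text: str) -> dict:
    """Turn the header line and the value line into a {column: value} dict."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header_line, value_line = lines[0], lines[1]
    # Columns are whitespace-separated and no value contains a space.
    return dict(zip(header_line.split(), value_line.split()))


def health_warnings(row: dict) -> list:
    """List every column that deviates from the normal values described above."""
    warnings = []
    if row["status"] != "green":
        warnings.append(f"status is {row['status']}")
    for column in ("relo", "init", "unassign", "pending_tasks"):
        if int(row[column]) != 0:
            warnings.append(f"{column} = {row[column]}")
    return warnings


row = parse_cat_health(SAMPLE)
print(health_warnings(row))
```

For the sample row, the warnings come from the yellow status and the two unassigned shards, which is also why active_shards_percent is below 100%.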

1.3 View node information

GET /_cat/nodes?v

1.4 View information for a specific node (node-1)

GET /_nodes/nodeName?pretty=true
Example: GET /_nodes/node-1?pretty=true

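2 Index commands

2.1 List all indices

GET /_cat/indices?v

Result:

health status index   uuid               pri rep docs.count docs.deleted store.size pri.store.size
green  open   index_1 rerwerwrewrwrwe     20   1        208            0      1.1mb        609.8kb
green  open   index_2 eewfdsffhwehfoeif3  30   1          4            1    222.4kb        111.2kb

health: green means healthy; yellow means all primary shards are allocated but at least one replica is missing, so the data is still complete; red means some primary shards are unavailable and data may already be lost.
pri: short for primary, the number of primary shards
rep: number of replicas per primary shard
docs.count: number of documents at the Lucene level
docs.deleted: number of deleted documents
store.size: total size of all shards, replicas included
pri.store.size: size of the primary shards only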

2.2 Create an index

PUT /test

Successful response:

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test"
}

demo1:

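# Explicit mapping with custom field types
# (PUT /test fails if the index created above still exists; delete it first)
PUT /test
{
  "mappings": {
    "properties": {
      "info": {
        "type": "text",
        "analyzer": "ik_smart"  # analyzer to use; ik_smart comes from the IK analysis plugin
      },
      "email": {
        "type": "keyword", # field type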
        "index": false
      },
      "name": {
        "properties": {
          "firstName": {
            "type": "keyword"
          },
          "lastName": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

demo2

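#----------- user -----------------
# Create the index without an explicit mapping
PUT /user

# Without an explicit mapping, defaults apply (field types are inferred, default shard settings are used)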
PUT /user/_doc/1
{
  "name":"张三",
  "age":10,
  "sex":"男",
  "address":"江苏苏州"
}

GET /user/_search

# Bulk-create documents (_type is deprecated in 7.x and removed in 8.0; omit it on newer versions)
POST _bulk
{"create":{"_index":"user", "_type":"_doc", "_id":2}}
{"id":2,"name":"李四","age":"20","sex":"男","address":"苏州园区"}
{"create":{"_index":"user", "_type":"_doc", "_id":3}}
{"id":3,"name":"王芳","age":"30","sex":"女","address":"园区华为"}
{"create":{"_index":"user", "_type":"_doc", "_id":4}}
{"id":4,"name":"赵六","age":"40","sex":"女","address":"华为汽车"}

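# Fetch several documents at once
# docs    : array of documents to fetch
# _index  : the index to read from
# _type   : the type to read from
# _id     : the document id
# _source : the fields to return
# --------------------------------------------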
GET _mget
{
  "docs": [
    {
      "_index": "user",
      "_type": "_doc",
      "_id": 1
    },
    {
      "_index": "user",
      "_type": "_doc",
      "_id": 2
    }
  ]
}

GET /user/_mget
{
  "docs": [
    {
      "_type": "_doc",
      "_id": 1
    },
    {
      "_type": "_doc",
      "_id": 2
    }
  ]
}

GET /user/_doc/_mget
{
  "docs": [
    {
      "_id": 1
    },
    {
      "_id": 2
    }
  ]
}

GET /user/_mget
{
  "docs": [
    {
      "_id": 1
    },
    {
      "_id": 2
    },
    {
      "_id": 3
    },
    {
      "_id": 4
    }
  ]
}

# Bulk index: creates the document if it does not exist, replaces it if it does
POST _bulk
{"index":{"_index":"user", "_type":"_doc", "_id":2}}
{"id":2,"name":"李四","age":"20","sex":"男","address":"苏州园区"}
{"index":{"_index":"user", "_type":"_doc", "_id":3}}
{"id":3,"name":"王芳","age":"30","sex":"女","address":"园区华为"}
{"index":{"_index":"user", "_type":"_doc", "_id":4}}
{"id":4,"name":"赵六","age":"40","sex":"女","address":"华为汽车"}

# Bulk partial update
POST _bulk
{"update":{"_index":"user","_type":"_doc","_id":2}}
{"doc":{"address":"苏州园区XX"}}
{"update":{"_index":"user","_type":"_doc","_id":3}}
{"doc":{"address":"园区华为XX"}}

# Bulk delete
POST _bulk
{"delete":{"_index":"user", "_type":"_doc", "_id":3}}
{"delete":{"_index":"user", "_type":"_doc", "_id":4}}
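
The bulk requests above all share one layout: newline-delimited JSON in which every action line is followed by a source line, except for delete, and the body ends with a newline. A minimal sketch of building such a body in Python follows. The helper name `build_bulk_body` and the tuple layout of its input are assumptions made for this example; sending the body to `_bulk` is left out.

```python
import json

ALLOWED_OPS = {"index", "create", "update", "delete"}


def build_bulk_body(actions) -> str:
    """Build an NDJSON body from (op, metadata, payload) tuples.

    payload is the document for index/create, {"doc": {...}} for update,
    and is ignored for delete.
    """
    lines = []
    for op, meta, payload in actions:
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported bulk operation: {op}")
        # Action/metadata line, e.g. {"index": {"_index": "user", "_id": 2}}
        lines.append(json.dumps({op: meta}, ensure_ascii=False))
        if op != "delete":
            # Every operation except delete carries a second line.
            lines.append(json.dumps(payload, ensure_ascii=False))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", {"_index": "user", "_id": 2}, {"id": 2, "name": "李四"}),
    ("update", {"_index": "user", "_id": 3}, {"doc": {"address": "园区华为XX"}}),
    ("delete", {"_index": "user", "_id": 4}, None),
])
print(body)
```

Generating the lines from a list keeps the action line and its payload paired, which is the easiest thing to get wrong when writing bulk bodies by hand.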

2.3 Delete an index ("acknowledged": true in the response means the deletion succeeded)

DELETE /test

2.4 View index statistics

GET /_stats?pretty

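2.5 Modify an index

An inverted index has to be rebuilt whenever its data structure changes (for example, when the analyzer changes), which is very expensive. For that reason the mapping of an existing field cannot be changed once the index has been created; changing it means creating a new index and reindexing the data.

Although existing fields in the mapping cannot be modified, new fields can be added to the mapping, because that does not affect the inverted index already built.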
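
The rule above (existing fields are fixed, new fields may be added) can be sketched as a small check on the `properties` part of a mapping before sending it with `PUT /<index>/_mapping`. The function name `is_additive` is hypothetical, and the check is deliberately conservative: it treats any change to an existing field definition as not allowed, even though Elasticsearch does permit updating a few mapping parameters in place.

```python
def is_additive(current: dict, proposed: dict) -> bool:
    """Return True if `proposed` keeps every existing field unchanged
    and only adds new fields (object fields are checked recursively)."""
    for name, old_def in current.items():
        new_def = proposed.get(name)
        if new_def is None:
            return False  # an existing field was dropped
        old_props = old_def.get("properties")
        if old_props is None:
            # Leaf field: its definition must be identical.
            if new_def != old_def:
                return False
            continue
        # Object field: sub-fields may be added, nothing else may change.
        new_props = new_def.get("properties")
        if new_props is None or not is_additive(old_props, new_props):
            return False
        old_rest = {k: v for k, v in old_def.items() if k != "properties"}
        new_rest = {k: v for k, v in new_def.items() if k != "properties"}
        if old_rest != new_rest:
            return False
    return True


# The "properties" block of the /test index from demo1.
current = {
    "info": {"type": "text", "analyzer": "ik_smart"},
    "email": {"type": "keyword", "index": False},
    "name": {"properties": {"firstName": {"type": "keyword"},
                            "lastName": {"type": "keyword"}}},
}

# Adding a brand-new field is fine.
with_age = dict(current, age={"type": "integer"})
# Changing the analyzer of an existing field is not.
new_analyzer = dict(current, info={"type": "text", "analyzer": "ik_max_word"})

print(is_additive(current, with_age), is_additive(current, new_analyzer))
```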

Method 1: overwrite the whole document with PUT

PUT first/_doc/1
{
  "name":"林",
  "age":18,
  "from":"gu",
  "desc":"念能力,学生，暗属性",
  "tags":["能力者","男","暗"]
}

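Method 2: partial update with POST

Use POST with the _update endpoint and put the fields to change inside doc. The typeless form below is the preferred one on Elasticsearch 7.x and later; the older form POST first/_doc/3/_update is deprecated in 7.x and removed in 8.0.

POST first/_update/3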
{
  "doc": {
    "name":"愚者",
    "desc":"塔罗",
    "tags":["魔法","超能力","塔罗"]
  }
}

2.6 Insert data

PUT first/_doc/1
{
  "name":"林",
  "age":18,
  "from":"gu",
  "desc":"念能力",
  "tags":["能力者","学院","男"]
}

PUT first/_doc/2
{
  "name":"宝儿姐",
  "age":22,
  "from":"gu",
  "desc":"道法",
  "tags":["道", "驱魔","女"]
}

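2.7 View an index

2.7.1 View a specific index

# View the index definition (settings and mappings)
GET /first?pretty

# View the documents, similar to: select * from first
GET /first/_search
# or, equivalently: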
GET /first/_search
{
  "query": {
    "match_all": {}
  }
}

{
  "took" : 787,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "春生",
          "age" : 18,
          "from" : "gu",
          "desc" : "念能力,学生，暗属性",
          "tags" : [
            "能力者",
            "男",
            "暗"
          ]
        }
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "愚者",
          "age" : 22,
          "from" : "gu",
          "desc" : "塔罗",
          "tags" : [
            "魔法",
            "超能力",
            "塔罗"
          ]
        }
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "宝儿姐",
          "age" : 18,
          "from" : "sheng",
          "desc" : "道法",
          "tags" : [
            "长生",
            "超能力",
            "道法"
          ]
        }
      }
    ]
  }
}

2.7.2 Simple query

GET first/_search?q=from:gu
# The request below returns the same result; the condition goes into match
GET /first/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  }
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.4700036,
    "hits" : [
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "3",
        "_score" : 0.4700036,
        "_source" : {
          "name" : "愚者",
          "age" : 22,
          "from" : "gu",
          "desc" : "塔罗",
          "tags" : [
            "魔法",
            "超能力",
            "塔罗"
          ]
        }
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "1",
        "_score" : 0.4700036,
        "_source" : {
          "name" : "春生",
          "age" : 18,
          "from" : "gu",
          "desc" : "念能力,学生，暗属性",
          "tags" : [
            "能力者",
            "男",
            "暗"
          ]
        }
      }
    ]
  }
}

2.7.3 Controlling the returned fields

Use _source to return only the listed fields.

GET /first/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["tags","name"]
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "宝儿姐",
          "tags" : [
            "长生",
            "超能力",
            "道法"
          ]
        }
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "愚者",
          "tags" : [
            "魔法",
            "超能力",
            "塔罗"
          ]
        }
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "春生",
          "tags" : [
            "能力者",
            "男",
            "暗"
          ]
        }
      }
    ]
  }
}

2.7.4 Sorting with sort

desc (descending) or asc (ascending)

GET /first/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["age","name"],
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "宝儿姐",
          "age" : 18
        },
        "sort" : [
          18
        ]
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "春生",
          "age" : 18
        },
        "sort" : [
          18
        ]
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "愚者",
          "age" : 22
        },
        "sort" : [
          22
        ]
      }
    ]
  }
}

2.7.5 Pagination with from and size

GET /first/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["age","name"],
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ],
  "from": 0, # offset of the first hit to return (0-based)
  "size": 1 # number of hits to return
}
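
To make the relationship between a page number and from/size explicit, here is a small sketch that converts a 1-based page number into the two parameters. The helper name `page_params` is made up for this example; the 10000 limit is the default value of the `index.max_result_window` setting, which caps from + size unless it has been changed on the index.

```python
# Default value of index.max_result_window; from + size may not exceed it.
DEFAULT_MAX_RESULT_WINDOW = 10000


def page_params(page: int, page_size: int) -> dict:
    """Convert a 1-based page number into the from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    start = (page - 1) * page_size  # "from" is a 0-based offset
    if start + page_size > DEFAULT_MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the default result window")
    return {"from": start, "size": page_size}


print(page_params(1, 1))   # the request above: first hit only
print(page_params(3, 10))  # third page of ten hits
```

The returned dict can be merged into the search body next to query and sort.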

2.7.6 Boolean queries

must

Equivalent SQL: select age, name from first where `from` = 'gu' and age = 18

GET /first/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "from": "gu"
        }
        },
        {"match": {
          "age": "18"}
        }
      ]
    }
  },
  "_source": ["age","name"],
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "春生",
          "age" : 18
        },
        "sort" : [
          18
        ]
      }
    ]
  }
}

should

Equivalent SQL: select age, name from first where `from` = 'gu' or age = 18

GET /first/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {
          "from": "gu"
        }
        },
        {"match": {
          "age": "18"}
        }
      ]
    }
  },
  "_source": ["age","name","from"],
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

must_not

Equivalent SQL: select age, name from first where `from` != 'gu' and age != 22

GET /first/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"match": {
          "from": "gu"
        }
        },
        {"match": {
          "age": "22"}
        }
      ]
    }
  },
  "_source": ["age","name","from"],
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

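filter

Range conditions in a filter are expressed with range:

gt: greater than
gte: greater than or equal to
lt: less than
lte: less than or equal to

Equivalent SQL: select age, name from first where `from` = 'gu' and age >= 18 and age <= 20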

GET /first/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "from": "gu"
        }
        }
      ],
      "filter": [
        {"range": {
          "age": {
            "gte": 18,
            "lte": 20
          }
        }}
      ]
    }
  },
  "_source": ["age","name","from"],
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}
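
How the four clause types combine can be sketched with a tiny in-memory evaluator. This is only an illustration of the boolean logic, not of how Elasticsearch executes or scores queries: plain equality stands in for match, and the three documents are reduced copies of the ones in the search results shown earlier. All function names here are made up for the example.

```python
DOCS = [
    {"_id": "1", "name": "春生", "age": 18, "from": "gu"},
    {"_id": "2", "name": "宝儿姐", "age": 18, "from": "sheng"},
    {"_id": "3", "name": "愚者", "age": 22, "from": "gu"},
]


def eq(field, value):
    """Stand-in for a match clause: plain equality on one field."""
    return lambda doc: doc.get(field) == value


def between(field, gte, lte):
    """Stand-in for a range clause with gte/lte bounds."""
    return lambda doc: gte <= doc[field] <= lte


def bool_query(docs, must=(), should=(), must_not=(), filter_=()):
    """Return the ids of documents that satisfy the bool clauses."""
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue  # every must clause has to match (AND)
        if not all(clause(doc) for clause in filter_):
            continue  # filter behaves like must, but does not affect scoring
        if any(clause(doc) for clause in must_not):
            continue  # no must_not clause may match
        # With no must or filter clause, at least one should clause has to match (OR).
        if should and not must and not filter_:
            if not any(clause(doc) for clause in should):
                continue
        hits.append(doc["_id"])
    return hits


print(bool_query(DOCS, must=[eq("from", "gu"), eq("age", 18)]))
print(bool_query(DOCS, should=[eq("from", "gu"), eq("age", 18)]))
print(bool_query(DOCS, must_not=[eq("from", "gu"), eq("age", 22)]))
print(bool_query(DOCS, must=[eq("from", "gu")], filter_=[between("age", 18, 20)]))
```

The must case selects only document 1, in line with the single hit in the must result above.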

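2.7.7 Phrase search (also works on keywords stored in an array field)

Loose match (any of the terms)

GET /first/_search
{
  "query": {
    "match": {
      "tags": "暗 魔"  # terms separated by a space; a hit needs to match only one of them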
    }
    } 
}

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0732633,
    "hits" : [
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "1",
        "_score" : 1.0732633,
        "_source" : {
          "name" : "春生",
          "age" : 18,
          "from" : "gu",
          "desc" : "念能力,学生，暗属性",
          "tags" : [
            "能力者",
            "男",
            "暗"
          ]
        }
      },
      {
        "_index" : "first",
        "_type" : "chunsheng",
        "_id" : "3",
        "_score" : 0.9403362,
        "_source" : {
          "name" : "愚者",
          "age" : 22,
          "from" : "gu",
          "desc" : "塔罗",
          "tags" : [
            "魔法",
            "超能力",
            "塔罗"
          ]
        }
      }
    ]
  }
}

Exact phrase match

GET /first/_search
{
  "query": {
    "match_phrase": {
      "tags": "魔法"
    }
    } 
}

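2.7.8 term queries

A term query looks up the given term directly in the inverted index, i.e. it is an exact lookup.

Difference between term and match:

match is analyzed: field values are run through an analyzer at index time, and the match query runs the query text through an analyzer as well, then matches on the resulting tokens. Different analyzers produce different tokens.
term is not analyzed: it looks up the exact value in the inverted index as given.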
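
The difference can be seen on a toy inverted index. In the sketch below, `analyze` is a deliberately simple stand-in for a real analyzer (lowercase, then split on whitespace), and the two sample documents are invented for the example; none of this reflects how Lucene actually stores or searches terms.

```python
from collections import defaultdict


def analyze(text: str) -> list:
    """Toy analyzer: lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_index(docs: dict) -> dict:
    """Map every token produced by the analyzer to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_query(index: dict, value: str) -> set:
    """Exact lookup: the value is used as-is, without analysis."""
    return set(index.get(value, set()))


def match_query(index: dict, text: str) -> set:
    """Analyze the query text first, then return documents matching any token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


index = build_index({1: "Quick Brown Fox", 2: "lazy dog"})

print(term_query(index, "Quick"))       # no hit: the index holds "quick", not "Quick"
print(term_query(index, "quick"))       # hit on document 1
print(match_query(index, "Quick DOG"))  # query text is analyzed, so both documents match
```

The term lookup for "Quick" finds nothing because only the lowercased token was indexed, which is the usual reason a term query on an analyzed text field returns no hits.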

2.7.8.1 Field existence: exists

GET /first/_search
{
  "query": {
    "exists": {
      "field": "from_"
    }
  }
  
}

2.7.8.2 Query by id: ids

ids looks documents up by their _id.

GET /first/_search
{
  "query": {
    "ids": {
      "values": [3, 1]
    }
  }
}

2.7.8.3 Prefix: prefix

Find documents in which a field has a term starting with the given prefix.

GET /first/_search
{
  "query": {
    "prefix": {
      "desc": {
        "value": "道"
      }
    }
  }
}

select * from first where match(desc,"^道")

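2.7.8.4 Single-term match: term

Look up one exact term in the inverted index; the query text is not analyzed. On an analyzed text field the stored tokens may differ from the original value, so an exact lookup is usually run against a keyword field.

GET /first/_search
{
  "query": {
    "term": {
      "tags": "长生"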
    }
  }
}

select * from first where "长生" in tags

2.7.8.5 Multi-term match: terms

Match against several terms; the terms are combined with OR.

GET /test-dsl-term-level/_search
{
  "query": {
    "terms": {
      "programming_languages": ["php","c++"]
    }
  }
}

2.7.8.6 Wildcard: wildcard

GET /first/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "儿*",
        "boost": 1.0,
        "rewrite": "constant_score"
      }
    }
  }
}

SELECT  * from accesslog a WHERE match(host,'儿');

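2.7.8.7 Fuzzy matching: fuzzy

The official documentation describes fuzzy matching in terms of edit distance: the number of one-character changes needed to turn one term into another. These changes can include:

Changing a character (box → fox)
Removing a character (black → lack)
Inserting a character (sic → sick)
Transposing two adjacent characters (act → cat)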
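
The four kinds of change listed above correspond to an edit distance in which a swap of two adjacent characters counts as a single edit. Below is a minimal sketch of one such measure, the optimal string alignment variant of Damerau-Levenshtein distance. It is meant to show how the distance is counted and is not the implementation Elasticsearch or Lucene uses; the function name `edit_distance` is chosen for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Edit distance counting substitutions, deletions, insertions and
    transpositions of two adjacent characters as one edit each."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and the first j chars of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


for source, target in [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat")]:
    print(source, "->", target, edit_distance(source, target))
```

Each of the four example pairs from the list is one edit apart.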
