ElasticSearch -- Getting Started (CRUD)

小毛贼_哪里逃

Article link: https://blog.csdn.net/qq_28497823/article/details/104714492

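Basic Concepts

1 Node and Cluster

Elastic is essentially a distributed database that lets multiple servers work together, and each server can run multiple Elastic instances. A single Elastic instance is called a node. A group of nodes forms a cluster.

Check the health of the current cluster: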

GET _cluster/health

{
  "cluster_name": "elasticsearch",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 1,
  "active_shards": 1,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 1,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 50
}

2 Index

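Elastic indexes every field and, after processing, writes it into an inverted index. Lookups go directly against that index. That is why the top-level unit of data management in Elastic is called an Index. It is the counterpart of a single database. The name of every Index (that is, database) must be lowercase.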
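
To make the inverted index idea concrete, here is a minimal sketch in Python. It is not how Elastic stores data internally; it assumes a whitespace tokenizer and made-up documents, and only shows the term-to-documents mapping that makes lookups fast.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased, whitespace-separated term to the ids of the docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

# Hypothetical documents, used only for illustration.
docs = {
    1: "database management",
    2: "database engineer",
    3: "sales management",
}
index = build_inverted_index(docs)
print(sorted(index["database"]))    # [1, 2]
print(sorted(index["management"]))  # [1, 3]
```

A search for a term becomes a single dictionary lookup instead of a scan over every document.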

List all Indexes in the current cluster:

GET _cat/indices?v

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana BVGnJPzQSVCDgVY0vtbMmw   1   1          1            0      3.1kb          3.1kb

Create an Index

PUT weather
Response:
{
  "acknowledged": true, // the operation succeeded
  "shards_acknowledged": true
}

Specify settings and mappings at creation time

PUT test_index
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1
  }, 
  "mappings": {
    "test_type":{
      "properties": {
        "name":{
          "type": "text"
        }
      }
    }
  }
}
Response:
{
  "acknowledged": true,
  "shards_acknowledged": true
}

Modify an Index

PUT test_index/_settings
{
  "number_of_replicas": 1
}
number_of_shards cannot be changed once the Index has been created.
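
The reason behind this restriction is routing: in its simplest form, a document is assigned to a primary shard by hashing its routing value (the _id by default) modulo the number of primary shards. The sketch below illustrates the consequence. It uses crc32 as a stand-in hash, which is an assumption made for the example (Elastic uses a Murmur3 hash), and the function name shard_for is hypothetical.

```python
import zlib

def shard_for(doc_id: str, number_of_shards: int) -> int:
    # Stand-in hash: crc32 is used only because it ships with Python.
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards

ids = [str(i) for i in range(20)]
# Documents whose shard would change if the index went from 5 to 6 primaries.
moved = [i for i in ids if shard_for(i, 5) != shard_for(i, 6)]
print(f"{len(moved)} of {len(ids)} documents would be looked up on a different shard")
```

If the shard count changed in place, existing documents would no longer be found where the formula says they are, which is why changing it requires reindexing.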

Delete an Index

DELETE weather

{
  "acknowledged": true
}

GET _cat/indices?v
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana BVGnJPzQSVCDgVY0vtbMmw   1   1          1            0      3.1kb          3.1kb
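Notes:

DELETE _all: deletes every Index (setting action.destructive_requires_name: true in elasticsearch.yml disables this way of deleting all Indexes)

DELETE index1,index2: deletes index1 and index2

DELETE index*: deletes every Index whose name starts with index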

3 Type

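Documents can be grouped. In the weather Index, for example, they can be grouped by city (Beijing and Shanghai) or by climate (sunny and rainy). Such a group is called a Type. It is a virtual, logical grouping used to filter Documents.

Different Types should share a similar structure (schema). For example, an id field cannot be a string in one group and a number in another. This is one difference from tables in a relational database. Data of entirely different natures (such as products and logs) should be stored as two Indexes rather than as two Types inside one Index (although the latter is possible).

Elastic 6.x allows only one Type per Index; 7.x deprecates Types, and 8.x removes them entirely.

List the Types contained in each Index:

GET _mapping/?pretty=true
Response: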
{
  ".kibana": {
    "mappings": {
      "server": {
        "properties": {
          "uuid": {
            "type": "keyword"
          }
        }
      },
      "index-pattern": {
        "properties": {
          "fieldFormatMap": {
            "type": "text"
          },
          "fields": {
            "type": "text"
          },
          "intervalName": {
            "type": "text"
          },
          "notExpandable": {
            "type": "boolean"
          },
          "sourceFilters": {
            "type": "text"
          },
          "timeFieldName": {
            "type": "text"
          },
          "title": {
            "type": "text"
          }
        }
      },
      "config": {
        "properties": {
          "buildNum": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

4 Document

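A single record in an Index is called a Document. Many Documents make up an Index. Documents are expressed as JSON. Documents in the same Index are not required to share the same structure (schema), but keeping them consistent helps search efficiency.

Create a document / full update (replace the document): with an explicit _id

Full update: the document's _id stays the same, but the version (_version) is incremented by 1, the operation type (result) changes from created to updated, and the created field becomes false.

A full update means the old data is fetched first, then the old document is marked as deleted, and finally the new document is indexed. Because the time between fetching the old data and ES completing the write is not bounded, concurrent conflicts become more likely, so this approach is not recommended for updating a document.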

PUT accounts/person/1
{
  "user": "张三",
  "title": "工程师",
  "desc": "数据库管理"
}
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}
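
The concurrency problem with full replacement can be shown with a small simulation. The TinyStore class below is a hypothetical in-memory stand-in, not an Elasticsearch client; it only assumes that every write bumps the version and that a write may optionally state which version it expects.

```python
class VersionConflict(Exception):
    pass

class TinyStore:
    """Hypothetical in-memory stand-in for one index."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return version, dict(source)

    def put(self, doc_id, source, expected_version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current:
            raise VersionConflict(f"expected {expected_version}, found {current}")
        self._docs[doc_id] = (current + 1, dict(source))
        return current + 1

store = TinyStore()
store.put("1", {"user": "张三", "title": "工程师"})

# Two clients read version 1, then each writes back a full document.
v_a, doc_a = store.get("1")
v_b, doc_b = store.get("1")
doc_a["title"] = "DBA"
doc_b["user"] = "张三san"
store.put("1", doc_a)
store.put("1", doc_b)      # silently discards client A's title change
print(store.get("1"))      # (3, {'user': '张三san', 'title': '工程师'})

try:
    store.put("1", doc_a, expected_version=v_a)   # a stale version is rejected
except VersionConflict as exc:
    print("conflict:", exc)
```

The second unconditional write wins and the first change is lost; passing the version that was read turns the silent loss into an explicit conflict the client can retry.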
Create a document: without an _id (ES generates one)
POST accounts/person
{
  "user": "张三",
  "title": "工程师",
  "desc": "数据库管理"
}
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "AXCz0GZWNekRbwyPL8i2",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}
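Retrieve a document

The HTTP specification does not define what a request body on a GET means, and some clients and proxies drop it. ES considers GET the better semantic fit for retrieval, so it accepts GET with a request body (most servers support this as well). Where it is not supported, change the GET request to POST.
GET accounts/person/1/?pretty=true // pretty=true returns the response in a human-readable format

Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 1,
  "found": true, // the lookup succeeded
  "_source": { // the original record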
    "user": "张三",
    "title": "工程师",
    "desc": "数据库管理"
  }
}
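Delete a document

A delete does not physically remove the document right away. The document is first marked as deleted, and the space is reclaimed later, when ES merges segments in the background.

To verify:

Create a document with id 1; its _version is 1
Delete this document; its _version becomes 2
Create a document with id 1 again; its _version is now 3

DELETE accounts/person/1
Response: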
{
  "found": true,
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 6,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}

GET accounts/person/1
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "found": false
}
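
The version sequence described in the verification steps can be modelled in a few lines. This is a sketch under the assumption that a delete leaves a tombstone carrying the id and version; the VersionedIndex class is hypothetical and does not reflect how ES implements this.

```python
class VersionedIndex:
    """Hypothetical model of per-id version counting with soft deletes."""

    def __init__(self):
        self._entries = {}  # doc_id -> {"version": int, "source": dict or None}

    def index(self, doc_id, source):
        entry = self._entries.setdefault(doc_id, {"version": 0, "source": None})
        entry["version"] += 1
        entry["source"] = dict(source)
        return entry["version"]

    def delete(self, doc_id):
        entry = self._entries[doc_id]
        entry["version"] += 1
        entry["source"] = None   # tombstone: the id and version stay behind
        return entry["version"]

    def get(self, doc_id):
        entry = self._entries.get(doc_id)
        found = entry is not None and entry["source"] is not None
        return {"_id": doc_id, "found": found}

idx = VersionedIndex()
v_created = idx.index("1", {"user": "张三"})
v_deleted = idx.delete("1")
print(idx.get("1"))                       # {'_id': '1', 'found': False}
v_recreated = idx.index("1", {"user": "张三"})
print(v_created, v_deleted, v_recreated)  # 1 2 3
```

Because the version keeps counting across the delete, the re-created document does not start again at 1.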
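Update a document / partial update

A partial update does not need the full JSON. _version is incremented by 1 and result becomes updated.

Advantages:

The read, the modification and the write-back all happen inside ES, which saves network transfer and improves performance
The interval between reading and modifying is shorter, which effectively reduces concurrent conflicts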

POST /accounts/person/1/_update
{
  "doc": {
    "user": "张三san",
    "title": "工程师",
    "desc": "数据库管理"
  }
}
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}

GET accounts/person/1
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "user": "张三san",
    "title": "工程师",
    "desc": "数据库管理"
  }
}
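
What the doc section of a partial update does to the stored _source can be sketched as a merge: fields in the partial document overwrite the stored ones, nested objects are merged recursively, and everything else is left alone. The function name apply_partial_update is hypothetical; this is an illustration of the merge rule, not ES code.

```python
def apply_partial_update(source, partial):
    """Return a new document with `partial` merged into `source`; nested dicts merge recursively."""
    merged = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_partial_update(merged[key], value)
        else:
            merged[key] = value
    return merged

stored = {"user": "张三", "title": "工程师", "desc": "数据库管理"}
updated = apply_partial_update(stored, {"user": "张三san"})
print(updated)
```

Only the user field needs to be sent; title and desc survive untouched, which is why the request body can be much smaller than the full document.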

List all documents

GET accounts/person/_search
{
  "query": {
    "match_all": {}
  }
}
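or
GET accounts/person/_search
Response:
{
  "took": 1, // time taken by the operation, in milliseconds
  "timed_out": false, // whether the request timed out
  "_shards": { // the shards involved
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": { // the matching records
    "total": 2, // total number of matching records
    "max_score": 1, // the highest relevance score
    "hits": [ // array of the returned records
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "AXCz0GZWNekRbwyPL8i2",
        "_score": 1, // relevance score; results are sorted by this field in descending order by default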
        "_source": {
          "user": "张三",
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 1,
        "_source": {
          "user": "张三san",
          "title": "工程师",
          "desc": "数据库管理"
        }
      }
    ]
  }
}

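5 Full-text search

The input string is tokenized, and each resulting term is looked up in the inverted index. A document is returned as long as it matches any one of the terms.

GET accounts/person/_search
{
  "from": 0, // offset of the first result
  "size": 1, // number of results to return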
  "query": {
    "match": {
      "user": "san"
    }
  }
}
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.25316024,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 0.25316024,
        "_source": {
          "user": "张三san",
          "title": "工程师",
          "desc": "数据库管理"
        }
      }
    ]
  }
}
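
Why does the query "san" find "张三san"? The sketch below approximates the default standard analyzer under two assumptions: runs of ASCII letters and digits form one lowercased token, and every CJK character becomes its own token. The regular expression and function names are illustrative, not part of ES.

```python
import re

# Approximation of the standard analyzer (an assumption made for this example).
TOKEN_RE = re.compile(r"[a-z0-9]+|[\u4e00-\u9fff]")

def analyze(text):
    return TOKEN_RE.findall(text.lower())

def match(query, field_value):
    """True when the field shares at least one token with the query."""
    return bool(set(analyze(query)) & set(analyze(field_value)))

print(analyze("张三san"))        # ['张', '三', 'san']
print(match("san", "张三san"))   # True
print(match("san", "张三"))      # False
```

The stored value is split into three tokens, so the single query token san is enough for the document to be returned.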

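6 Phrase search

The input string must appear in the specified field exactly as given: only text containing the very same sequence counts as a match and is returned. This is the opposite of full-text search, where matching any single term is enough.

Phrase search:
GET accounts/person/_search
{
  "query": {
    "match_phrase": {
      "desc": "数据管理"
    }
  }
}
Response: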
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

Using full-text search:
GET accounts/person/_search
{
  "query": {
    "match": {
      "desc": "数据管理"
    }
  }
}
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1.1299736,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "5",
        "_score": 1.1299736,
        "_source": {
          "user": "lisi",
          "age": 50,
          "salary": 6000,
          "title": "业务员",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 1.1299736,
        "_source": {
          "user": "zeng",
          "age": 30,
          "salary": 20000,
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "3",
        "_score": 1.1299736,
        "_source": {
          "user": "jim",
          "age": 35,
          "salary": 17000,
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "2",
        "_score": 0.71613276,
        "_source": {
          "user": "sam",
          "age": 20,
          "salary": 15000,
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "4",
        "_score": 0.71613276,
        "_source": {
          "user": "zhangsan",
          "age": 50,
          "salary": 5000,
          "title": "业务员",
          "desc": "数据库管理"
        }
      }
    ]
  }
}
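
The two results above differ because a phrase search also checks token positions. The following sketch assumes one token per character (how the standard analyzer treats CJK text) and uses hypothetical helper names; it reproduces why "数据管理" matches "数据库管理" as full text but not as a phrase.

```python
def char_tokens(text):
    # Assumption: one token per non-space character.
    return [ch for ch in text if not ch.isspace()]

def match_any(query, text):
    """Full-text style: any shared token is enough."""
    return bool(set(char_tokens(query)) & set(char_tokens(text)))

def match_phrase(query, text):
    """Phrase style: the query tokens must occur consecutively and in order."""
    q, t = char_tokens(query), char_tokens(text)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))

print(match_any("数据管理", "数据库管理"))     # True
print(match_phrase("数据管理", "数据库管理"))  # False
print(match_phrase("数据库", "数据库管理"))    # True
```

The character 库 sits between 据 and 管 in the stored text, so the four query tokens never line up in consecutive positions.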

7 Boolean search

Existing records:
GET accounts/person/_search
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "AXCz0GZWNekRbwyPL8i2",
        "_score": 1,
        "_source": {
          "user": "张三",
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "2",
        "_score": 1,
        "_source": {
          "user": "王五 ",
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 1,
        "_source": {
          "user": "张三李四",
          "title": "工程师",
          "desc": "数据库管理"
        }
      }
    ]
  }
}

AND search:
GET accounts/person/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "user": "四"
          }
        },
        {
          "match": {
            "user": "三"
          }
        }
      ],
      "must_not": [
        {"match": {
          "user": "五"
        }}
      ]
    }
  }
}
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "user": "张三李四",
          "title": "工程师",
          "desc": "数据库管理"
        }
      }
    ]
  }
}

OR search:
GET accounts/person/_search
{
  "query": {
    "match": {
      "user": "张三 李四"
    }
  }
}
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.1507283,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 1.1507283,
        "_source": {
          "user": "张三李四",
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "AXCz0GZWNekRbwyPL8i2",
        "_score": 0.51623213,
        "_source": {
          "user": "张三",
          "title": "工程师",
          "desc": "数据库管理"
        }
      }
    ]
  }
}
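
The AND and OR behaviour can be checked against the three records with a small in-memory evaluation. This is a sketch only: it assumes one token per character, ignores scoring, and the bool_search helper is hypothetical.

```python
def tokens(text):
    return {ch for ch in text if not ch.isspace()}

def bool_search(docs, must=(), should=(), must_not=()):
    """Return ids of docs whose `user` tokens satisfy the must / should / must_not terms."""
    hits = []
    for doc_id, doc in docs.items():
        t = tokens(doc["user"])
        if not all(term in t for term in must):
            continue
        if any(term in t for term in must_not):
            continue
        # With no must clause, at least one should term has to match.
        if should and not must and not any(term in t for term in should):
            continue
        hits.append(doc_id)
    return hits

docs = {
    "AXCz0GZWNekRbwyPL8i2": {"user": "张三"},
    "2": {"user": "王五 "},
    "1": {"user": "张三李四"},
}
print(bool_search(docs, must=["四", "三"], must_not=["五"]))   # ['1']
print(sorted(bool_search(docs, should=tokens("张三 李四"))))   # ['1', 'AXCz0GZWNekRbwyPL8i2']
```

A plain match on "张三 李四" behaves like the should case: the input is split into terms and any one of them is enough.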

8 Highlighted search

GET accounts/person/_search
{
  "query": {
    "match": {
      "user": "sam"
    }
  },
  "highlight": {
    "pre_tags": [ // tag placed before each highlighted term
        "<a class='highlightClass' href='#'>"
      ],
    "post_tags": [ // tag placed after each highlighted term
        "</a>"
      ],
    "fields": { // fields to highlight
      "user":{},
      "desc":{}
    }
  }
}
Response:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "2",
        "_score": 0.6931472,
        "_source": {
          "user": "sam",
          "age": 20,
          "salary": 15000,
          "title": "工程师",
          "desc": "数据库管理"
        },
        "highlight": {
          "user": [
            "<em>sam</em>"
          ]
        }
      }
    ]
  }
}
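
Highlighting amounts to wrapping each matched term in a pair of tags. The sketch below is a simplified illustration with a hypothetical highlight function; it assumes the default tags are <em> and </em>, and that pre_tags and post_tags replace them when supplied.

```python
import re

def highlight(text, terms, pre_tag="<em>", post_tag="</em>"):
    """Wrap every occurrence of the given terms in pre_tag/post_tag (case-insensitive)."""
    if not terms:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre_tag}{m.group(0)}{post_tag}", text)

print(highlight("sam", ["sam"]))
print(highlight("sam", ["sam"], "<a class='highlightClass' href='#'>", "</a>"))
```

The first call uses the default tags, the second the custom anchor tags from the request. Fields without a matching term, such as desc here, produce no highlight entry.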

9 Combined example

Existing data:
GET accounts/person/_search
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "5",
        "_score": 1,
        "_source": {
          "user": "lisi",
          "age": 50,
          "salary": 6000,
          "title": "业务员",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "2",
        "_score": 1,
        "_source": {
          "user": "sam",
          "age": 20,
          "salary": 15000,
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "4",
        "_score": 1,
        "_source": {
          "user": "zhangsan",
          "age": 50,
          "salary": 5000,
          "title": "业务员",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 1,
        "_source": {
          "user": "zeng",
          "age": 30,
          "salary": 20000,
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "3",
        "_score": 1,
        "_source": {
          "user": "jim",
          "age": 35,
          "salary": 17000,
          "title": "工程师",
          "desc": "数据库管理"
        }
      }
    ]
  }
}

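1 Find the records whose title contains the character "工" and whose salary is between 16000 and 20000, sort them by age in descending order, and show only the age, title and salary fields
GET accounts/person/_search
{
  "query": {
    "bool": {
      "must": [ // search condition
        {"match": {
          "title": "工"
        }}
      ],
      "filter": { // filter condition
        "range": {
          "salary": {
            "gte": 16000,
            "lte": 20000
          }
        }
      }
    }
  },
  "sort": [ // sort condition
    {
      "age": "desc"
    }
  ],
  "_source": ["age","title","salary"] // fields to return
}
Response: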
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "3",
        "_score": null,
        "_source": {
          "title": "工程师",
          "age": 35
        },
        "sort": [
          35
        ]
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": null,
        "_source": {
          "title": "工程师",
          "age": 30
        },
        "sort": [
          30
        ]
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "2",
        "_score": null,
        "_source": {
          "title": "工程师",
          "age": 20
        },
        "sort": [
          20
        ]
      }
    ]
  }
}
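
When such a request is built from application code, it helps to assemble the body from parameters rather than hand-editing JSON. The helper below is a hypothetical sketch that only builds the request body as a Python dict; sending it to ES is left out, and the function and parameter names are assumptions made for this example.

```python
import json

def build_search_body(match_field, match_text, range_field, gte, lte, sort_field, fields):
    """Assemble a bool query with a scored must, an unscored filter, a sort and a _source list."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {match_field: match_text}}],
                "filter": {"range": {range_field: {"gte": gte, "lte": lte}}},
            }
        },
        "sort": [{sort_field: "desc"}],
        "_source": list(fields),
    }

body = build_search_body("title", "工", "salary", 16000, 20000, "age", ["age", "title", "salary"])
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping the bounds as arguments makes a slip such as an extra zero in the upper limit easier to spot.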

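Note: a search can be given a timeout. ES then returns whatever results it has found within that time, for example: GET _search?timeout=10ms. The unit ms means milliseconds, s seconds, and m minutes.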

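Difference between filter and query

filter: only decides whether a document satisfies the condition. It has no effect on the relevance score, and ES automatically caches the most frequently used filters, so it performs best
query: computes the relevance of every document to the search condition and sorts by that relevance; the results are not cached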
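
The contrast can be sketched in plain Python: a filter returns a yes/no set that is cheap to cache, while a query has to compute a score per document. The scoring below (count of shared characters) is a deliberately crude stand-in, not ES's relevance formula, and lru_cache only imitates the idea of filter caching.

```python
from functools import lru_cache

DOCS = {
    "1": {"title": "工程师", "age": 30},
    "2": {"title": "工程师", "age": 20},
    "4": {"title": "业务员", "age": 50},
    "5": {"title": "业务员", "age": 50},
}

@lru_cache(maxsize=128)
def filter_age(age):
    """Filter context: a yes/no answer per document, easy to cache and reuse."""
    return frozenset(doc_id for doc_id, doc in DOCS.items() if doc["age"] == age)

def query_title(text):
    """Query context: every match gets a score and the hits are ranked by it."""
    scored = []
    for doc_id, doc in DOCS.items():
        score = sum(ch in doc["title"] for ch in set(text))
        if score > 0:
            scored.append((score, doc_id))
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

print(sorted(filter_age(50)))         # ['4', '5']
filter_age(50)                        # the second call is served from the cache
print(filter_age.cache_info().hits)   # 1
print(query_title("工程"))            # [(2, '1'), (2, '2')]
```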

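Query types

match: analyzed matching. The input is tokenized first and the resulting terms are searched; a document is returned as long as it contains any part of the match condition
term: structured-field query. Matches a single value, and the input is not tokenized by an analyzer
match_phrase: phrase matching. Looks for the exact phrase; if an analyzer is defined for the queried field, the input is tokenized with that analyzer
and so on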
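
A common surprise is that a term query for a capitalized word finds nothing in a text field. The sketch below shows why, assuming a simplified analyzer that lowercases and splits on whitespace; the function names and the sample value are made up for illustration.

```python
def analyze(text):
    # Simplified analyzer (an assumption): lowercase and split on whitespace.
    return text.lower().split()

# Terms stored in the index for a text field holding "Database Management".
indexed_terms = set(analyze("Database Management"))

def match_query(text):
    """Analyze the input first, then look up each resulting term."""
    return any(term in indexed_terms for term in analyze(text))

def term_query(value):
    """Look up the input exactly as given, with no analysis."""
    return value in indexed_terms

print(match_query("Database"))   # True
print(term_query("Database"))    # False: the index holds "database"
print(term_query("database"))    # True
```

This is why term suits exact values such as numbers, while match suits analyzed text.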

constant_score: filter the data only, without computing relevance

GET /accounts/person/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "age": 50
        }
      }
    }
  }
}
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "5",
        "_score": 1,
        "_source": {
          "user": "lisi",
          "age": 50,
          "salary": 6000,
          "title": "业务员",
          "desc": "数据库管理",
          "tag": "girl"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "9",
        "_score": 1,
        "_source": {
          "user": "lisi",
          "age": 50,
          "salary": 6000,
          "title": "业务员",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "6",
        "_score": 1,
        "_source": {
          "user": "lisi",
          "age": 50,
          "salary": 6000,
          "title": "业务员",
          "desc": "数据库管理"
        }
      }
    ]
  }
}
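
Every hit in the response above carries the same _score of 1. A rough model of that behaviour, assuming a default boost of 1.0 and using a hypothetical constant_score helper over made-up records:

```python
def constant_score(docs, predicate, boost=1.0):
    """Return hits for the docs passing `predicate`, all carrying the same score."""
    return [
        {"_id": doc_id, "_score": boost, "_source": doc}
        for doc_id, doc in docs.items()
        if predicate(doc)
    ]

docs = {
    "5": {"user": "lisi", "age": 50},
    "2": {"user": "sam", "age": 20},
    "4": {"user": "zhangsan", "age": 50},
}
hits = constant_score(docs, lambda doc: doc["age"] == 50)
print([(h["_id"], h["_score"]) for h in hits])   # [('5', 1.0), ('4', 1.0)]
```

Since nothing is ranked, the order of the hits carries no meaning.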
