- Basic concepts
1 Node and Cluster
Elastic is essentially a distributed database that lets multiple servers work together, and each server can run multiple Elastic instances. A single Elastic instance is called a node. A group of nodes forms a cluster.
Check the health of the current cluster:
GET _cluster/health
Response:
{
  "cluster_name": "elasticsearch",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 1,
  "active_shards": 1,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 1,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 50
}
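2 Index
Elastic indexes every field and, after processing, writes the result into an inverted index. Lookups go directly against that index. This is why the top-level unit of data management in Elastic is called an Index. It is the counterpart of a single database. The name of every Index (that is, every database) must be lowercase.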
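To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is not how Elastic stores data internally; the `tokenize` and `build_inverted_index` helpers and the sample documents are hypothetical, and the tokenizer is only a rough stand-in for a real analyzer.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Rough stand-in for an analyzer (an assumption, not the real one):
    lowercase, then split into ASCII words and single CJK characters."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

# Hypothetical sample data, used only for illustration.
docs = {
    "1": "database management",
    "2": "database engineer",
    "3": "sales management",
}
index = build_inverted_index(docs)

# A lookup is a dictionary access on the term, not a scan over every document.
print(index["database"])
print(index["management"])
```

The point of the structure is that a search for a term goes straight to the list of matching document ids.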
- List all Indices on the current node:
GET _cat/indices?v
Response:
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana BVGnJPzQSVCDgVY0vtbMmw   1   1          1            0      3.1kb          3.1kb
- Create an Index
PUT weather
Response:
{
  "acknowledged": true, // the operation succeeded
  "shards_acknowledged": true
}
Specify settings and mappings at creation time:
PUT test_index
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1
  },
  "mappings": {
    "test_type": {
      "properties": {
        "name": {
          "type": "text"
        }
      }
    }
  }
}
Response:
{
  "acknowledged": true,
  "shards_acknowledged": true
}
- Modify an Index
PUT test_index/_settings
{
  "number_of_replicas": 1
}
number_of_shards cannot be changed after the Index has been created.
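- Delete an Index
DELETE weather
Response:
{
  "acknowledged": true
}
GET _cat/indices?v
Response:
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana BVGnJPzQSVCDgVY0vtbMmw   1   1          1            0      3.1kb          3.1kb
Notes:
DELETE _all: deletes all Indices. If action.destructive_requires_name: true is set in elasticsearch.yml, this form can no longer be used to delete all Indices, and wildcard expressions are rejected as well.
DELETE index1,index2: deletes index1 and index2.
DELETE index*: deletes every Index whose name starts with index.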
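As a rough illustration of which Indices the three forms above would select, the following sketch resolves a target expression against a list of names. `resolve_targets` and the index names are hypothetical, and only `_all`, exact names, and `*` are modeled; the server's own resolution rules are richer than this.

```python
from fnmatch import fnmatchcase

def resolve_targets(expression, existing):
    """Return the index names a comma-separated expression would select.
    Assumption: only `_all`, exact names, and `*` wildcards are modeled."""
    selected = []
    for part in expression.split(","):
        for name in existing:
            hit = part == "_all" or fnmatchcase(name, part)
            if hit and name not in selected:
                selected.append(name)
    return selected

# Hypothetical index names.
existing = ["index1", "index2", "index_logs", "weather"]

print(resolve_targets("index1,index2", existing))
print(resolve_targets("index*", existing))
print(resolve_targets("_all", existing))
```

Running a dry resolution like this before a wildcard delete is one way to see how much a pattern would remove.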
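3 Type
Documents can be grouped. In the weather Index, for example, they can be grouped by city (Beijing and Shanghai) or by climate (sunny and rainy). Such a group is called a Type. It is a virtual, logical grouping used to filter Documents.
Different Types should have similar structures (schemas). For example, the id field cannot be a string in one group and a number in another. This is one difference from tables in a relational database. Data of a completely different nature (such as products and logs) should be stored as two Indices rather than as two Types inside one Index (although that is possible).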
Elastic 6.x allows only one Type per Index. Types are deprecated in 7.x and removed entirely in 8.x.
List the Types contained in each Index:
GET _mapping/?pretty=true
Response:
{
  ".kibana": {
    "mappings": {
      "server": {
        "properties": {
          "uuid": { "type": "keyword" }
        }
      },
      "index-pattern": {
        "properties": {
          "fieldFormatMap": { "type": "text" },
          "fields": { "type": "text" },
          "intervalName": { "type": "text" },
          "notExpandable": { "type": "boolean" },
          "sourceFilters": { "type": "text" },
          "timeFieldName": { "type": "text" },
          "title": { "type": "text" }
        }
      },
      "config": {
        "properties": {
          "buildNum": { "type": "keyword" }
        }
      }
    }
  }
}
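4 Document
A single record in an Index is called a Document. Many Documents make up an Index. Documents are represented as JSON. Documents in the same Index are not required to share the same structure (schema), but keeping it the same helps search efficiency.
- Create a document / full update (replace the document): specify _id
Full update: the record's _id stays the same, but the version (_version) is incremented by 1, the operation type (result) changes from created to updated, and the created field becomes false.
With this approach the old data is fetched first, then the old document is marked as deleted, and finally the new document is indexed. The time between fetching the old data and ES finishing the index operation is not under your control, so concurrent writers are more likely to conflict. For that reason this method is not recommended for updating a document.
PUT accounts/person/1
{
  "user": "张三",
  "title": "工程师",
  "desc": "数据库管理"
}
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}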
- Create a document: without specifying _id
POST accounts/person
{
  "user": "张三",
  "title": "工程师",
  "desc": "数据库管理"
}
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "AXCz0GZWNekRbwyPL8i2",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}
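- View a document
HTTP does not define any meaning for a request body on a GET request, and some clients and proxies will not send one. ES nevertheless considers GET the better semantic fit for queries, so it supports GET with a request body (most servers and browsers accept this as well). If you run into a case where it is not supported, change the GET request to a POST request.
pretty=true asks for the response in an easy-to-read format.
GET accounts/person/1/?pretty=true
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 1,
  "found": true, // the lookup succeeded
  "_source": { // the original record
    "user": "张三",
    "title": "工程师",
    "desc": "数据库管理"
  }
}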
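- Delete a document
A delete does not physically remove the document right away. The document is first marked as deleted (a logical delete); the physical removal happens later in the background, when segments are merged.
One way to verify this:
- Create a document with id 1; its _version is 1
- Delete this document; its _version becomes 2
- Create a document with id 1 again; its _version is 3
DELETE accounts/person/1
Response:
{
  "found": true,
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 6,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
GET accounts/person/1
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "found": false
}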
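The version sequence described in the verification steps can be imitated with a small in-memory model. This is only a sketch: `ToyIndex` is a hypothetical class, and it models nothing but the version counter and the "mark as deleted" step.

```python
class ToyIndex:
    """In-memory stand-in for one index (hypothetical). It models only the
    version counter and logical deletion, not real Elasticsearch storage."""

    def __init__(self):
        # _id -> {"version": int, "source": dict, or None once deleted}
        self._entries = {}

    def put(self, doc_id, source):
        entry = self._entries.get(doc_id)
        version = 1 if entry is None else entry["version"] + 1
        is_new = entry is None or entry["source"] is None
        self._entries[doc_id] = {"version": version, "source": dict(source)}
        return {"_id": doc_id, "_version": version,
                "result": "created" if is_new else "updated"}

    def delete(self, doc_id):
        entry = self._entries.get(doc_id)
        if entry is None or entry["source"] is None:
            return {"_id": doc_id, "result": "not_found"}
        # Logical delete: keep the entry and its version, drop the content.
        entry["version"] += 1
        entry["source"] = None
        return {"_id": doc_id, "_version": entry["version"], "result": "deleted"}

    def get(self, doc_id):
        entry = self._entries.get(doc_id)
        if entry is None or entry["source"] is None:
            return {"_id": doc_id, "found": False}
        return {"_id": doc_id, "_version": entry["version"],
                "found": True, "_source": entry["source"]}

idx = ToyIndex()
first = idx.put("1", {"user": "张三"})
removed = idx.delete("1")
missing = idx.get("1")
again = idx.put("1", {"user": "张三"})
print(first["_version"], removed["_version"], again["_version"])  # 1 2 3
```

Because the entry survives the delete, the counter keeps going instead of restarting at 1, which is the behavior the three steps are meant to reveal.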
- Update a document / partial update
A partial update does not need the full JSON document. _version is incremented by 1 and result becomes updated.
Advantages:
- The read, modify, and write-back steps all happen inside ES, which saves network transfer and improves performance
- The interval between the read and the write is shorter, which effectively reduces concurrency conflicts
POST /accounts/person/1/_update
{
  "doc": {
    "user": "张三san",
    "title": "工程师",
    "desc": "数据库管理"
  }
}
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
GET accounts/person/1
Response:
{
  "_index": "accounts",
  "_type": "person",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "user": "张三san",
    "title": "工程師",
    "desc": "数据库管理"
  }
}
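Conceptually, a partial update merges the submitted fields into the stored document and bumps the version. The sketch below shows that merge in plain Python; `apply_partial_update` is a hypothetical helper, and the recursive handling of nested objects is an assumption made for the illustration.

```python
import copy

def apply_partial_update(current, partial):
    """Return a new document with `partial` merged into `current`.
    Nested dicts are merged key by key; other values are overwritten."""
    merged = copy.deepcopy(current)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

stored = {
    "_version": 1,
    "_source": {"user": "张三", "title": "工程师", "desc": "数据库管理"},
}

# Only the changed field needs to be sent.
new_source = apply_partial_update(stored["_source"], {"user": "张三san"})
updated = {"_version": stored["_version"] + 1, "_source": new_source}
print(updated)
```

Only `user` has to travel over the network here; `title` and `desc` are carried over from the stored copy.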
- List all documents
GET accounts/person/_search
{
  "query": {
    "match_all": {}
  }
}
or simply:
GET accounts/person/_search
Response:
{
  "took": 1, // time taken by the operation, in milliseconds
  "timed_out": false, // whether the request timed out
  "_shards": { // shards involved in the search
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": { // matching records
    "total": 2, // number of matching records
    "max_score": 1, // highest relevance score
    "hits": [ // array of returned records
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "AXCz0GZWNekRbwyPL8i2",
        "_score": 1, // degree of match; results are sorted by this field in descending order by default
        "_source": {
          "user": "张三",
          "title": "工程师",
          "desc": "数据库管理"
        }
      },
      {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_score": 1,
        "_source": {
          "user": "张三san",
          "title": "工程师",
          "desc": "数据库管理"
        }
      }
    ]
  }
}
5 Full-text search
The input string is tokenized, and each token is looked up in the inverted index. A document is returned as long as it matches any one of the tokens.
GET accounts/person/_search
{
"from": 0, // offset of the first result
"size": 1, // number of results to return
"query": {
"match": {
"user": "san"
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.25316024,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "1",
"_score": 0.25316024,
"_source": {
"user": "张三san",
"title": "工程师",
"desc": "数据库管理"
}
}
]
}
}
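6 Phrase search
The field text must contain exactly the input string for a document to match and be returned, which is the opposite of full-text search.
Phrase search:
GET accounts/person/_search
{
"query": {
"match_phrase": {
"desc": "数据管理"
}
}
}
Response: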
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
The same input with full-text search:
GET accounts/person/_search
{
"query": {
"match": {
"desc": "数据管理"
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 1.1299736,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "5",
"_score": 1.1299736,
"_source": {
"user": "lisi",
"age": 50,
"salary": 6000,
"title": "业务员",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "1",
"_score": 1.1299736,
"_source": {
"user": "zeng",
"age": 30,
"salary": 20000,
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "3",
"_score": 1.1299736,
"_source": {
"user": "jim",
"age": 35,
"salary": 17000,
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "2",
"_score": 0.71613276,
"_source": {
"user": "sam",
"age": 20,
"salary": 15000,
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "4",
"_score": 0.71613276,
"_source": {
"user": "zhangsan",
"age": 50,
"salary": 5000,
"title": "业务员",
"desc": "数据库管理"
}
}
]
}
}
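The contrast between the two searches (0 hits versus 5 hits for the same input) can be reproduced with a toy model. This is a sketch under stated assumptions: `tokenize`, `match`, and `match_phrase` are hypothetical Python functions, and the tokenizer only approximates an analyzer by emitting one token per Chinese character.

```python
import re

def tokenize(text):
    """Rough stand-in for an analyzer (an assumption): lowercase ASCII
    words plus single CJK characters."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())

def match(field_text, query_text):
    """Full-text style: true if the field shares at least one token with the query."""
    field_tokens = set(tokenize(field_text))
    return any(token in field_tokens for token in tokenize(query_text))

def match_phrase(field_text, query_text):
    """Phrase style: true only if the query tokens appear consecutively, in order."""
    field_tokens = tokenize(field_text)
    query_tokens = tokenize(query_text)
    n = len(query_tokens)
    if n == 0:
        return False
    return any(field_tokens[i:i + n] == query_tokens
               for i in range(len(field_tokens) - n + 1))

desc = "数据库管理"
print(match(desc, "数据管理"))         # True: several single characters overlap
print(match_phrase(desc, "数据管理"))  # False: "库" sits between "据" and "管"
print(match_phrase(desc, "数据库"))    # True: the three tokens are adjacent
```

Every record in the example has the desc value "数据库管理", which is why the full-text search returns all five while the phrase search returns none.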
7 Boolean (logical) search
Existing records:
GET accounts/person/_search
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "AXCz0GZWNekRbwyPL8i2",
"_score": 1,
"_source": {
"user": "张三",
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "2",
"_score": 1,
"_source": {
"user": "王五 ",
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "1",
"_score": 1,
"_source": {
"user": "张三李四",
"title": "工程师",
"desc": "数据库管理"
}
}
]
}
}
AND search:
GET accounts/person/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"user": "四"
}
},
{
"match": {
"user": "三"
}
}
],
"must_not": [
{"match": {
"user": "五"
}}
]
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.5753642,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "1",
"_score": 0.5753642,
"_source": {
"user": "张三李四",
"title": "工程师",
"desc": "数据库管理"
}
}
]
}
}
OR search:
GET accounts/person/_search
{
"query": {
"match": {
"user": "张三 李四"
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1.1507283,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "1",
"_score": 1.1507283,
"_source": {
"user": "张三李四",
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "AXCz0GZWNekRbwyPL8i2",
"_score": 0.51623213,
"_source": {
"user": "张三",
"title": "工程师",
"desc": "数据库管理"
}
}
]
}
}
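The AND and OR results above can be checked against a small model of the same three records. This is only a sketch: `tokens`, `match`, and `bool_query` are hypothetical Python helpers, and treating every non-space character as one token is an assumption that happens to suit these short Chinese names.

```python
def tokens(text):
    """Toy analysis (an assumption): one token per non-space character."""
    return {ch for ch in text if not ch.isspace()}

def match(doc, field, query_text):
    """OR semantics: any query token found in the field is enough."""
    return bool(tokens(doc.get(field, "")) & tokens(query_text))

def bool_query(doc, must=(), must_not=()):
    """AND over `must`, NOT over `must_not`; each clause is (field, text)."""
    return (all(match(doc, f, q) for f, q in must)
            and not any(match(doc, f, q) for f, q in must_not))

# The three records listed at the start of this section.
docs = {
    "AXCz0GZWNekRbwyPL8i2": {"user": "张三"},
    "2": {"user": "王五 "},
    "1": {"user": "张三李四"},
}

and_hits = [i for i, d in docs.items()
            if bool_query(d, must=[("user", "四"), ("user", "三")],
                          must_not=[("user", "五")])]
or_hits = [i for i, d in docs.items() if match(d, "user", "张三 李四")]
print(and_hits)  # ['1']
print(or_hits)   # ['AXCz0GZWNekRbwyPL8i2', '1']
```

The model returns the same document ids as the two responses, although it makes no attempt to reproduce the scores or their ordering.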
8 Highlighted search
GET accounts/person/_search
{
"query": {
"match": {
"user": "sam"
}
},
"highlight": {
"pre_tags": [ // tag inserted before each highlighted fragment
"<a class='highlightClass' href='#'>"
],
"post_tags": [ // tag inserted after each highlighted fragment
"</a>"
],
"fields": { // fields to highlight
"user":{},
"desc":{}
}
}
}
Response:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.6931472,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "2",
"_score": 0.6931472,
"_source": {
"user": "sam",
"age": 20,
"salary": 15000,
"title": "工程师",
"desc": "数据库管理"
},
"highlight": {
"user": [
"<em>sam</em>"
]
}
}
]
}
}
9 Combined example
Existing data:
GET accounts/person/_search
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 1,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "5",
"_score": 1,
"_source": {
"user": "lisi",
"age": 50,
"salary": 6000,
"title": "业务员",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "2",
"_score": 1,
"_source": {
"user": "sam",
"age": 20,
"salary": 15000,
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "4",
"_score": 1,
"_source": {
"user": "zhangsan",
"age": 50,
"salary": 5000,
"title": "业务员",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "1",
"_score": 1,
"_source": {
"user": "zeng",
"age": 30,
"salary": 20000,
"title": "工程师",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "3",
"_score": 1,
"_source": {
"user": "jim",
"age": 35,
"salary": 17000,
"title": "工程师",
"desc": "数据库管理"
}
}
]
}
}
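1 Find the records whose title contains the character “工” and whose salary is between 16000 and 20000, sort them by age in descending order, and show only the age, title, and salary fields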
GET accounts/person/_search
{
"query": {
"bool": {
"must": [ // search conditions
{"match": {
"title": "工"
}}
],
"filter": { // filter conditions
"range": {
"salary": {
"gte": 16000,
"lte": 20000
}
}
}
}
},
"sort": [ // sort order
{
"age": "desc"
}
],
"_source": ["age","title","salary"] // fields to return
}
Response:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": null,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "3",
"_score": null,
"_source": {
"title": "工程师",
"age": 35
},
"sort": [
35
]
},
{
"_index": "accounts",
"_type": "person",
"_id": "1",
"_score": null,
"_source": {
"title": "工程师",
"age": 30
},
"sort": [
30
]
},
{
"_index": "accounts",
"_type": "person",
"_id": "2",
"_score": null,
"_source": {
"title": "工程师",
"age": 20
},
"sort": [
20
]
}
]
}
}
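Note: a search can specify a timeout. When the time is up, whatever results have been found so far are returned, for example GET _search?timeout=10ms. The unit ms means milliseconds, s means seconds, and m means minutes.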
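- Difference between filter and query
- filter: only keeps or drops documents according to the condition. It has no effect on the relevance score, and the most frequently used filters are cached automatically, so it performs best
- query: computes each document's relevance to the search condition and sorts by that relevance; the results are not cached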
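The difference can be sketched with a toy search function in which a query clause produces a score and filters only decide whether a document stays in the result. Everything here is hypothetical: the `score` rule (counting occurrences of a word), the `search` helper, and the sample documents are assumptions made for illustration, not the real scoring model.

```python
def score(doc, field, word):
    """Made-up relevance: how many times `word` occurs in the field."""
    return str(doc.get(field, "")).lower().split().count(word)

def search(docs, query=None, filters=()):
    """`query` is an optional (field, word) pair that produces a score.
    `filters` are predicates that only keep or drop a document."""
    hits = []
    for doc_id, doc in docs.items():
        if not all(keep(doc) for keep in filters):
            continue  # dropped by a filter; no score is involved
        s = score(doc, *query) if query else 1.0  # filter-only: constant score
        if s > 0:
            hits.append((doc_id, s))
    # Highest score first; the sort is stable, so ties keep insertion order.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical sample data.
docs = {
    "1": {"desc": "database management", "age": 30},
    "2": {"desc": "database backup and database tuning", "age": 50},
    "3": {"desc": "sales", "age": 50},
}

def is_fifty(doc):
    return doc["age"] == 50

scored = search(docs, query=("desc", "database"))
filtered = search(docs, query=("desc", "database"), filters=[is_fifty])
only_filter = search(docs, filters=[is_fifty])
print(scored)
print(filtered)
print(only_filter)
```

Adding the filter removes document 1 but leaves the score of document 2 untouched, and a filter-only search gives every hit the same score.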
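- Types of query
- match: analyzed matching. The input is tokenized first and the resulting tokens are searched; a document is returned as long as it contains any part of the match condition
- term: queries a structured field for a single exact value; the input is not run through an analyzer
- match_phrase: phrase matching. It looks for the exact phrase; if an analyzer is defined for the queried field, that analyzer is used to tokenize the input
- and so on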
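A common surprise follows from the rule that term input is not analyzed: a term search for a capitalized word can miss a text field whose tokens were lowercased at index time. The sketch below imitates this with hypothetical `analyze`, `term_query`, and `match_query` functions; the lowercasing tokenizer is an assumption standing in for a real analyzer.

```python
import re

def analyze(text):
    """Rough stand-in for an analyzer (an assumption): lowercase ASCII
    words plus single CJK characters."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())

# What the toy index stores for a field whose original value is "Sam Smith".
indexed_terms = set(analyze("Sam Smith"))

def term_query(value):
    """term: the input is compared as-is, without analysis."""
    return value in indexed_terms

def match_query(text):
    """match: the input is analyzed first, then any token may match."""
    return any(token in indexed_terms for token in analyze(text))

print(term_query("Sam"))   # False: the index holds "sam", not "Sam"
print(term_query("sam"))   # True
print(match_query("Sam"))  # True: the input is lowercased before the lookup
```

This is why term tends to suit exact values such as numbers or keyword fields, while match suits analyzed text.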
- constant_score: uses only a filter to select the data, so every hit gets the same score
GET /accounts/person/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"age": 50
}
}
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "accounts",
"_type": "person",
"_id": "5",
"_score": 1,
"_source": {
"user": "lisi",
"age": 50,
"salary": 6000,
"title": "业务员",
"desc": "数据库管理",
"tag": "girl"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "9",
"_score": 1,
"_source": {
"user": "lisi",
"age": 50,
"salary": 6000,
"title": "业务员",
"desc": "数据库管理"
}
},
{
"_index": "accounts",
"_type": "person",
"_id": "6",
"_score": 1,
"_source": {
"user": "lisi",
"age": 50,
"salary": 6000,
"title": "业务员",
"desc": "数据库管理"
}
}
]
}
}