ES Basic Operations

小猪快点跑

Article link: https://blog.csdn.net/weixin_41565755/article/details/108695396

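I. Index operations

1. Create an index

The request body is optional; it is mainly used to set mappings and settings. settings configures the number of primary shards and the number of replicas per shard (replicas only take effect in a multi-node cluster). If it is omitted, the defaults are 1 shard and 1 replica (since Elasticsearch 7.0; earlier versions defaulted to 5 shards).

(1) No configuration: index with no explicit mapping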

PUT /book

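(2) Configure settings only: index with no explicit mapping

number_of_shards is the number of primary shards; number_of_replicas is the number of replica copies of each primary shard.

PUT /book
{
  "settings":{
    "index":{
        "number_of_shards": "2",
        "number_of_replicas": "1"
    }
  }
}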

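(3) Configure settings and mappings

The mapping below nests the fields under a mapping type (person), which is the Elasticsearch 6.x form. On 7.x this form is accepted only with the include_type_name=true request parameter, and 8.x removed mapping types. The ik_max_word analyzer requires the IK analysis plugin.

PUT /itcast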
{
    "settings": {
        "index": {
            "number_of_shards": "1",
            "number_of_replicas": "0"
        }
    },
    "mappings": {
        "person": {
            "properties": {
                "name": {
                    "type": "text"
                },
                "age": {
                    "type": "integer"
                },
                "mail": {
                    "type": "keyword"
                },
                "hobby": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                }
            }
        }
    }
}
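
The settings and mappings body can also be assembled programmatically. The following is a minimal Python sketch, not part of the original article: build_index_body is a hypothetical helper, and it assumes the typeless mapping form (fields directly under "mappings"), which is what Elasticsearch 7.x and later expect by default. It only builds the JSON body; sending it is left to whatever HTTP client you use.

```python
import json


def build_index_body(shards, replicas, properties=None):
    """Return the JSON body for PUT /<index> as a Python dict (hypothetical helper)."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    if properties:
        # Assumption: typeless mapping, i.e. no "person" level under "mappings".
        body["mappings"] = {"properties": properties}
    return body


itcast_body = build_index_body(
    shards=1,
    replicas=0,
    properties={
        "name": {"type": "text"},
        "age": {"type": "integer"},
        "mail": {"type": "keyword"},
        "hobby": {"type": "text", "analyzer": "ik_max_word"},
    },
)
print(json.dumps(itcast_body, ensure_ascii=False, indent=2))
```

Calling build_index_body(2, 1) with no properties yields a settings-only body, which corresponds to case (2).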

2. View the index structure

GET /book

3. Delete an index

DELETE /book

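II. Document operations

If no mapping was set when the index was created, Elasticsearch creates one automatically (dynamic mapping) from the data types of the fields in the inserted documents.

1. Insert data

(1) Insert a document with an explicit _id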

POST /book/person/001
{
    "id": 1001,
    "name":"张三",
    "age":20,
    "sex":"男"
}

(2) Insert a document without an _id; Elasticsearch generates a random _id

POST /book/person
{
    "id": 1001,
    "name":"张三",
    "age":20,
    "sex":"男"
}

2. Delete data

Delete a document by _id

DELETE /book/person/001

3. Update data

(1) GET + PUT (fetch the document manually with GET, then overwrite it with PUT)

GET book/person/001

PUT book/person/001
{
    "id": 1001,
    "name": "张三",
    "age": 21,
    "sex": "男"
}

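(2) POST + _update (a partial update; internally Elasticsearch still fetches the document, merges the change, and reindexes the whole document)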

POST book/person/001/_update
{
    "doc": {
        "age": 23
    }
}
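
To make the fetch, merge, and overwrite cycle concrete, here is a small Python sketch that imitates what a partial update does to the stored _source. It is a toy model, not Elasticsearch code: merge_partial is a hypothetical name, and the assumption is that nested objects are merged recursively while any other value is replaced.

```python
import copy


def merge_partial(source, partial):
    """Recursively merge `partial` into a copy of `source` (toy model of a doc update)."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and arrays simply replace the old value.
            merged[key] = copy.deepcopy(value)
    return merged


stored = {"id": 1001, "name": "张三", "age": 21, "sex": "男"}
updated = merge_partial(stored, {"age": 23})
print(updated)
```

Only age changes; the other fields survive, which is the difference from a plain PUT that replaces the whole document.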

4. Search data

(1) Search all documents

GET book/person/_search

(2) Get a document by _id

GET book/person/001

(3) Search with a condition

GET book/person/_search?q=age:21

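III. Query DSL operations

1. Query data

Fields of type keyword, long, and similar are not analyzed, so they only support exact matching.

text fields support full-text matching: the analyzer splits the text into smaller terms, and each term is then matched exactly, like a keyword.

(1) match query

age is a long field, so match performs an exact match.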

POST /book/person/_search
{
    "query" : {
        "match" : {
            "age" : 21
        }
    }
}

name is a text field, so match performs a full-text match: a document matches if it contains any term from the query.

POST /book/person/_search
{
    "query": {
        "match": {
            "name": "张三 李四"
        }
    }
}

Highlight the matched terms (the request is followed by its response)

POST /book/person/_search
{
    "query" : {
        "match" : {
            "name" : "张三李四"
        }
    },
    "highlight": {
        "fields": {
            "name": {}
        }
    }
}

{
	"took": 5,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 2,
			"relation": "eq"
		},
		"max_score": 1.3862942,
		"hits": [
			{
				"_index": "book",
				"_type": "person",
				"_id": "N4BnqnQBnIVZyLbm7GCX",
				"_score": 1.3862942,
				"_source": {
					"id": 1002,
					"name": "李四",
					"age": 21,
					"sex": "女"
				},
				"highlight": {
					"name": [
						"<em>李</em><em>四</em>"
					]
				}
			},
			{
				"_index": "book",
				"_type": "person",
				"_id": "001",
				"_score": 1.3862942,
				"_source": {
					"id": 1001,
					"name": "张三",
					"age": 22,
					"sex": "男"
				},
				"highlight": {
					"name": [
						"<em>张</em><em>三</em>"
					]
				}
			}
		]
	}
}
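
The highlights in the response above wrap each character separately because the default standard analyzer emits one token per CJK character. The Python sketch below is a toy model of that behaviour, under the assumption that every non-space character is a token; char_tokens, matches, and highlight are illustrative names, not Elasticsearch APIs, and real scoring is ignored.

```python
def char_tokens(text):
    """Toy analyzer: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


def matches(field_value, query_text):
    """A match query on a text field hits when any query token appears in the field."""
    return bool(set(char_tokens(field_value)) & set(char_tokens(query_text)))


def highlight(field_value, query_text):
    """Wrap every matched token in <em> tags, as in the response shown above."""
    query_terms = set(char_tokens(query_text))
    return "".join(
        f"<em>{ch}</em>" if ch in query_terms else ch for ch in field_value
    )


names = ["张三", "李四", "王五"]
hits = [name for name in names if matches(name, "张三李四")]
print(hits)
print(highlight("李四", "张三李四"))
```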

(2) filter

filter and must are clauses of a bool query, so they have to be wrapped in bool.

POST /book/person/_search
{
    "query" : {
        "bool" : {
            "filter" : {
                "range": {
                    "age" : {
                        "gt": 20
                    }
                }
            },
            "must": {
                "match": {
                    "sex": "男"
                }
            }
        }
    }
}
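
When such requests are generated from code, building the body from small helpers makes it harder to forget the bool wrapper that filter and must require. This is a hedged Python sketch; range_clause, match_clause, and bool_search_body are hypothetical helpers that only produce the JSON body.

```python
import json


def range_clause(field, **bounds):
    """Build a range clause, e.g. range_clause("age", gt=20)."""
    return {"range": {field: bounds}}


def match_clause(field, text):
    """Build a match clause for one field."""
    return {"match": {field: text}}


def bool_search_body(must=None, filters=None):
    """Wrap must and filter clauses in the bool query they belong to."""
    bool_part = {}
    if must:
        bool_part["must"] = must
    if filters:
        bool_part["filter"] = filters
    return {"query": {"bool": bool_part}}


body = bool_search_body(
    must=[match_clause("sex", "男")],
    filters=[range_clause("age", gt=20)],
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```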

(3) Aggregation query

POST /book/person/_search
{
    "aggs": {
        "all_interests": {
            "terms": {
                "field": "age"
            }
        }
    }
}

{
	"took": 142,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 4,
			"relation": "eq"
		},
		"max_score": 1,
		"hits": [
			{
				"_index": "book",
				"_type": "person",
				"_id": "N4BnqnQBnIVZyLbm7GCX",
				"_score": 1,
				"_source": {
					"id": 1002,
					"name": "李四",
					"age": 21,
					"sex": "女"
				}
			},
			{
				"_index": "book",
				"_type": "person",
				"_id": "001",
				"_score": 1,
				"_source": {
					"id": 1001,
					"name": "张三",
					"age": 22,
					"sex": "男"
				}
			},
			{
				"_index": "book",
				"_type": "person",
				"_id": "003",
				"_score": 1,
				"_source": {
					"id": 1003,
					"name": "王五",
					"age": 22,
					"sex": "男"
				}
			},
			{
				"_index": "book",
				"_type": "person",
				"_id": "004",
				"_score": 1,
				"_source": {
					"id": 1004,
					"name": "赵六",
					"age": 23,
					"sex": "女"
				}
			}
		]
	},
	"aggregations": {
		"all_interests": {
			"doc_count_error_upper_bound": 0,
			"sum_other_doc_count": 0,
			"buckets": [
				{
					"key": 22,
					"doc_count": 2
				},
				{
					"key": 21,
					"doc_count": 1
				},
				{
					"key": 23,
					"doc_count": 1
				}
			]
		}
	}
}
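
The buckets in the response above are a group-and-count over the age field. As a rough Python equivalent (a sketch only; terms_buckets is a hypothetical function, and the ordering assumes the default of descending doc_count with ties broken by ascending key):

```python
from collections import Counter


def terms_buckets(docs, field, size=10):
    """Group documents by a field and count them, like a terms aggregation."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    # Highest count first; equal counts are ordered by key.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered[:size]]


people = [
    {"id": 1002, "name": "李四", "age": 21, "sex": "女"},
    {"id": 1001, "name": "张三", "age": 22, "sex": "男"},
    {"id": 1003, "name": "王五", "age": 22, "sex": "男"},
    {"id": 1004, "name": "赵六", "age": 23, "sex": "女"},
]
buckets = terms_buckets(people, "age")
print(buckets)
```

The sample documents are the four hits from the response, so the result lines up with its buckets array.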

(4) Return only selected fields

GET /book/person/001?_source=id,name

{
	"_index": "book",
	"_type": "person",
	"_id": "001",
	"_version": 6,
	"_seq_no": 10,
	"_primary_term": 1,
	"found": true,
	"_source": {
		"name": "张三",
		"id": 1001
	}
}

(5) Return only the source document

GET /book/person/001/_source

{
	"id": 1001,
	"name": "张三",
	"age": 22,
	"sex": "男"
}

GET /book/person/001/_source?_source=id,name

{
	"name": "张三",
	"id": 1001
}
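
The effect of the _source=id,name parameter can be pictured as a simple projection of the stored document. A minimal Python sketch, assuming plain top-level field names only (the real parameter also accepts wildcard patterns); filter_source is a hypothetical name:

```python
def filter_source(source, fields):
    """Keep only the comma-separated top-level fields of a stored document."""
    wanted = {name.strip() for name in fields.split(",") if name.strip()}
    return {key: value for key, value in source.items() if key in wanted}


stored = {"id": 1001, "name": "张三", "age": 22, "sex": "男"}
projected = filter_source(stored, "id,name")
print(projected)
```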

IV. Bulk operations

1. Bulk get (_mget)

POST /book/person/_mget
{
    "ids":["001", "002"]
}

2. Bulk insert

The request body must end with a newline, so leave an empty line after the last action.

POST /book/person/_bulk
{"create":{"_index":"book","_type":"person","_id":2001}}
{"id":2001,"name":"name1","age": 20,"sex": "男"}
{"create":{"_index":"book","_type":"person","_id":2002}}
{"id":2002,"name":"name2","age": 20,"sex": "男"}
{"create":{"_index":"book","_type":"person","_id":2003}}
{"id":2003,"name":"name3","age": 20,"sex": "男"}

3. Bulk delete

The request body must end with a newline, so leave an empty line after the last action.

POST /book/person/_bulk
{"delete":{"_index":"book","_type":"person","_id":2001}}
{"delete":{"_index":"book","_type":"person","_id":2002}}
{"delete":{"_index":"book","_type":"person","_id":2003}}

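How large should a single request be for the best performance?
The entire bulk request has to be loaded into memory on the node that receives it, so the larger the request, the less memory is left for other requests. There is an optimal bulk request size; beyond it, performance stops improving and may even drop.
The optimal size is not a fixed number. It depends entirely on your hardware, the size and complexity of your documents, and your indexing and search load.
Fortunately, this sweet spot is easy to find: index typical documents in batches of increasing size, and when performance starts to drop, the batch is too large. A good starting point is 1,000 to 5,000 documents per batch; use smaller batches if your documents are very large.
It is often useful to look at the physical size of a batch as well. A thousand 1 kB documents are very different from a thousand 1 MB documents. A good batch is best kept between 5 and 15 MB.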
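
The two rules above (newline-terminated body, bounded batch size) can be sketched in Python as follows. This is an illustration under assumptions rather than a tuned implementation: bulk_body and chunk_docs are hypothetical helpers, the document's id field is reused as _id, and the default limits of 1,000 documents and 5 MB are simply the low end of the ranges quoted above.

```python
import json


def bulk_body(index, docs, action="create", doc_type=None):
    """Build a newline-delimited _bulk body; every line, including the last, ends with a newline."""
    lines = []
    for doc in docs:
        meta = {"_index": index, "_id": str(doc["id"])}
        if doc_type:
            # _type is only meaningful on versions that still have mapping types.
            meta["_type"] = doc_type
        lines.append(json.dumps({action: meta}, ensure_ascii=False))
        if action != "delete":
            # create/index actions are followed by the document itself.
            lines.append(json.dumps(doc, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def chunk_docs(docs, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield batches that stay under both a document count and a byte budget."""
    batch, batch_bytes = [], 0
    for doc in docs:
        size = len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch


people = [
    {"id": 2001, "name": "name1", "age": 20, "sex": "男"},
    {"id": 2002, "name": "name2", "age": 20, "sex": "男"},
    {"id": 2003, "name": "name3", "age": 20, "sex": "男"},
]
create_body = bulk_body("book", people, doc_type="person")
delete_body = bulk_body("book", people, action="delete", doc_type="person")
batch_sizes = [len(batch) for batch in chunk_docs(people, max_docs=2)]
print(create_body)
```

To find the sweet spot, one would time requests while raising max_docs or max_bytes step by step and stop when throughput no longer improves.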

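V. Compound queries

1. The bool layer

bool: combines the results of multiple query clauses with Boolean logic. It supports the following clause types.

2. The clause layer

filter: every clause must match, but the clauses are not scored and their results can be cached; most often used as filter + term.

must: every clause must match; equivalent to AND.

must_not: no clause may match; equivalent to NOT.

should: at least one clause should match; equivalent to OR. When the same bool query also has must or filter clauses, should clauses become optional unless minimum_should_match is set.

3. The query layer

term: exact match against the unanalyzed value (for example a keyword field).

match: exact match on keyword fields, full-text match on text fields.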

{
	"query": {
		"bool": {
			"must": {
    				"match": {
    					"age": "22"
    				}
    		},
			"must_not": []
		}
	}
}

{
	"query": {
		"bool": {
			"must": [
				{
					"match": {
						"age": "22"
					}
				},
				{
					"match": {
						"sex": "男"
					}
				}
			],
			"must_not": []
		}
	}
}

{
	"query": {
		"bool": {
			"must": [
				{
					"term": {
						"age": "22"
					}
				},
				{
					"match": {
						"name": "张三"
					}
				}
			],
			"must_not": []
		}
	}
}

{
	"query": {
		"bool": {
			"filter": [
				{
					"term": {
						"age": "22"
					}
				},
				{
					"match": {
						"sex": "男"
					}
				}
			],
			"must": [],
			"must_not": [],
			"should": []
		}
	}
}
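
The clause semantics can be checked against plain Python data. The sketch below is a toy evaluator, not how Elasticsearch executes queries: term and bool_matches are hypothetical names, scoring is ignored, and should is treated as required only when no must or filter clause is present.

```python
def term(field, value):
    """Clause that is true when the field equals the value exactly."""
    return lambda doc: doc.get(field) == value


def bool_matches(doc, must=(), filters=(), must_not=(), should=()):
    """Decide whether one document satisfies a bool query (no scoring)."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    if should and not must and not filters:
        # With nothing else required, at least one should clause has to match.
        return any(clause(doc) for clause in should)
    return True


PEOPLE = [
    {"id": 1002, "name": "李四", "age": 21, "sex": "女"},
    {"id": 1001, "name": "张三", "age": 22, "sex": "男"},
    {"id": 1003, "name": "王五", "age": 22, "sex": "男"},
    {"id": 1004, "name": "赵六", "age": 23, "sex": "女"},
]


def search(**clauses):
    """Return the ids of all sample documents that satisfy the clauses."""
    return [doc["id"] for doc in PEOPLE if bool_matches(doc, **clauses)]


both = search(must=[term("age", 22), term("sex", "男")])
either = search(should=[term("age", 21), term("age", 23)])
not_male = search(must_not=[term("sex", "男")])
print(both, either, not_male)
```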

VI. Search a specific shard

Use preference=_shards:0 to restrict a search to specific shards; 0 is the shard number.

GET /book/person/_search?preference=_shards:0

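VII. Upsert

Update the document if it exists; otherwise insert a new one. When the document exists, the doc fields are merged into it; when it does not, the content of upsert is indexed as the new document. Elasticsearch 6.0 and later reject requests with a body unless a Content-Type header is sent.

curl -XPOST 'localhost:9200/book/person/001/_update' -H 'Content-Type: application/json' -d '{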
    "doc" : {
        "age" : 23
    },
    "upsert" : {
        "counter" : 1
    }
}'


curl -XPOST 'localhost:9200/book/person/008/_update' -H 'Content-Type: application/json' -d '{
    "doc" : {
        "age" : 23
    },
    "upsert" : {
        "counter" : 1
    }
}'
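
The difference between the two requests is easiest to see on an in-memory stand-in for the index. In this Python sketch the dict store plays the role of the index and update_with_upsert is a hypothetical function; it assumes document 001 already exists and 008 does not, and it models only the doc and upsert keys used above.

```python
def update_with_upsert(store, doc_id, doc, upsert):
    """Merge `doc` into an existing document, or insert `upsert` when it is missing."""
    if doc_id in store:
        store[doc_id] = {**store[doc_id], **doc}
        return "updated"
    # The document is missing: `doc` is ignored and `upsert` becomes the document.
    store[doc_id] = dict(upsert)
    return "created"


store = {"001": {"id": 1001, "name": "张三", "age": 22, "sex": "男"}}
first = update_with_upsert(store, "001", {"age": 23}, {"counter": 1})
second = update_with_upsert(store, "008", {"age": 23}, {"counter": 1})
print(first, store["001"])
print(second, store["008"])
```

Note that the new document 008 contains only counter, not age, because the doc part is applied only to documents that already exist.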

VIII. `multi_match`

The multi_match query is a convenient way to run the same query against multiple fields.

Multiple fields

GET /_search
{
  "query": {
    "multi_match" : {
      "query":    "this is a test", 
      "fields": [ "subject", "message" ] 
    }
  }
}

Multiple fields, with wildcards in field names

GET /_search
{
  "query": {
    "multi_match" : {
      "query":    "Will Smith",
      "fields": [ "title", "*_name" ] 
    }
  }
}

Multiple fields, with a per-field boost (subject counts three times as much as message)

GET /_search
{
  "query": {
    "multi_match" : {
      "query" : "this is a test",
      "fields" : [ "subject^3", "message" ] 
    }
  }
}

The best_fields type generates a match query for each field and wraps them in a dis_max query, to find the single best matching field. For instance, this query:

GET /_search
{
  "query": {
    "multi_match" : {
      "query":      "brown fox",
      "type":       "best_fields",
      "fields":     [ "subject", "message" ],
      "tie_breaker": 0.3
    }
  }
}

would be executed as:

GET /_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "subject": "brown fox" }},
        { "match": { "message": "brown fox" }}
      ],
      "tie_breaker": 0.3
    }
  }
}
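
Both the rewrite and the effect of tie_breaker can be sketched in a few lines of Python. expand_best_fields and dis_max_score are hypothetical helpers; the per-field scores passed to dis_max_score are made-up inputs, and the formula assumed is the best field's score plus tie_breaker times the scores of the other matching fields.

```python
def expand_best_fields(query, fields, tie_breaker=0.0):
    """Rewrite a best_fields multi_match into the equivalent dis_max query body."""
    return {
        "query": {
            "dis_max": {
                "queries": [{"match": {field: query}} for field in fields],
                "tie_breaker": tie_breaker,
            }
        }
    }


def dis_max_score(field_scores, tie_breaker=0.0):
    """Best field score plus tie_breaker times the other matching fields' scores."""
    matching = [score for score in field_scores if score > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


expanded = expand_best_fields("brown fox", ["subject", "message"], tie_breaker=0.3)
# Hypothetical scores: subject matched with 2.0, message with 1.0.
combined = dis_max_score([2.0, 1.0], tie_breaker=0.3)
print(expanded)
print(combined)
```

With tie_breaker left at 0 only the best field counts; raising it toward 1 moves the result toward the sum of all matching fields.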
