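_id is just a string. Combined with _index and _type, it uniquely identifies a document in Elasticsearch. When you create a
document, you can supply your own _id or let Elasticsearch generate one automatically (a 20-character string).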
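I. Index operations
1. Create an index
The request body is optional; it is mainly used to set mappings and settings. settings controls the number of primary shards and the number of replica copies per shard (relevant in a cluster). If omitted, the index gets 1 primary shard and 1 replica (the default since Elasticsearch 7.0; earlier versions defaulted to 5 primary shards).
(1) No configuration (unstructured index: no explicit mapping)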
PUT /book
(2) Settings only (unstructured index: no explicit mapping)
PUT /book
{
"settings":{
"index":{
"number_of_shards": "2", # number of primary shards
"number_of_replicas": "1" # number of replicas
}
}
}
(3) Settings plus mapping
PUT /itcast
{
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "0"
}
},
"mappings": {
"person": {
"properties": {
"name": {
"type": "text"
},
"age": {
"type": "integer"
},
"mail": {
"type": "keyword"
},
"hobby": {
"type": "text",
"analyzer": "ik_max_word"
}
}
}
}
}
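The body above can also be assembled programmatically. The following is a minimal sketch, not an official client API: build_index_body and its parameters are hypothetical names, and it only produces the JSON that would be sent with PUT /<index>.

```python
import json


def build_index_body(shards, replicas, properties=None, doc_type=None):
    """Build a request body for PUT /<index> (hypothetical helper)."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    if properties is not None:
        mapping = {"properties": properties}
        # Typed mappings, as in the example above, nest under the type name.
        body["mappings"] = {doc_type: mapping} if doc_type else mapping
    return body


body = build_index_body(
    1, 0,
    {"name": {"type": "text"}, "mail": {"type": "keyword"}},
    doc_type="person",
)
print(json.dumps(body, indent=2))
```

Passing no properties yields a settings-only body, which corresponds to case (2).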
2. View the index structure
GET /book
3. Delete an index
DELETE /book
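II. Document operations
If no mapping was defined when the index was created, Elasticsearch creates one automatically from the data types of the inserted documents (dynamic mapping).
1. Insert documents
(1) Insert a document with an explicit _id
POST /book/person/001
{
"id": 1001,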
"name":"张三",
"age":20,
"sex":"男"
}
(2) Insert a document without an _id; Elasticsearch generates a random _id
POST /book/person
{
"id": 1001,
"name":"张三",
"age":20,
"sex":"男"
}
2. Delete documents
Delete a document by _id
DELETE /book/person/001
3. Update documents
(1) GET + PUT (manually GET the document, then PUT the full replacement)
GET book/person/001
PUT book/person/001
{
"id": 1001,
"name": "张三",
"age": 21,
"sex": "男"
}
(2) POST + _update (internally Elasticsearch still fetches the document, merges the change, and writes the whole document back)
POST book/person/001/_update
{
"doc": {
"age": 23
}
}
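The merge that a partial update performs can be modelled in a few lines. This is only a sketch of the semantics (nested objects merged recursively, other values replaced); merge_partial is a hypothetical name and nothing here talks to Elasticsearch.

```python
def merge_partial(source, doc):
    """Return a copy of source with the partial document doc merged in."""
    merged = dict(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and arrays simply replace the old value.
            merged[key] = value
    return merged


stored = {"id": 1001, "name": "张三", "age": 21, "sex": "男"}
updated = merge_partial(stored, {"age": 23})
print(updated)
```

Only age changes; the fields not mentioned in doc keep their stored values, which is the difference from a full PUT.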
4. Search documents
(1) Search all documents
GET book/person/_search
(2) Fetch a document by _id
GET book/person/001
(3) Search with a condition
GET book/person/_search?q=age:21
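III. Query DSL operations
1. Query data
keyword, long, and similar types are not analyzed, so they only support exact matching.
text fields support full-text (fuzzy) matching: the analyzer splits the text into smaller terms, and each term is then matched like a keyword.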
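The difference between the two kinds of matching can be illustrated with a toy model. The sketch below is an assumption-laden simplification: toy_analyze is a hypothetical stand-in for a real analyzer (it lowercases ASCII words and emits one term per non-ASCII character), and the two match functions only mimic the matching rule, not scoring.

```python
def toy_analyze(text):
    """Very rough analyzer: ASCII words as-is, other words one term per character."""
    terms = []
    for word in text.split():
        if word.isascii():
            terms.append(word.lower())
        else:
            terms.extend(word)  # each character becomes its own term
    return terms


def keyword_match(field_value, query):
    """keyword-style matching: the whole value must be identical."""
    return field_value == query


def text_match(field_value, query):
    """text-style matching: at least one analyzed term in common."""
    return bool(set(toy_analyze(field_value)) & set(toy_analyze(query)))


print(keyword_match("张三", "张三 李四"))
print(text_match("李四", "张三 李四"))
```

A value stored as keyword only matches the identical string, while the same value stored as text matches any query that shares a term with it.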
(1) match query
age is a long field, so match performs an exact match.
POST /book/person/_search
{
"query" : {
"match" : {
"age" : 21
}
}
}
name is a text field, so match performs a full-text match: a document matches if it contains any term from the query.
POST /book/person/_search
{
"query": {
"match": {
"name": "张三 李四"
}
}
}
Highlight the terms that matched
POST /book/person/_search
{
"query" : {
"match" : {
"name" : "张三李四"
}
},
"highlight": {
"fields": {
"name": {}
}
}
}
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.3862942,
"hits": [
{
"_index": "book",
"_type": "person",
"_id": "N4BnqnQBnIVZyLbm7GCX",
"_score": 1.3862942,
"_source": {
"id": 1002,
"name": "李四",
"age": 21,
"sex": "女"
},
"highlight": {
"name": [
"<em>李</em><em>四</em>"
]
}
},
{
"_index": "book",
"_type": "person",
"_id": "001",
"_score": 1.3862942,
"_source": {
"id": 1001,
"name": "张三",
"age": 22,
"sex": "男"
},
"highlight": {
"name": [
"<em>张</em><em>三</em>"
]
}
}
]
}
}
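The highlight fragments in the response above wrap every matched term in <em> tags. A minimal sketch of that idea, assuming one term per character as the response suggests (toy_highlight is a hypothetical helper, not how Elasticsearch implements highlighting):

```python
def toy_highlight(field_value, query_terms, pre="<em>", post="</em>"):
    """Wrap each character of field_value that is a query term in pre/post tags."""
    return "".join(
        f"{pre}{ch}{post}" if ch in query_terms else ch
        for ch in field_value
    )


query_terms = set("张三李四")  # one term per character
print(toy_highlight("李四", query_terms))
print(toy_highlight("张三", query_terms))
```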
(2) filter
POST /book/person/_search
{
"query": {
"bool": {
"filter": {
"range": {
"age": {
"gt": 20
}
}
},
"must": {
"match": {
"sex": "男"
}
}
}
}
}
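filter and must are clauses of a bool query, so both have to sit inside a "bool" object. A small builder makes that nesting hard to get wrong. This is a sketch only; bool_query and its parameter names are hypothetical, and it merely returns the request body as a dict.

```python
import json


def bool_query(must=None, filters=None, must_not=None, should=None):
    """Build a search body whose query is a bool query; empty clauses are omitted."""
    clauses = {
        "must": must,
        "filter": filters,
        "must_not": must_not,
        "should": should,
    }
    return {"query": {"bool": {name: c for name, c in clauses.items() if c}}}


body = bool_query(
    must={"match": {"sex": "男"}},
    filters={"range": {"age": {"gt": 20}}},
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```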
(3) Aggregation query
POST /book/person/_search
{
"aggs": {
"all_interests": {
"terms": {
"field": "age"
}
}
}
}
{
"took": 142,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 4,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "book",
"_type": "person",
"_id": "N4BnqnQBnIVZyLbm7GCX",
"_score": 1,
"_source": {
"id": 1002,
"name": "李四",
"age": 21,
"sex": "女"
}
},
{
"_index": "book",
"_type": "person",
"_id": "001",
"_score": 1,
"_source": {
"id": 1001,
"name": "张三",
"age": 22,
"sex": "男"
}
},
{
"_index": "book",
"_type": "person",
"_id": "003",
"_score": 1,
"_source": {
"id": 1003,
"name": "王五",
"age": 22,
"sex": "男"
}
},
{
"_index": "book",
"_type": "person",
"_id": "004",
"_score": 1,
"_source": {
"id": 1004,
"name": "赵六",
"age": 23,
"sex": "女"
}
}
]
},
"aggregations": {
"all_interests": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 22,
"doc_count": 2
},
{
"key": 21,
"doc_count": 1
},
{
"key": 23,
"doc_count": 1
}
]
}
}
}
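The buckets in the response are simply counts per distinct value. The sketch below reproduces them from the four hits shown above; terms_buckets is a hypothetical helper, and the ordering (highest count first, ties by ascending key) is an assumption that matches this response.

```python
from collections import Counter


def terms_buckets(docs, field):
    """Group docs by field value and count them, like a terms aggregation."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


docs = [
    {"name": "李四", "age": 21},
    {"name": "张三", "age": 22},
    {"name": "王五", "age": 22},
    {"name": "赵六", "age": 23},
]
print(terms_buckets(docs, "age"))
```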
(4) Return only selected fields
GET /book/person/001?_source=id,name
{
"_index": "book",
"_type": "person",
"_id": "001",
"_version": 6,
"_seq_no": 10,
"_primary_term": 1,
"found": true,
"_source": {
"name": "张三",
"id": 1001
}
}
(5) Return only the original document (_source)
GET /book/person/001/_source
{
"id": 1001,
"name": "张三",
"age": 22,
"sex": "男"
}
GET /book/person/001/_source?_source=id,name
{
"name": "张三",
"id": 1001
}
IV. Bulk operations
1. Multi-get
POST /book/person/_mget
{
"ids":["001", "002"]
}
2. Bulk insert
The request body must end with a newline, so leave one empty line after the last entry.
POST /book/person/_bulk
{"create":{"_index":"book","_type":"person","_id":2001}}
{"id":2001,"name":"name1","age": 20,"sex": "男"}
{"create":{"_index":"book","_type":"person","_id":2002}}
{"id":2002,"name":"name2","age": 20,"sex": "男"}
{"create":{"_index":"book","_type":"person","_id":2003}}
{"id":2003,"name":"name3","age": 20,"sex": "男"}
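A bulk body is newline-delimited JSON: one action line, then one document line, and a newline after the last line. The following sketch builds such a body for create actions; bulk_create_body is a hypothetical helper, and taking _id from each document's id field is an assumption made for this example.

```python
import json


def bulk_create_body(index, doc_type, docs):
    """Build the NDJSON body for POST /_bulk with one create action per document."""
    lines = []
    for doc in docs:
        action = {"create": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The trailing newline is required by the bulk API.
    return "\n".join(lines) + "\n"


docs = [
    {"id": 2001, "name": "name1", "age": 20, "sex": "男"},
    {"id": 2002, "name": "name2", "age": 20, "sex": "男"},
]
payload = bulk_create_body("book", "person", docs)
print(payload, end="")
```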
3. Bulk delete
The request body must end with a newline, so leave one empty line after the last entry.
POST /book/person/_bulk
{"delete":{"_index":"book","_type":"person","_id":2001}}
{"delete":{"_index":"book","_type":"person","_id":2002}}
{"delete":{"_index":"book","_type":"person","_id":2003}}
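How large should a single bulk request be?
The entire bulk request has to be loaded into memory on the node that receives it, so the larger the request, the less memory is left for other requests. There is an optimal bulk size; beyond it, performance stops improving and may even drop.
That optimal size is not a fixed number. It depends entirely on your hardware, the size and complexity of your documents, and the indexing and search load.
Fortunately, the sweet spot is easy to find: index typical documents in batches of increasing size, and when performance starts to drop, the batch is too large. A reasonable starting point is 1,000 to 5,000 documents per batch; use smaller batches if your documents are very large.
It is also useful to watch the physical size of each request: a thousand 1 kB documents is very different from a thousand 1 MB documents. A good batch size to start experimenting with is around 5 to 15 MB.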
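One way to apply both limits at once is to cut a batch when either the document count or the byte size would be exceeded. The sketch below assumes the documents are already serialized as JSON strings; chunk_bulk and its default limits are hypothetical starting values to tune, not recommendations from Elasticsearch.

```python
def chunk_bulk(json_lines, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield lists of lines, each within max_docs lines and roughly max_bytes bytes."""
    batch, size = [], 0
    for line in json_lines:
        line_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline
        if batch and (len(batch) >= max_docs or size + line_bytes > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(line)
        size += line_bytes
    if batch:
        yield batch


lines = ['{"id": %d}' % i for i in range(5)]
batches = list(chunk_bulk(lines, max_docs=2))
print([len(b) for b in batches])
```

A single line larger than max_bytes still gets its own batch rather than being dropped.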
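V. Compound queries
1. The bool layer
bool: combines several query clauses with Boolean logic. It supports the following clause types.
2. The clause layer
filter: clauses must match, but they do not affect the score and their results can be cached. Usually combined with term.
must: every clause must match; equivalent to AND.
must_not: no clause may match; equivalent to NOT.
should: at least one clause should match; equivalent to OR.
3. The query layer
term: exact match, intended for keyword fields.
match: exact match on keyword fields, full-text match on text fields.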
{
"query": {
"bool": {
"must": {
"match": {
"age": "22"
}
},
"must_not": []
}
}
}
{
"query": {
"bool": {
"must": [
{
"match": {
"age": "22"
}
},
{
"match": {
"sex": "男"
}
}
],
"must_not": []
}
}
}
{
"query": {
"bool": {
"must": [
{
"term": {
"age": "22"
}
},
{
"match": {
"name": "张三"
}
}
],
"must_not": []
}
}
}
{
"query": {
"bool": {
"filter": [
{
"term": {
"age": "22"
}
},
{
"match": {
"sex": "男"
}
}
],
"must": [],
"must_not": [],
"should": []
}
}
}
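The Boolean semantics of the four clause types can be checked against in-memory data. This is a sketch of the matching logic only (scoring is ignored); matches_bool is a hypothetical function, each clause is modelled as a Python predicate, and should is treated as required only when there is no must or filter clause.

```python
def matches_bool(doc, must=(), filters=(), must_not=(), should=()):
    """Return True if doc satisfies the bool query made of the given predicates."""
    required = all(p(doc) for p in list(must) + list(filters))
    excluded = any(p(doc) for p in must_not)
    # should only decides the match when nothing else is required.
    if should and not (must or filters):
        optional_ok = any(p(doc) for p in should)
    else:
        optional_ok = True
    return required and not excluded and optional_ok


docs = [
    {"name": "张三", "age": 22, "sex": "男"},
    {"name": "李四", "age": 21, "sex": "女"},
    {"name": "王五", "age": 22, "sex": "男"},
    {"name": "赵六", "age": 23, "sex": "女"},
]
age_22 = lambda d: d["age"] == 22
male = lambda d: d["sex"] == "男"

print([d["name"] for d in docs if matches_bool(d, must=[age_22, male])])
```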
VI. Search a specific shard
Use preference=_shards:0 to restrict the search to a specific shard; 0 is the shard number.
GET /book/person/_search?preference=_shards:0
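VII. upsert
Update or insert: if the document exists, the fields in doc are merged into it; otherwise the content of upsert is indexed as a new document.
curl -XPOST 'localhost:9200/book/person/001/_update' -H 'Content-Type: application/json' -d '{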
"doc" : {
"age" : 23
},
"upsert" : {
"counter" : 1
}
}'
curl -XPOST 'localhost:9200/book/person/008/_update' -H 'Content-Type: application/json' -d '{
"doc" : {
"age" : 23
},
"upsert" : {
"counter" : 1
}
}'
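The two requests above behave differently depending on whether the document exists. The sketch below models that with a plain dict as the document store; update_with_upsert is a hypothetical function, and the merge is shallow for brevity.

```python
def update_with_upsert(store, doc_id, doc, upsert):
    """Merge doc into an existing document, or insert upsert if it is missing."""
    if doc_id in store:
        store[doc_id] = {**store[doc_id], **doc}
        return "updated"
    # Document does not exist: doc is ignored and upsert becomes the document.
    store[doc_id] = dict(upsert)
    return "created"


store = {"001": {"id": 1001, "name": "张三", "age": 22, "sex": "男"}}
print(update_with_upsert(store, "001", {"age": 23}, {"counter": 1}))
print(update_with_upsert(store, "008", {"age": 23}, {"counter": 1}))
print(store["008"])
```

Note that the new document 008 contains only counter; the age from doc is not applied when the upsert branch is taken.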
VIII. multi_match
The multi_match query is a convenient way to run the same query against several fields.
Multiple fields
GET /_search
{
"query": {
"multi_match" : {
"query": "this is a test",
"fields": [ "subject", "message" ]
}
}
}
Multiple fields, including a wildcard field name
GET /_search
{
"query": {
"multi_match" : {
"query": "Will Smith",
"fields": [ "title", "*_name" ]
}
}
}
Multiple fields with a boost (subject counts three times as much as message)
GET /_search
{
"query": {
"multi_match" : {
"query" : "this is a test",
"fields" : [ "subject^3", "message" ]
}
}
}
The best_fields type generates a match query for each field and wraps them in a dis_max query, to find the single best matching field. For instance, this query:
GET /_search
{
"query": {
"multi_match" : {
"query": "brown fox",
"type": "best_fields",
"fields": [ "subject", "message" ],
"tie_breaker": 0.3
}
}
}
would be executed as:
GET /_search
{
"query": {
"dis_max": {
"queries": [
{ "match": { "subject": "brown fox" }},
{ "match": { "message": "brown fox" }}
],
"tie_breaker": 0.3
}
}
}
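The way dis_max combines the per-field scores can be written as a one-line formula: the best score plus tie_breaker times the sum of the other matching scores. The sketch below applies it to made-up scores; dis_max_score is a hypothetical helper and the numbers are illustrative, not real Elasticsearch output.

```python
def dis_max_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the way a dis_max query does."""
    matching = [s for s in field_scores if s > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


# Hypothetical scores for the "subject" and "message" fields.
print(dis_max_score([2.0, 1.0], tie_breaker=0.3))
print(dis_max_score([2.0, 1.0]))
```

With tie_breaker set to 0 only the best field counts; with 1.0 the result is the plain sum of all matching fields.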