Common Elasticsearch Operations

moxiaoran5753

Tags: elasticsearch jenkins big data

Original link: https://www.cnblogs.com/slothhh/p/16345176.html

Basic operations

1. _cat

GET /_cat/nodes // list the nodes

GET /_cat/health // check cluster health

GET /_cat/master // show the master node

GET /_cat/indices // list the indices

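2. Indexing a document (save)

PUT /customer/external/1 (sending it again updates the document)

POST /customer/external/1 (sending it again updates the document)

POST /customer/external (the id is generated automatically)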

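Saving a document means stating which index and which type it is stored under, and which unique identifier it uses.

To save document 1 under the external type of the customer index:

PUT customer/external/1
{
"name": "John Doe"
}

Both PUT and POST work.

POST creates a document. Without an id, one is generated automatically. With an id, an existing document with that id is overwritten and its version number is incremented.

PUT can create or update, but it must specify an id and fails without one. Because the id is required, PUT is generally used for updates.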

Response:

{
"_index": "customer",
"_type": "external",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
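"_seq_no": 0, // concurrency-control field, incremented on every write; used for optimistic locking
"_primary_term": 1 // same purpose; changes when the primary shard is reassigned, e.g. after a restart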
}

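Optimistic locking uses the _seq_no and _primary_term values: pass them as if_seq_no and if_primary_term, and the write is applied only if both still match the stored document. A stale value is normally rejected with a 409 version_conflict_engine_exception. The request below was instead rejected with a 400 because the two parameters were not recognized, which suggests that the endpoint or server version handling the request did not support them:

PUT /customer/external/1?if_seq_no=7&if_primary_term=1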

{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "request [/customer/external/1] contains unrecognized parameters: [if_primary_term], [if_seq_no]"
}
],
"type": "illegal_argument_exception",
"reason": "request [/customer/external/1] contains unrecognized parameters: [if_primary_term], [if_seq_no]"
},
"status": 400
}
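To make the compare-and-set idea behind if_seq_no and if_primary_term concrete, here is a minimal in-memory sketch. It is a toy model only, not how Elasticsearch is implemented: `ToyIndex` and `VersionConflict` are hypothetical names, and the single counter stands in for the per-shard sequence number.

```python
class VersionConflict(Exception):
    """Raised when the caller's if_seq_no / if_primary_term are stale."""


class ToyIndex:
    """In-memory stand-in for one shard (hypothetical, for illustration)."""

    def __init__(self):
        self.docs = {}          # doc id -> (source, seq_no, primary_term)
        self.next_seq_no = 0    # one counter shared by all writes
        self.primary_term = 1

    def index(self, doc_id, source, if_seq_no=None, if_primary_term=None):
        current = self.docs.get(doc_id)
        if if_seq_no is not None or if_primary_term is not None:
            # Conditional write: both values must match the stored document.
            stored = None if current is None else (current[1], current[2])
            if stored != (if_seq_no, if_primary_term):
                raise VersionConflict(doc_id)
        seq_no = self.next_seq_no
        self.next_seq_no += 1
        self.docs[doc_id] = (source, seq_no, self.primary_term)
        return {"_id": doc_id, "_seq_no": seq_no,
                "_primary_term": self.primary_term}


idx = ToyIndex()
first = idx.index("1", {"name": "John Doe"})

# Two clients both read _seq_no 0. The first conditional write succeeds ...
idx.index("1", {"name": "Jane"},
          if_seq_no=first["_seq_no"], if_primary_term=1)

# ... and the second, still holding the old _seq_no, is rejected.
try:
    idx.index("1", {"name": "Jim"},
              if_seq_no=first["_seq_no"], if_primary_term=1)
    conflict = False
except VersionConflict:
    conflict = True

print("second write rejected:", conflict)
```

The point of the pattern is that the losing client has to re-read the document and retry, instead of silently overwriting the other client's change.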

3. Retrieving a document

GET /customer/external/1

{
"_index": "customer",
"_type": "external",
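"_id": "1", // document id
"_version": 1, // version number
"_seq_no": 0, // concurrency-control field, incremented on every write; used for optimistic locking
"_primary_term": 1, // same purpose; changes when the primary shard is reassigned, e.g. after a restart
"found": true,
"_source": { // the actual document content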
"name": "John Doe"
}
}

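4. Updating a document

POST /customer/external/1/_update

The fields to change are wrapped in "doc" in the request body. This request compares the submitted fields with the stored document: if nothing would change, no update is performed and the version number is not incremented. A response for identical data looks like this: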

{
"_index": "customer",
"_type": "external",
"_id": "1",
"_version": 2,
"result": "noop", // identical data detected, nothing was done
"_shards": {
"total": 0,
"successful": 0,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1
}

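Without _update, the request always creates or overwrites the document, and the version number increases even if the data is unchanged.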
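The noop check described above can be sketched in a few lines. This is a simplified model under stated assumptions: `partial_update` is a hypothetical helper, and it merges only top-level fields, which is enough to show when an update counts as "noop".

```python
def partial_update(stored, doc):
    """Merge `doc` into `stored` and report whether anything changed.

    Shallow merge only; a hypothetical helper for illustration.
    """
    merged = {**stored, **doc}
    if merged == stored:
        return stored, "noop"       # nothing to write, version stays the same
    return merged, "updated"


stored = {"name": "John Doe"}
stored, first_result = partial_update(stored, {"name": "John Doe"})
stored, second_result = partial_update(stored, {"age": 30})

print(first_result, second_result, stored)
```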

5. Deleting a document

DELETE /customer/external/1 // delete the document

DELETE /customer // delete the index

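6. Bulk API

Each action in a bulk request is independent: if one action fails, the others are still applied and nothing is rolled back.

Tested with Kibana Dev Tools.

Request: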
POST /customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"tang"}
{"index":{"_id":"2"}}
{"name":"yao"}
Response:
#! Deprecation: [types removal] Specifying types in bulk requests is deprecated.
{
"took" : 501,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "customer",
"_type" : "external",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 3,
"_primary_term" : 1,
"status" : 201
}
},
{
"index" : {
"_index" : "customer",
"_type" : "external",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 4,
"_primary_term" : 1,
"status" : 201
}
}
]
}

A more complex bulk request:

POST /_bulk
{"delete":{"_index":"website","_type":"blog","_id":"123"}}
{"create":{"_index":"website","_type":"blog","_id":"123"}}
{"title":"My first blog post"}
{"index":{"_index":"website","_type":"blog"}}
{"title":"My second blog post"}
{"update":{"_index":"website","_type":"blog","_id":"123"}}
{"doc":{"title":"My updated blog post"}}

Official sample data: elasticsearch/accounts.json at v7.4.2 · elastic/elasticsearch (github.com)

IV. Advanced operations

1. Search API (retrieving information)

1.1 Retrieve everything in bank

GET bank/_search

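1.2 Searching with request parameters

// everything after ? is the search condition: q=* matches all documents; sort=account_number:asc sorts by that field in ascending order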
GET bank/_search?q=*&sort=account_number:asc
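Outside Kibana, the same query string has to be URL-encoded. A small sketch with the standard library follows; the host `http://localhost:9200` is an assumption (the usual local default), and no request is actually sent.

```python
from urllib.parse import urlencode, parse_qs

# Same parameters as the console request above.
params = {"q": "*", "sort": "account_number:asc"}
query_string = urlencode(params)

# Assumed local node address; adjust to your own cluster.
url = "http://localhost:9200/bank/_search?" + query_string
print(url)

# Decoding the string again gives back the original parameters.
decoded = parse_qs(query_string)
print(decoded)
```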

1.3 Searching with a URL plus a request body (Query DSL)

GET /bank/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"balance":"desc"
},
{
"account_number": "asc"
}
]
}

2. Query DSL

Searches are expressed as a URL plus a JSON request body.

2.1 Basic syntax

GET /bank/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"balance":"desc"
},
{
"account_number": "asc"
}
],
"from": 0,
"size": 5,
"_source": ["balance","firstname"]
}
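The from and size fields implement paging: from is the offset of the first hit and size is the page length. A minimal sketch of turning a 1-based page number into such a body is shown below; `page_body` is a hypothetical helper and only builds the JSON, it does not run the search.

```python
import json


def page_body(page, page_size, fields):
    """Build a match_all search body for a 1-based page number."""
    return {
        "query": {"match_all": {}},
        "sort": [{"balance": "desc"}, {"account_number": "asc"}],
        "from": (page - 1) * page_size,   # offset of the first hit
        "size": page_size,                # hits per page
        "_source": fields,                # return only these fields
    }


body = page_body(3, 5, ["balance", "firstname"])
print(json.dumps(body, indent=2))
```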

2.2 match query

## match: full-text search; the search text is analyzed into terms and hits are sorted by relevance score
GET /bank/_search
{
"query": {
"match": {
"account_number": "20"
}
}
}

GET /bank/_search
{
"query": {
"match": {
"address": "mill lane"
}
}
}

2.3 match_phrase

## phrase match: the search text is matched as one whole phrase rather than term by term
GET /bank/_search
{
"query":{
"match_phrase":{
"address":"mill lane"
}
}
}
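The difference between match and match_phrase is easier to see on a toy model. The sketch below uses a very rough analyzer (lowercase, split on whitespace) and ignores scoring entirely; the function names and the sample addresses are made up for illustration and this is not how Elasticsearch evaluates queries internally.

```python
def tokens(text):
    """Very rough analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def toy_match(field_value, query):
    """match: true if ANY query term occurs in the field."""
    field_terms = set(tokens(field_value))
    return any(term in field_terms for term in tokens(query))


def toy_match_phrase(field_value, query):
    """match_phrase: all query terms must occur consecutively, in order."""
    field_terms, query_terms = tokens(field_value), tokens(query)
    n = len(query_terms)
    return any(field_terms[i:i + n] == query_terms
               for i in range(len(field_terms) - n + 1))


for address in ["198 Mill Lane", "990 Mill Road", "715 Lane Street"]:
    print(address,
          toy_match(address, "mill lane"),
          toy_match_phrase(address, "mill lane"))
```

In this model, all three addresses satisfy match because each contains "mill" or "lane", while only the first contains the full phrase.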

2.4 multi_match: matching across several fields

GET /bank/_search
{
"query": {
"multi_match": {
"query": "mill",
"fields": ["address","city"]
}
}
}

2.5 bool: compound query

GET /bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"age": "40"
}
}
],
"must_not": [
{
"match": {
"state": "ID"
}
}
],
"should": [
{
"match": {
"lastname": "Ross"
}
}
]
}
}
}
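How must, must_not and should combine can be sketched on plain Python dicts. This is a toy model under loud assumptions: clauses are simple equality checks, and the "score" is just a count of matching clauses rather than a real relevance score. `toy_bool` and the sample documents are hypothetical.

```python
def toy_bool(docs, must, must_not, should):
    """Each clause is a (field, value) pair compared for equality."""
    hits = []
    for doc in docs:
        if not all(doc.get(f) == v for f, v in must):
            continue                      # every must clause is required
        if any(doc.get(f) == v for f, v in must_not):
            continue                      # any must_not clause excludes
        # should clauses never exclude a document, they only raise its score
        score = len(must) + sum(doc.get(f) == v for f, v in should)
        hits.append((score, doc))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [doc for _, doc in hits]


docs = [
    {"age": 40, "state": "ID", "lastname": "Ross"},
    {"age": 40, "state": "CA", "lastname": "Smith"},
    {"age": 40, "state": "TX", "lastname": "Ross"},
    {"age": 30, "state": "TX", "lastname": "Ross"},
]
result = toy_bool(docs,
                  must=[("age", 40)],
                  must_not=[("state", "ID")],
                  should=[("lastname", "Ross")])
print(result)
```

The document from TX ranks first because it also satisfies the should clause; the one from CA still matches, only with a lower score.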

2.6 filter: filtering results

GET /bank/_search
{
"query": {
"bool": {
"must": { "match_all": {} },
"filter": {
"range": {
"balance": {
"gte": 20000,
"lte": 30000
}
}
}
}
}
}

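2.7 term

Like match, term matches the value of a field. Use match for full-text (text) fields and term for non-text fields, because term looks up the exact value without analyzing it.

## term: for non-text fields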
GET bank/_search
{
"query":{
"term":{
"balance":"32838"
}
}
}
GET /_search
{
"query": {
"match_phrase": {
"address": "789 Madison"
}
}
}
## exact match against the whole, unanalyzed field value
GET /_search
{
"query": {
"match": {
"address.keyword": "789 Madison"
}
}
}

2.8 aggregations

GET bank/_search
{
"query": {
"match": {
"address": "mill"
}
},
"aggs": {
"ageaggs": {
"terms": {
"field": "age",
"size": 10 // if there are 100 distinct ages, return only the top 10 buckets
}
}
}
}
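Conceptually, a terms aggregation counts documents per distinct value and keeps the size most frequent buckets. A rough in-memory equivalent is sketched below; `toy_terms_agg` is a hypothetical helper and the ages are made-up sample values.

```python
from collections import Counter


def toy_terms_agg(docs, field, size):
    """Count documents per distinct value and keep the `size` largest buckets."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": count}
            for key, count in counts.most_common(size)]


docs = [{"age": a} for a in (38, 38, 28, 32, 28, 38)]
buckets = toy_terms_agg(docs, "age", size=2)
print(buckets)
```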

More complex examples:

# Aggregate by age and compute the average balance of the people in each age bucket
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"balanceAvg": {
"avg": {
"field": "balance"
}
}
}
}
}
}

# For every age bucket, compute the average balance of gender M, the average balance of gender F, and the overall average balance of the bucket

GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"gender": {
"terms": {
"field": "gender.keyword"
},
"aggs": {
"genderBalance": {
"avg": {
"field": "balance"
}
}
}
},
"ageBalance":{
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
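The nesting in this request (age buckets, then gender buckets inside each age, plus an average over the whole age bucket) corresponds to a two-level group-by. The sketch below reproduces it on plain dicts; `age_gender_balance` and the sample documents are assumptions for illustration, and the output shape is simplified compared with a real aggregation response.

```python
from collections import defaultdict
from statistics import mean


def age_gender_balance(docs):
    """Group by age, then by gender, averaging balance at both levels."""
    by_age = defaultdict(list)
    for doc in docs:
        by_age[doc["age"]].append(doc)

    result = {}
    for age, group in by_age.items():
        by_gender = defaultdict(list)
        for doc in group:
            by_gender[doc["gender"]].append(doc["balance"])
        result[age] = {
            "doc_count": len(group),
            # average over everyone in this age bucket
            "ageBalance": mean([doc["balance"] for doc in group]),
            # average per gender inside this age bucket
            "gender": {g: mean(values) for g, values in by_gender.items()},
        }
    return result


docs = [
    {"age": 30, "gender": "M", "balance": 100},
    {"age": 30, "gender": "F", "balance": 300},
    {"age": 30, "gender": "M", "balance": 200},
    {"age": 40, "gender": "F", "balance": 500},
]
summary = age_gender_balance(docs)
print(summary)
```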

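3. Mapping

3.1 Field types

Reference: Field datatypes | Elasticsearch Guide [7.5] | Elastic

3.2 Mapping

A mapping defines how a document and the fields it contains are stored and indexed.

View the mapping: GET bank/_mapping

Each field has a mapping type. A field of type text is full-text searchable by default, so its value is analyzed into terms. To match the exact value of address, query address.keyword instead.

3.3 Creating a mapping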

PUT /my_index
{
"mappings": {
"properties": {
"age": { "type": "integer" },
"email": { "type": "keyword" },
"name": { "type": "text" }
}
}
}

GET my_index/_mapping

3.4 Adding a new field mapping

PUT /my_index/_mapping
{
"properties": {
"employee-id": {
"type": "keyword",
"index": false // the field is not indexed, so it cannot be searched
}
}
}

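3.5 Updating a mapping

The mapping of an existing field cannot be changed. To change it, create a new index with the desired mapping and migrate the data into it.

3.6 Data migration

Copy the properties of the bank mapping, adjust them, and create a new index with the new mapping: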

PUT /newbank
{
"mappings": {
"properties" : {
"account_number" : {
"type" : "long"
},
"address" : {
"type" : "text"
},
"age" : {
"type" : "integer"
},
"balance" : {
"type" : "long"
},
"city" : {
"type" : "keyword"
},
"email" : {
"type" : "keyword"
},
"employer" : {
"type" : "keyword"
},
"firstname" : {
"type" : "text"
},
"gender" : {
"type" : "keyword"
},
"lastname" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"state" : {
"type" : "keyword"
}
}
}
}
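Creating newbank only defines the mapping; the documents still have to be copied over. One common way to do that is the _reindex API. The sketch below only builds the request body for it: `reindex_body` is a hypothetical helper, and whether an old typed index needs extra source settings depends on your Elasticsearch version, so treat this as a starting point rather than the exact request for this setup.

```python
import json


def reindex_body(source_index, dest_index):
    """Build a minimal _reindex body that copies one index into another."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


body = reindex_body("bank", "newbank")

# Printed in the same console style used throughout this post.
print("POST _reindex")
print(json.dumps(body, indent=2))
```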