[ECE] Mock Exam 1

This post walks through a series of ECE-style index and query tasks: reindexing task2 into new_task2 so that a match query for "the" no longer returns data; rebuilding task3 with an added fieldg whose value is the concatenation of the other fields; aggregating earthquake data by month; snapshots; cross-cluster search; multi_match queries over several fields; and aggregating on runtime fields. It ends with a flight-delay analysis showing how to find, per month, the carrier and destination country with the longest average delay.
  1. There is an index task2 with a field field2; a match query for "the" currently returns many documents. Reindex task2 into a new index named new_task2 so that the same match query for "the" returns no documents.

Text analysis› Token filter reference› stop

DELETE /task2
DELETE /new_task2
PUT task2
{
  "settings": {
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "field2":{
        "type": "text"
      }
    }
  }
}

PUT task2/_doc/1
{
  "field2":"the school"
}

PUT /new_task2
{
  "settings": {
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "stop"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "field2": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}

POST /_reindex
{
  "source": {
    "index": "task2"
  },
  "dest": {
    "index": "new_task2"
  }
}

GET /new_task2/_search
{
  "query": {
    "match": {
      "field2": "the"
    }
  }
}
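
To confirm that the rebuilt index really drops "the" at index time, the custom analyzer can be checked with the _analyze API (a quick sanity check, not part of the task):

GET /new_task2/_analyze
{
  "analyzer": "my_analyzer",
  "text": "the school"
}
// Expected: only the token "school" is returned, because the stop filter removes "the".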
  2. There is an index task3 with fields fielda, fieldb, fieldc, and fielde. Reindex task3 into a new index with an additional field fieldg, whose value is the concatenation of the values of fielda, fieldb, fieldc, and fielde.

Ingest pipelines

Ingest pipelines› Ingest processor reference

DELETE task3
PUT task3
{
  "mappings": {
    "properties": {
      "fielda":{
        "type": "keyword"
      },
      "fieldb":{
        "type": "keyword"
      },
      "fieldc":{
        "type": "keyword"
      },
      "fielde":{
        "type": "keyword"
      }
    }
  }
}

POST task3/_doc/1
{
  "fielda":"aa",
  "fieldb":"bb",
  "fieldc":"cc",
  "fielde":"dd"
}
// _simulate can be used to test a pipeline before it is applied
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "lowercase": {
          "field": "my-keyword-field"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "my-keyword-field": "FOO"
      }
    },
    {
      "_source": {
        "my-keyword-field": "BAR"
      }
    }
  ]
}
PUT task3_new
{
  "mappings": {
    "properties": {
      "fielda":{
        "type": "keyword"
      },
      "fieldb":{
        "type": "keyword"
      },
      "fieldc":{
        "type": "keyword"
      },
      "fielde":{
        "type": "keyword"
      },
      "fieldg":{
        "type": "keyword"
      }
    }
  }
}

PUT _ingest/pipeline/my_exam1_pipeline
{
  "processors": [
    {
      "script": {
        "source": "ctx.fieldg = ctx.fielda + ctx.fieldb + ctx.fieldc + ctx.fielde"
      }
    }
  ]
}
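
Before reindexing, the stored pipeline can also be tested directly with the per-pipeline _simulate endpoint, reusing the sample document above (a quick check, not required by the task). An alternative to the script processor would be a set processor whose value is the Mustache template "{{fielda}}{{fieldb}}{{fieldc}}{{fielde}}".

POST _ingest/pipeline/my_exam1_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "fielda": "aa",
        "fieldb": "bb",
        "fieldc": "cc",
        "fielde": "dd"
      }
    }
  ]
}
// The response should show fieldg with the value "aabbccdd".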

POST /_reindex
{
  "source": {
    "index": "task3"
  },
  "dest": {
    "index": "task3_new",
    "pipeline": "my_exam1_pipeline"
  }
}

GET task3_new/_search
  3. Earthquake index: keep only data from 2012 (the exam's date format was dd/MM/yyyyTHH:mm:ss; the mock data below uses yyyy-MM-dd HH:mm:ss), bucket it by month, and compute the maximum magnitude and maximum depth within each bucket.

Aggregations› Bucket aggregations

DELETE /earthquakes2
PUT earthquakes2
{
  "settings": {
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "timestamp":{
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "magnitude":{
        "type": "float"
      },
	  "type":{
	    "type":"integer"
	  },
	  "depth":{
	    "type":"float"
	  }
    }
  }
}

POST earthquakes2/_bulk
{"index":{"_id":1}}
{"timestamp":"2012-01-01 12:12:12", "magnitude":4.56, "type":1, "depth":10}
{"index":{"_id":2}}
{"timestamp":"2012-01-01 15:12:12", "magnitude":6.46, "type":2, "depth":11}
{"index":{"_id":3}}
{"timestamp":"2012-02-02 13:12:12", "magnitude":4, "type":2, "depth":5}
{"index":{"_id":4}}
{"timestamp":"2012-03-02 13:12:12", "magnitude":6, "type":3, "depth":8}
{"index":{"_id":5}}
{"timestamp":"1967-03-02 13:12:12", "magnitude":6, "type":2, "depth":6}

POST /earthquakes2/_search
{
  "size": 0,
  "aggs": {
    "my_filter": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2012-01-01 00:00:00",
            "lte": "2013-01-01 00:00:00"
          }
        }
      },
      "aggs": {
        "bucket_month": {
          "date_histogram": {
            "field": "timestamp",
            "calendar_interval": "month"
          },
          "aggs": {
            "max_magnitude": {
              "max": {
                "field": "magnitude"
              }
            },
            "max_depth":{
              "max": {
                "field": "depth"
              }
            }
          }
        }
      }
    }
  }
}
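
An equivalent approach is to filter the 2012 documents in the query section and keep only the date_histogram under aggs (a sketch using the same index and mapping as above):

POST /earthquakes2/_search
{
  "size": 0,
  "query": {
    "range": {
      "timestamp": {
        "gte": "2012-01-01 00:00:00",
        "lt": "2013-01-01 00:00:00"
      }
    }
  },
  "aggs": {
    "bucket_month": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "month"
      },
      "aggs": {
        "max_magnitude": {
          "max": {
            "field": "magnitude"
          }
        },
        "max_depth": {
          "max": {
            "field": "depth"
          }
        }
      }
    }
  }
}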
  4. Register a snapshot repository, then create a snapshot of a specified index in that repository.

Snapshot and restore› Register a snapshot repository

PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/usr/share/elasticsearch/snapshot",
    "max_snapshot_bytes_per_sec": "20mb",
    "max_restore_bytes_per_sec": "20mb"
  }
}

// Snapshot the index
PUT /_snapshot/my_backup/snapshot_test_2023-01-03
{
  "indices": ["test"],
  "ignore_unavailable": true,
  "include_global_state": false
}
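
Note that for an fs repository the location must sit under a path listed in path.repo in elasticsearch.yml, otherwise registering the repository fails. Appending ?wait_for_completion=true to the snapshot PUT makes the call block until the snapshot finishes. The repository and snapshot can then be verified (assuming the names used above):

GET /_snapshot/my_backup
GET /_snapshot/my_backup/snapshot_test_2023-01-03
GET /_snapshot/my_backup/snapshot_test_2023-01-03/_status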

  5. Cross-cluster search.
    1) Configure the remote cluster settings.
    2) Query the remote data as clustername:indexname.

Set up Elasticsearch› Remote clusters

PUT /_cluster/settings
{
  "persistent": {
    "cluster": {
      "remote":{
        "cluster_one":{
          "seeds":[
            "192.168.0.11:9300"
            ]
        }
      }
    }
  }
}

POST /cluster_one:employees/_search
{
  "query": {
    "match_all": {}
  }
}
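
The connection can be verified with the remote info API, and local and remote indices can be searched in one request (a sketch assuming a local employees index also exists):

GET /_remote/info

GET /employees,cluster_one:employees/_search
{
  "query": {
    "match_all": {}
  }
}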
  6. multi_match query. Search for "fire" in fields a, b, c, and d, with a boost of 2 on field d; the final score should be the sum of the scores of all matching fields.

Query DSL› Full text queries

POST task5/_bulk
{"index":{"_id":1}}
{"a":"fire", "b":"fired", "c":"fox", "d":"box"}
POST task5/_search
{
  "query": {
    "multi_match": {
      "query": "fire",
      "type": "most_fields", 
      "fields": ["a","b","c","d^2"]
    }
  }
}
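
The most_fields type is what makes the final score the sum of the per-field scores; the default best_fields type would instead take the score of the single best-matching field. To inspect the boosted contribution of field d, the same query can be re-run with explain enabled (a verification step, not part of the task):

POST task5/_search
{
  "explain": true,
  "query": {
    "multi_match": {
      "query": "fire",
      "type": "most_fields",
      "fields": ["a","b","c","d^2"]
    }
  }
}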
  7. Runtime fields: aggregate on a runtime field, following the documentation.

    In the index task6, create a runtime field whose value is A - B, where A and B are existing fields. Then create a range aggregation with three buckets: less than 0, 0 to 100, and 100 and above, returning 0 documents (size: 0).

Mapping› Runtime fields

PUT task6/_bulk
{"index":{"_id":1}}
{"A":100, "B":2}
{"index":{"_id":2}}
{"A":120, "B":2}
{"index":{"_id":3}}
{"A":120, "B":25}
{"index":{"_id":4}}
{"A":21, "B":25}

PUT task6/_mapping
{
  "runtime":{
    "C":{
      "type":"long",
      "script":{
        "source":"emit(doc['A'].value - doc['B'].value)"
      }
    }
  }
}


POST task6/_search
{
  "size": 0,
  "aggs": {
    "range_c": {
      "range": {
        "field": "C",
        "ranges": [
          {
            "to":0
          },
          {
            "from": 0,
            "to": 100
          },
          {
            "from": 100
          }
        ]
      }
    }
  }
}
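
The runtime field can also be defined only for the duration of a single search, without changing the mapping, by using runtime_mappings in the request body (an equivalent sketch):

POST task6/_search
{
  "size": 0,
  "runtime_mappings": {
    "C": {
      "type": "long",
      "script": {
        "source": "emit(doc['A'].value - doc['B'].value)"
      }
    }
  },
  "aggs": {
    "range_c": {
      "range": {
        "field": "C",
        "ranges": [
          { "to": 0 },
          { "from": 0, "to": 100 },
          { "from": 100 }
        ]
      }
    }
  }
}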
  8. A mix of search templates, querying, highlighting, and sorting. Create a search template that satisfies the following:
    For field A, search using the parameter search_string.
    In the results, highlight the content of field A, wrapping the matches in pre/post tags (the original tags were lost when the post was rendered; the template below uses <em> and </em>).
    Sort the results by field B.
    Search the test index with search_string set to "test".

Search your data

DELETE test_search_temp
PUT test_search_temp
{
  "settings":{
    "number_of_replicas":0,
    "number_of_shards":1
  },
  "mappings":{
    "properties": {
      "A":{
        "type":"text"
      },
      "B":{
        "type":"integer"
      }
    }
  }
}

POST test_search_temp/_bulk
{"index":{"_id":1}}
{"A":"I love test", "B":1}
{"index":{"_id":2}}
{"A":"I hate test", "B":2}


PUT _scripts/my-search-template
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "A": "{{search_string}}"
        }
      },
      "highlight": {
        "fields": {
          "A": {
            "pre_tags": [
              "<em>"
            ],
            "post_tags": [
              "</em>"
            ]
          }
        }
      },
      "sort": [
        {
          "B": {
            "order": "desc"
          }
        }
      ]
    }
  }
}

GET _scripts/my-search-template
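
Before executing the template, it can be previewed with the render API to check that the parameter is substituted as expected:

POST _render/template
{
  "id": "my-search-template",
  "params": {
    "search_string": "test"
  }
}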


GET test_search_temp/_search/template
{
  "id": "my-search-template",
  "params": {
    "search_string": "test"
  }
}

  9. Given monthly flight data, find the name of the carrier with the longest average delay in each month.
    ● By destination country, compute the average ticket price and find the country with the highest average. (The exam data and exact wording are not available, so a similar exercise is done against the Kibana sample flight data.)
    ● Find the destination country with the longest average delay in each month.
    ● Approach: first bucket by month (timestamp), then bucket by destination country (DestCountry), then compute the average delay per destination country; finally, because max_bucket is a sibling pipeline aggregation, place it next to the bucket_DestCountry buckets so that, within each month, it picks out the country with the longest average delay (see the sketch after the query below).
    ● Result: each month has one destination country with the maximum average delay.

GET /kibana_sample_data_flights/_search
{
  "size": 0,
  "aggs": {
    "bucket_DestCountry": {
      "terms": {
        "field": "DestCountry",
        "size": 10
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "AvgTicketPrice"
          }
        }
      }
    },
    "max_price_country":{
      "max_bucket": {
        "buckets_path": "bucket_DestCountry>avg_price"
      }
    }
  }
}
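
The per-month version described in the approach above follows the same pattern, with max_bucket placed inside each month bucket as a sibling of the country buckets. A sketch against the Kibana sample flight data, assuming the FlightDelayMin field holds the delay in minutes (swap DestCountry for Carrier to answer the original carrier question):

GET /kibana_sample_data_flights/_search
{
  "size": 0,
  "aggs": {
    "bucket_month": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "month"
      },
      "aggs": {
        "bucket_DestCountry": {
          "terms": {
            "field": "DestCountry",
            "size": 10
          },
          "aggs": {
            "avg_delay": {
              "avg": {
                "field": "FlightDelayMin"
              }
            }
          }
        },
        "max_delay_country": {
          "max_bucket": {
            "buckets_path": "bucket_DestCountry>avg_delay"
          }
        }
      }
    }
  }
}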
