Common ELK API Endpoints, CRUD Operations, and ELK Data Migration

@王先生1

Last modified: 2024-07-03 17:21:35

Category: elk. Tags: elk, elasticsearch, big data

First published: 2023-02-26 21:39:18

Article link: https://blog.csdn.net/qq_44637753/article/details/129232018

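I. Elasticsearch API endpoints

1. Cluster API: retrieves, modifies, and manages cluster-wide information such as nodes, indices, mappings, and settings.

_cluster/health: get the cluster health status.
_cluster/stats: get cluster statistics.
_cluster/settings: get or modify cluster settings.
_cluster/pending_tasks: get the cluster-level tasks that are waiting to be executed.

Index API: manages index creation, deletion, mappings, search, and updates.

_cat/indices: list all indices.
_cat/indices?v: list all indices with a header row naming each column.
_cat/indices/{index}: get summary information for a specific index.
_cat/count/{index}: get the document count of a specific index.
_cat/shards/{index}: get shard information for a specific index.
_bulk: perform multiple index, create, update, or delete operations in a single request.

2. Document API: creates, updates, deletes, and searches documents.

index: create or update a document.
get: get a document.
delete: delete a document.
search: search documents.

Search API: searches and aggregates documents.

_search: basic search.
_search/scroll: scroll search (fetch the next batch of results for a scroll ID).
_count: count the matching documents.
_msearch: multi-search (several searches in one request).
_search/template: search using a search template.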

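II. Logstash API endpoints

1. Node API: monitors a Logstash node (served on port 9600 by default).

_node/stats: get node statistics.
_node/pipelines: get the pipelines running on the node.
_node/stats/{type}: get one category of node statistics, for example jvm, process, events, or pipelines.
_node/hot_threads: get the hot threads of the node.

2. Pipeline API: inspects and manages pipelines.

_node/pipelines/{pipeline_id}: get the configuration information of the specified pipeline.
_ingest/pipeline/{pipeline_id}/_simulate: simulate a pipeline run. This is an Elasticsearch ingest pipeline endpoint rather than a Logstash one.
_pipeline/pipeline_id/_clear: clear the persistent queue of the specified pipeline. This path is listed as in the original post and is not part of the Logstash node API above, so verify it against your version before relying on it.
_pipeline/pipeline_id/_reorder: reorder the plugins of the specified pipeline. The same caveat applies.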

III. Kibana API endpoints

1. Index API: manages index patterns in Kibana.

GET /api/saved_objects/_find?type=index-pattern: get all index patterns.
POST /api/saved_objects/index-pattern: create an index pattern.
PUT /api/saved_objects/index-pattern/{id}: update an index pattern.
DELETE /api/saved_objects/index-pattern/{id}: delete an index pattern.

2. Visualization API: manages visualization objects in Kibana.

GET /api/saved_objects/_find?type=visualization: get all visualization objects.
POST /api/saved_objects/visualization: create a visualization object.
PUT /api/saved_objects/visualization/{id}: update a visualization object.
DELETE /api/saved_objects/visualization/{id}: delete a visualization object.

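IV. CRUD operations in ELK

Elasticsearch documents can be indexed, queried, updated, and deleted over HTTP with curl:

1. Index operations:

PUT /index_name/document_type/document_id: create a new document in the specified index, or replace the document that already has that ID.
Parameters:

index_name: index name, required.
document_type: document type, optional; use a type defined in _mapping. Mapping types are deprecated since Elasticsearch 7.0 and removed in 8.0, where _doc takes this position in the path.
document_id: document ID, required with PUT. To have Elasticsearch generate a unique ID, send POST /index_name/document_type without an ID.

Example:

curl -XPUT 'http://localhost:9200/my_index/my_type/1' -H 'Content-Type: application/json' -d '{ "title": "Elasticsearch is Awesome", "tags": ["elasticsearch", "bigdata"] }'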

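2. Search operations:

GET /index_name/_search: search all documents in the index.
GET /index_name/document_type/_search: search documents of the specified type.
POST /index_name/document_type/_search: search documents of the specified type, with the query in the request body.

Parameters:

index_name: index name, required.
document_type: document type, optional; use a type defined in _mapping.
q: query string, optional; supports wildcards, boolean operators, and so on.

Example: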

curl -XGET 'http://localhost:9200/my_index/_search?q=title:elasticsearch'

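3. Delete operations:

DELETE /index_name/document_type/document_id: delete the specified document.
DELETE /index_name/document_type: delete all documents of the specified type. This form only works on Elasticsearch 1.x; Elasticsearch 5.0 and later provide POST /index_name/_delete_by_query for deleting documents that match a query.
DELETE /index_name: delete the specified index and all documents in it.

Parameters:

index_name: index name, required.
document_type: document type, optional; use a type defined in _mapping.
document_id: document ID, required when deleting a single document.

Example: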

curl -XDELETE 'http://localhost:9200/my_index/my_type/1'

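4. Update operations:

POST /index_name/document_type/document_id/_update: update the specified document.

Parameters:

index_name: index name, required.
document_type: document type, optional; use a type defined in _mapping.
document_id: document ID, required.
script: script used to update the document; required unless a partial document is sent in a doc field instead.
params: parameters referenced by the script, optional.

Example:

curl -XPOST 'http://localhost:9200/my_index/my_type/1/_update' -H 'Content-Type: application/json' -d '{ "script": { "source": "ctx._source.title = params.title", "params": {"title": "Elasticsearch is Awesome"} } }'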
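
On Elasticsearch 7.0 and later the document type segment is no longer part of these paths. As a minimal sketch, assuming the typeless `_doc` and `_update` endpoints, the four operations above map to the requests below. The helper name `typeless_crud_requests` is hypothetical, and the function only builds the requests; sending them is left to curl or an HTTP client.

```python
import json


def typeless_crud_requests(index, doc_id, document, changes):
    """Build (method, path, body) tuples for typeless Elasticsearch 7+ CRUD calls.

    Hypothetical helper for illustration; it does not contact a cluster.
    """
    return [
        # Create the document, or replace the one that already has this ID.
        ("PUT", f"/{index}/_doc/{doc_id}", json.dumps(document)),
        # Read the document back.
        ("GET", f"/{index}/_doc/{doc_id}", None),
        # Partial update: merge `changes` into the stored document.
        ("POST", f"/{index}/_update/{doc_id}", json.dumps({"doc": changes})),
        # Delete the document.
        ("DELETE", f"/{index}/_doc/{doc_id}", None),
    ]


if __name__ == "__main__":
    calls = typeless_crud_requests(
        "my_index",
        1,
        {"title": "Elasticsearch is Awesome", "tags": ["elasticsearch", "bigdata"]},
        {"title": "Elasticsearch is still Awesome"},
    )
    for method, path, body in calls:
        print(method, path, body or "")
```

Each printed line corresponds to one curl call of the form `curl -X<method> 'http://localhost:9200<path>' -H 'Content-Type: application/json' -d '<body>'`.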

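V. elasticdump examples

## Install elasticdump (Node.js must be installed first)

root@bdp-1:~# npm install elasticdump -g
root@bdp-1:~# cd /usr/local/nodejs/lib/node_modules/elasticdump/bin

Note: for a wholesale migration where the version does not change, the entire file directory can be copied directly.
## Copy an index from production to staging together with its mapping (load backed-up data into another ES cluster)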


elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=mapping

elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=data

## Back up index data to a file:

# Back up the mapping:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index_mapping.json \
  --type=mapping
# Restore the mapping:
elasticdump \
  --input=/data/my_index_mapping.json \
  --output=http://127.0.0.1:9200/my_index \
  --type=mapping
# Back up the data:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index.json \
  --type=data
# Restore the data:
elasticdump \
  --input=/data/my_index.json \
  --output=http://127.0.0.1:9200/my_index \
  --type=data

## Back up an index to a gzip file via stdout:

elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=$ \
  | gzip > /data/my_index.json.gz

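## Back up the results of a query to a file
## (the term query on username is only an illustrative filter; substitute your own)

elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=query.json \
  --searchBody='{"query":{"term":{"username":"admin"}}}'

# Back up the ES indices and all of their types into the es_backup folder
multielasticdump --direction=dump --match='^.*$' --input=http://127.0.0.1:9200 --output=/tmp/es_backup

# Back up only the ES indices whose names end with "-index" (matched by regex). Only the index data is backed up; all other types are ignored.
# Note: the analyzer and alias types are ignored by default.
multielasticdump --direction=dump --match='^.*-index$' --input=http://127.0.0.1:9200 --ignoreType='mapping,settings,template'  --output=/tmp/es_backup
# Restore multiple indices with multielasticdump:
multielasticdump --direction=load --input=/tmp/es_backup --output=http://127.0.0.1:9200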
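
The single-index copy earlier in this section runs the mapping pass before the data pass; loading data first would let Elasticsearch create the target index with dynamically guessed mappings. A minimal sketch of scripting the two passes follows, assuming `elasticdump` is on the PATH. The helper name `elasticdump_passes` is hypothetical, and only the `--input`, `--output`, and `--type` flags already shown above are used.

```python
import shlex


def elasticdump_passes(source, target, types=("mapping", "data")):
    """Return one elasticdump argument list per pass, in the given order."""
    return [
        ["elasticdump", f"--input={source}", f"--output={target}", f"--type={t}"]
        for t in types
    ]


if __name__ == "__main__":
    passes = elasticdump_passes(
        "http://production.es.com:9200/my_index",
        "http://staging.es.com:9200/my_index",
    )
    for args in passes:
        # Print the command; hand `args` to subprocess.run(args, check=True) to execute it.
        print(shlex.join(args))
```

Building the command as an argument list avoids shell quoting problems when an index name or URL contains special characters.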

VI. Raising the Elasticsearch maximum shard limit

Error

Logstash reports the following error:
logstash.outputs.elasticsearch][main] Could not index
event to Elasticsearch. {:status=>400, :action=>["index",********this
action would add [2] total shards, but this cluster currently has
[999]/[1000] maximum shards open;"}}}}
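
The two bracketed numbers are the shards currently open and the cluster-wide limit. Elasticsearch derives that limit as `cluster.max_shards_per_node` (default 1000) multiplied by the number of data nodes, so `[999]/[1000]` is consistent with a single data node at the default setting. A small sketch of the same arithmetic, with a hypothetical function name and the node count as an assumption:

```python
def shard_headroom(open_shards, new_shards, max_shards_per_node=1000, data_nodes=1):
    """Return (limit, remaining, fits) for a request that would add `new_shards`."""
    limit = max_shards_per_node * data_nodes
    remaining = limit - open_shards
    return limit, remaining, new_shards <= remaining


if __name__ == "__main__":
    # Values taken from the error message: 999 shards open, 2 more requested.
    limit, remaining, fits = shard_headroom(open_shards=999, new_shards=2)
    print(f"limit={limit} remaining={remaining} fits={fits}")
```

With the values from the message, only one shard of headroom is left, so an index that needs two shards (for example one primary and one replica) is rejected.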

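Ways to address it
1. Raise the maximum shard limit in Elasticsearch and lift the system resource limits.
2. Limit the maximum batch size that Logstash sends.
3. Enable data compression in ELK.
4. Optimize the index strategy used for Kibana queries.


Temporary (transient setting, cleared by a full cluster restart):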
PUT /_cluster/settings
{
  "transient": {
    "cluster": {
      "max_shards_per_node":900000
    }
  }
}

Permanent (persistent setting):
PUT /_cluster/settings
{
  "persistent": {
    "cluster": {
      "max_shards_per_node":900000
    }
  }
}
Permanent, in elasticsearch.yml:

cluster.max_shards_per_node: 900000

Adjust the system limits

/etc/security/limits.conf

root soft nofile 65535
root hard nofile 65535
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
* soft memlock unlimited
* hard memlock unlimited



Verify the settings
GET /_cluster/settings?pretty
 
{
  "persistent" : {
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  },
  "transient" : {
    "cluster" : {
      "max_shards_per_node" : "900000"
    }
  }
}

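Limiting memory for Elasticsearch and Logstash

Fix: 1. Add the following to elasticsearch.yml:

indices.breaker.total.use_real_memory: false
indices.breaker.total.limit: 70%

2. Edit the Elasticsearch jvm.options (the CMS flags below apply to JDK 8 through 13; CMS was removed in JDK 14):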

-Xms10g
-Xmx10g
## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly

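Circuit breakers

indices.breaker.fielddata.limit
The fielddata circuit breaker caps the size of fielddata at a fraction of the heap. The default cited here is 60%, which matches older releases; recent releases such as 7.x and 8.x default to 40%.

indices.breaker.request.limit
The request circuit breaker estimates the size of the structures needed to complete other parts of a request, such as creating aggregation buckets. The default cited here is 40% of the heap, which also matches older releases; recent releases default to 60%.

indices.breaker.total.limit
The total (parent) circuit breaker wraps the request and fielddata breakers so that their combined usage does not exceed 70% of the heap.

indices.fielddata.cache.size
The fielddata cache size. It has no default, so the cache is unbounded unless it is set. Once it is set, the least recently used (LRU) fielddata is evicted to make room for new data.

The first three can be set dynamically; the last one must be changed in the configuration file.
For example: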

PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "60%"
  }
} 


PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.request.limit": "40%"
  }
} 


PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.total.limit": "70%"
  }
} 

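The last one must be configured in elasticsearch.yml:

# With this setting, the least recently used (LRU) fielddata is evicted to make room for new data
indices.fielddata.cache.size: 40%

The values should satisfy these relationships:

current fielddata cache size < indices.fielddata.cache.size
current fielddata cache size + fielddata loaded by the next query < indices.breaker.fielddata.limit
memory tracked by the request breaker + memory tracked by the fielddata breaker < indices.breaker.total.limit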
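
The three relationships can be checked numerically for a given heap size. The sketch below is only an illustration: the function names and the sample byte counts are hypothetical, the limits are the percentages used in this section, and the third check is applied to the memory currently tracked by the two breakers.

```python
GIB = 1024 ** 3


def fraction(setting):
    """Convert a percentage setting such as "40%" into a fraction (0.4)."""
    return float(setting.rstrip("%")) / 100.0


def check_breakers(heap, cache_size, fielddata_limit, total_limit,
                   fielddata_used, request_used, next_load):
    """Evaluate the three relationships; sizes are bytes, limits are "NN%" strings."""
    return {
        # Cached fielddata stays under the cache size, so LRU eviction can keep up.
        "cache_ok": fielddata_used < fraction(cache_size) * heap,
        # The next query's fielddata still fits under the fielddata breaker.
        "fielddata_ok": fielddata_used + next_load < fraction(fielddata_limit) * heap,
        # Combined request and fielddata usage stays under the total breaker.
        "total_ok": request_used + fielddata_used < fraction(total_limit) * heap,
    }


if __name__ == "__main__":
    # Assumed sample: 10 GiB heap (as in -Xmx10g), 3 GiB of fielddata cached,
    # 2 GiB tracked by the request breaker, next query loading 1 GiB of fielddata.
    print(check_breakers(10 * GIB, "40%", "60%", "70%", 3 * GIB, 2 * GIB, 1 * GIB))
```

If the cache size were set above the fielddata breaker limit, the breaker would trip before any eviction could happen, which is why the cache size is kept below that limit.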

VII. Migration with esm

1. Features

• https://github.com/medcl/esm

• Features
– No dependencies
– Cross-version migration supported
– Overwrite index name
– Copy index settings and mapping
– Support HTTP basic auth
– Support dumping an index to a local file
– Support loading an index from a local file
– Support HTTP proxy
– Support sliced scroll (Elasticsearch 5.0+)
– Support running in the background
– Generate testing data by randomizing the source document ID
– Support renaming field names
– Support unifying the document type name
– Support specifying which _source fields to return from the source
– Support specifying a query string query to filter the data source
– Support renaming source fields while bulk indexing
– Support incremental update (add/update/delete changed records) with --sync. Note: it uses a different implementation that handles only the changed records, and it is not as fast as the old way
– Load generation

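2. Command examples

Before running esm
Before running esm, manually prepare the target index with its mappings and optimized settings to improve speed, for example: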

PUT your-new-index
{
  "settings": {
    "index.translog.durability": "async", 
    "refresh_interval": "-1", 
    "number_of_shards": 10,
    "number_of_replicas": 0
  }
}
Examples:
Copy index index_name from 192.168.1.x to 192.168.1.y:9200

./bin/esm  -s http://192.168.1.x:9200   -d http://192.168.1.y:9200 -x index_name  -w=5 -b=10 -c 10000

Copy index src_index from 192.168.1.x to 192.168.1.y:9200 and save it as dest_index

./bin/esm -s http://localhost:9200 -d http://localhost:9200 -x src_index -y dest_index -w=5 -b=100

Use the sync feature to incrementally update index src_index from 192.168.1.x to 192.168.1.y:9200

./bin/esm --sync -s http://localhost:9200 -d http://localhost:9200 -x src_index -y dest_index

Basic-Auth support

./bin/esm -s http://localhost:9200 -x "src_index" -y "dest_index"  -d http://localhost:9201 -n admin:111111

Copy settings and override the number of shards

./bin/esm -s http://localhost:9200 -x "src_index" -y "dest_index"  -d http://localhost:9201 -m admin:111111 -c 10000 --shards=50  --copy_settings

Copy settings and mappings, recreate the target index, add a query to the source fetch, and refresh after the migration

./bin/esm -s http://localhost:9200 -x "src_index" -q=query:phone -y "dest_index"  -d http://localhost:9201  -c 10000 --shards=5  --copy_settings --copy_mappings --force  --refresh

Dump Elasticsearch documents to a local file

./bin/esm -s http://localhost:9200 -x "src_index"  -m admin:111111 -c 5000 -q=query:mixer  --refresh -o=dump.bin

Dump the source and target indices to local files and compare them to find differences quickly

./bin/esm --sort=_id -s http://localhost:9200 -x "src_index" --truncate_output --skip=_index -o=src.json
./bin/esm --sort=_id -s http://localhost:9200 -x "dst_index" --truncate_output --skip=_index -o=dst.json
diff -W 200 -ry --suppress-common-lines src.json dst.json

Load data from a dump file and bulk insert it into another ES instance

./bin/esm -d http://localhost:9200 -y "dest_index"   -n admin:111111 -c 5000 -b 5 --refresh -i=dump.bin

Proxy support

./bin/esm -d http://123345.ap-northeast-1.aws.found.io:9200 -y "dest_index"   -n admin:111111  -c 5000 -b 1 --refresh  -i dump.bin  --dest_proxy=http://127.0.0.1:9743

Use sliced scroll (available in Elasticsearch v5.0 and later) to speed up scrolling, and update the number of shards

./bin/esm -s=http://192.168.3.206:9200 -d=http://localhost:9200 -n=elastic:changeme -f --copy_settings --copy_mappings -x=bestbuykaggle  --sliced_scroll_size=5 --shards=50 --refresh
Migrate 5.x to 6.x and unify all types to doc

./esm -s http://source_es:9200 -x "source_index*"  -u "doc" -w 10 -b 10 -t "10m" -d https://target_es:9200 -m elastic:passwd -n elastic:passwd -c 5000

To migrate to version 7.x, you may need to rename _type to _doc

./esm -s http://localhost:9201 -x "source" -y "target"  -d https://localhost:9200 --rename="_type:type,age:myage"  -u "_doc"

Filter the migration with a range query

./esm -s https://192.168.3.98:9200 -m elastic:password -o json.out -x kibana_sample_data_ecommerce -q "order_date:[2020-02-01T21:59:02+00:00 TO 2020-03-01T21:59:02+00:00]"

Range query on a keyword field, with escaping

./esm -s https://192.168.3.98:9200 -m test:123 -o 1.txt -x test1  -q "@timestamp.keyword:[\"2021-01-17 03:41:20\" TO \"2021-03-17 03:41:20\"]"

Generate test data: if input.json contains 10 documents, the following command ingests 100 documents, which is useful for testing

./bin/esm -i input.json -d  http://localhost:9201 -y target-index1  --regenerate_id  --repeat_times=10

Select source fields

./bin/esm -s http://localhost:9201 -x my_index -o dump.json --fields=author,title

Rename fields while bulk indexing

./bin/esm -i dump.json -d  http://localhost:9201 -y target-index41  --rename=title:newtitle

Use buffer_count to control the memory used by esm, and use gzip to compress network traffic

./esm -s https://localhost:8000 -d https://localhost:8000 -x logs1kw -y logs122 -m elastic:medcl123 -n elastic:medcl123 --regenerate_id -w 20 --sliced_scroll_size=60 -b 5 --buffer_count=1000000 --compress false

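3. Help and option reference

Download
https://github.com/medcl/esm/releases

Compile:
If the released binaries do not fit your environment, you can try compiling esm yourself. Go is required.

make build

Go version >= 1.7
Options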
Usage:
  esm [OPTIONS]
Application Options:
  -s, --source=                    source elasticsearch instance, ie: http://localhost:9200
  -q, --query=                     query against source elasticsearch instance, filter data before migrate, ie: name:medcl
      --sort=                      sort field when scroll, ie: _id (default: _id)
  -d, --dest=                      destination elasticsearch instance, ie: http://localhost:9201
  -m, --source_auth=               basic auth of source elasticsearch instance, ie: user:pass
  -n, --dest_auth=                 basic auth of target elasticsearch instance, ie: user:pass
  -c, --count=                     number of documents at a time: ie "size" in the scroll request (10000)
      --buffer_count=              number of buffered documents in memory (100000)
  -w, --workers=                   concurrency number for bulk workers (1)
  -b, --bulk_size=                 bulk size in MB (5)
  -t, --time=                      scroll time (1m)
      --sliced_scroll_size=        size of sliced scroll, to make it work, the size should be > 1 (1)
  -f, --force                      delete destination index before copying
  -a, --all                        copy indexes starting with . and _
      --copy_settings              copy index settings from source
      --copy_mappings              copy index mappings from source
      --shards=                    set a number of shards on newly created indexes
  -x, --src_indexes=               indexes name to copy,support regex and comma separated list (_all)
  -y, --dest_index=                indexes name to save, allow only one indexname, original indexname will be used if not specified
  -u, --type_override=             override type name
      --green                      wait for both hosts cluster status to be green before dump. otherwise yellow is okay
  -v, --log=                       setting log level,options:trace,debug,info
  -i, --input_file=                indexing from local dump file
      --input_file_type=           the data type of input file, options: dump, json_line, json_array, log_line (dump)
      --source_proxy=              set proxy to source http connections, ie: http://127.0.0.1:8080
      --dest_proxy=                set proxy to target http connections, ie: http://127.0.0.1:8080
      --refresh                    refresh after migration finished
      --sync=                      sync will use scroll for both source and target index, compare the data and sync(index/update/delete)
      --fields=                    filter source fields(white list), comma separated, ie: col1,col2,col3,...
      --skip=                      skip source fields(black list), comma separated, ie: col1,col2,col3,...
      --rename=                    rename source fields, comma separated, ie: _type:type, name:myname
  -l, --logstash_endpoint=         target logstash tcp endpoint, ie: 127.0.0.1:5055
      --secured_logstash_endpoint  target logstash tcp endpoint was secured by TLS
      --repeat_times=              repeat the data from source N times to dest output, use align with parameter regenerate_id to amplify the data size
  -r, --regenerate_id              regenerate id for documents, this will override the exist document id in data source
      --compress                   use gzip to compress traffic
  -p, --sleep=                     sleep N seconds after finished a bulk request (-1)


Help Options:
  -h, --help                       Show this help message

FAQ

Scroll ID too long: update elasticsearch.yml on the source cluster.
http.max_header_size: 16k
http.max_initial_line_length: 8k

4. Hands-on practice

1. Changing a user password

## When the current password is known
curl -XPOST -u admin:1111111 "127.0.0.1:9200/_security/user/elastic/_password" -H 'Content-Type: application/json' -d'{"password" : "elastic123456"}'


curl -XPOST -u admin "127.0.0.1:9200/_security/user/elastic/_password" -H 'Content-Type: application/json' -d'{"password" : "elastic123456"}'

## From the command line
elasticsearch-users passwd admin

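2. Preparation before the esm migration

Before running esm, manually prepare the target index with its mappings and optimized settings to improve speed.
## The first attempt at this step failed with an error and had no effect; the JSON body has to reach curl as one single-quoted argument, as below.

## Command line; admin:111111 is the user name and password
curl -H 'Content-Type: application/json' -XPUT http://192.168.220.88:9200/union_merchant_base_info -u admin:111111 -d '{
  "settings": {
    "index.translog.durability": "async",
    "index.refresh_interval": "-1",
    "index.number_of_shards": 10,
    "index.number_of_replicas": 0
  }
}'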


##kibana

PUT union_merchant_base_info
{
  "settings": {
    "index.translog.durability": "async", 
    "refresh_interval": "-1", 
    "number_of_shards": 10,
    "number_of_replicas": 0
  }
}
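
These settings trade durability and searchability for write speed, so they are usually reverted once the migration has finished. The sketch below builds both request bodies; the function names are hypothetical, and the restored values (`request` durability, a `1s` refresh interval, one replica) are assumptions based on the Elasticsearch defaults rather than anything tested in this post.

```python
import json


def bulk_load_settings(shards=10):
    """Body for PUT /<index>: the write-optimized settings used before running esm."""
    return {
        "settings": {
            "index.translog.durability": "async",
            "index.refresh_interval": "-1",
            "index.number_of_shards": shards,
            "index.number_of_replicas": 0,
        }
    }


def restore_settings(replicas=1, refresh_interval="1s"):
    """Body for PUT /<index>/_settings after the migration (assumed default values).

    number_of_shards is omitted because it cannot be changed on an existing index.
    """
    return {
        "index": {
            "translog": {"durability": "request"},
            "refresh_interval": refresh_interval,
            "number_of_replicas": replicas,
        }
    }


if __name__ == "__main__":
    print("PUT union_merchant_base_info")
    print(json.dumps(bulk_load_settings(), indent=2))
    print("PUT union_merchant_base_info/_settings")
    print(json.dumps(restore_settings(), indent=2))
```

The printed bodies can be pasted into Kibana Dev Tools or sent with curl in the same way as the commands above.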

3. Migration

#### Each side has one extra document
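## ecloud: 2402 documents; hw: 2400 documents
## Migrating from the side with more data to the side with less: existing data is overwritten, both of the extra documents above are present, and the total is 2405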
esm -s http://192.168.220.66:9200 -m "elastic:111111" -x "union_merchant_base_info" -y "union_merchant_base_info"  -d http://192.168.220.88:9200 -n  "elastic:111111"
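## Migrating from the side with less data to the side with more: both of the extra documents above are present, and the total is 2405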
esm -s http://192.168.220.88:9200 -m "elastic:111111" -x "union_merchant_base_info" -y "union_merchant_base_info"  -d http://192.168.220.66:9200 -n  "elastic:111111"

### Migrating a single document only (tested)

esm -s http://192.168.220.66:9200 -m "elastic:111111"  -q="_id":"240******891" -x "union_merchant_base_info" -y "union_merchant_base_info"  -d http://192.168.220.88:9200 -n  "elastic:111111"


esm -s http://192.168.220.88:9200 -m "elastic:111111"  -q="_id":"240******415" -x "union_merchant_base_info" -y "union_merchant_base_info"  -d http://192.168.220.66:9200 -n  "elastic:111111"