Data migration and backup with elasticsearch-dump (elasticdump and multielasticdump)

Latest recommended article published 2024-07-17 14:00:15

2401_85599032

Article link: https://blog.csdn.net/2401_85599032/article/details/139942983

1. Installation

npm install elasticdump

On success, npm prints a message like this:

+ elasticdump@6.72.0
added 112 packages from 198 contributors and audited 112 packages in 19.171s

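After installation, a node_modules directory is created in the current directory; it contains the elasticdump package directory.

Its bin directory holds two executables: elasticdump (single-index operations) and multielasticdump (multi-index operations).

For convenience, add that bin directory to your PATH: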

vim ~/.bashrc
# Append the following
# elasticdump (adjust the path if npm install was not run in /root)
export DUMP_HOME=/root/node_modules/elasticdump
export PATH=$DUMP_HOME/bin:$PATH
# Reload
source ~/.bashrc
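
To confirm that the PATH change took effect, a quick check along these lines can help. This is only a sketch; `check_tool` is a hypothetical helper name, not something elasticdump provides.

```shell
# check_tool is a hypothetical helper: reports whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH" >&2
    return 1
  fi
}

# Check both executables shipped in elasticdump's bin directory.
for tool in elasticdump multielasticdump; do
  check_tool "$tool" || true
done
```

If either tool is reported as missing, re-check the DUMP_HOME path and run `source ~/.bashrc` again in the current shell.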

The elasticdump package directory is laid out as follows:

.
├── bin
│   ├── elasticdump
│   └── multielasticdump
├── elasticdump.js
├── lib
│   ├── add-auth.js
│   ├── argv.js
│   ├── aws4signer.js
│   ├── help.txt
│   ├── ioHelper.js
│   ├── is-url.js
│   ├── jsonparser.js
│   ├── parse-base-url.js
│   ├── parse-meta-data.js
│   ├── processor.js
│   ├── splitters
│   ├── transports
│   └── version-check.js
├── LICENSE.txt
├── package.json
├── README.md
└── transforms
    └── anonymize.js

2. Usage

2.1 Using elasticdump

# Copy an index from production to staging with analyzer and mapping:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=analyzer
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=mapping
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=data

# Backup index data to a file:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index_mapping.json \
  --type=mapping
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index.json \
  --type=data

# Back up an index to a gzip file using stdout:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=$ \
  | gzip > /data/my_index.json.gz

# Backup the results of a query to a file
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=query.json \
  --searchBody="{\"query\":{\"term\":{\"username\": \"admin\"}}}"
  
# Specify searchBody from a file
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=query.json \
  --searchBody=@/data/searchbody.json  

# Copy a single shard data:
elasticdump \
  --input=http://es.com:9200/api \
  --output=http://es.com:9200/api2 \
  --input-params="{\"preference\":\"_shards:0\"}"

# Backup aliases to a file
elasticdump \
  --input=http://es.com:9200/index-name/alias-filter \
  --output=alias.json \
  --type=alias

# Import aliases into ES
elasticdump \
  --input=./alias.json \
  --output=http://es.com:9200 \
  --type=alias

# Backup templates to a file
elasticdump \
  --input=http://es.com:9200/template-filter \
  --output=templates.json \
  --type=template

# Import templates into ES
elasticdump \
  --input=./templates.json \
  --output=http://es.com:9200 \
  --type=template

# Split files into multiple parts
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index.json \
  --fileSize=10mb

# Import data from S3 into ES (using s3urls)
elasticdump \
  --s3AccessKeyId "${access_key_id}" \
  --s3SecretAccessKey "${access_key_secret}" \
  --input "s3://${bucket_name}/${file_name}.json" \
  --output=http://production.es.com:9200/my_index

# Export ES data to S3 (using s3urls)
elasticdump \
  --s3AccessKeyId "${access_key_id}" \
  --s3SecretAccessKey "${access_key_secret}" \
  --input=http://production.es.com:9200/my_index \
  --output "s3://${bucket_name}/${file_name}.json"

# Import data from MINIO (s3 compatible) into ES (using s3urls)
elasticdump \
  --s3AccessKeyId "${access_key_id}" \
  --s3SecretAccessKey "${access_key_secret}" \
  --input "s3://${bucket_name}/${file_name}.json" \
  --output=http://production.es.com:9200/my_index \
  --s3ForcePathStyle true \
  --s3Endpoint https://production.minio.co

# Export ES data to MINIO (s3 compatible) (using s3urls)
elasticdump \
  --s3AccessKeyId "${access_key_id}" \
  --s3SecretAccessKey "${access_key_secret}" \
  --input=http://production.es.com:9200/my_index \
  --output "s3://${bucket_name}/${file_name}.json" \
  --s3ForcePathStyle true \
  --s3Endpoint https://production.minio.co

# Import data from CSV file into ES (using csvurls)
# The csv:// prefix must be included to allow parsing of csv files,
# e.g. --input "csv://${file_path}.csv"
# --csvSkipRows skips parsed rows (this does not include the headers row)
# --csvDelimiter defaults to ','
elasticdump \
  --input "csv:///data/cars.csv" \
  --output=http://production.es.com:9200/my_index \
  --csvSkipRows 1 \
  --csvDelimiter ";"
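
The backslash-escaped double quotes in the --searchBody example above are easy to get wrong. One alternative, sketched here, is to build the JSON with printf inside single quotes and pass the resulting variable. The variable names `username` and `search_body` are illustrative, and the elasticdump call is left commented out because it needs a reachable cluster.

```shell
# Build the term query from the --searchBody example without escaped quotes.
username="admin"
search_body=$(printf '{"query":{"term":{"username":"%s"}}}' "$username")
echo "$search_body"

# Then pass it to elasticdump (uncomment when the cluster is reachable):
# elasticdump \
#   --input=http://production.es.com:9200/my_index \
#   --output=query.json \
#   --searchBody="$search_body"
```

This sketch does not escape quotes or backslashes inside the value itself, so it suits simple values only; for anything more complex, the `--searchBody=@file` form shown above is probably the safer choice.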

2.2 Using multielasticdump

# backup ES indices & all their type to the es_backup folder
multielasticdump \
  --direction=dump \
  --match='^.*$' \
  --input=http://production.es.com:9200 \
  --output=/tmp/es_backup

# Only backup ES indices ending with a suffix of `-index` (match regex).
# Only the indices data will be backed up. All other types are ignored.
# NB: analyzer & alias types are ignored by default
multielasticdump \
  --direction=dump \
  --match='^.*-index$' \
  --input=http://production.es.com:9200 \
  --ignoreType='mapping,settings,template' \
  --output=/tmp/es_backup

Common parameters:

--direction  dump/load (export/import)
--ignoreType  types to skip: data,mapping,analyzer,alias,settings,template
--includeType  types to include: data,mapping,analyzer,alias,settings,template
--prefix  adds a prefix to the index name, e.g. es6-${index}
--suffix  adds a suffix to the index name, e.g. ${index}-backup-2018-03-13
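
Before running a dump, it can be worth previewing which index names a --match pattern would select. The sketch below uses `grep -E` as a stand-in for the regular expression engine; for simple anchored patterns like the one shown earlier the result should be the same. The index names are made up for illustration.

```shell
# Sample index names (hypothetical) to try a --match pattern against.
indices="logs-index
users-index
index-old
.kibana"

# Same pattern as the `-index` suffix example.
matched=$(printf '%s\n' "$indices" | grep -E '^.*-index$')
echo "$matched"
```

Only names that end in `-index` are selected; `index-old` and `.kibana` are left out.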

3. In practice

Source ES address: http://192.168.1.140:9200
Source ES index name: source_index
Target ES address: http://192.168.1.141:9200
Target ES index name: target_index
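
To avoid retyping these values in each command, they can be kept in shell variables. This is just a convenience sketch; the variable names are arbitrary.

```shell
# Values from the list above; adjust them to your environment.
SOURCE_ES="http://192.168.1.140:9200"
SOURCE_INDEX="source_index"
TARGET_ES="http://192.168.1.141:9200"
TARGET_INDEX="target_index"

# elasticdump addresses an index as <ES address>/<index name>.
SOURCE_URL="${SOURCE_ES}/${SOURCE_INDEX}"
TARGET_URL="${TARGET_ES}/${TARGET_INDEX}"
echo "$SOURCE_URL -> $TARGET_URL"
```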

3.1 Migration

3.1.1 Online migration

Copy the data directly from one ES cluster to the other.

Single index

elasticdump \
  --input=http://192.168.1.140:9200/source_index \
  --output=http://192.168.1.141:9200/target_index \
  --type=mapping
elasticdump \
  --input=http://192.168.1.140:9200/source_index \
  --output=http://192.168.1.141:9200/target_index \
  --type=data \
  --limit=2000  # objects per batch (default 100); raise it for large data sets to speed up the migration
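
After an online migration it is worth checking that both indices hold the same number of documents. The sketch below uses Elasticsearch's `_count` endpoint through curl; `extract_count` and `index_count` are hypothetical helper names, and the comparison is commented out because it needs both clusters to be reachable.

```shell
# extract_count is a hypothetical helper: pulls the number out of a
# _count response such as {"count":42,"_shards":{...}}.
extract_count() {
  printf '%s' "$1" | sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# index_count queries the _count API of one index (needs curl and a live cluster).
index_count() {
  extract_count "$(curl -s "$1/_count")"
}

# Example comparison:
# src=$(index_count http://192.168.1.140:9200/source_index)
# dst=$(index_count http://192.168.1.141:9200/target_index)
# if [ "$src" = "$dst" ]; then echo "counts match: $src"; else echo "mismatch: $src vs $dst"; fi
```

Newly written documents only become countable after the target index refreshes, so a check made immediately after the copy may briefly report a lower number.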

3.1.2 Offline migration

Single index
Export the source ES index to JSON files, then import them into the target ES.

# Export
elasticdump \
  --input=http://192.168.1.140:9200/source_index \
  --output=/data/source_index_mapping.json \
  --type=mapping
elasticdump \
  --input=http://192.168.1.140:9200/source_index \
  --output=/data/source_index.json \
  --type=data \
  --limit=2000
# Import
elasticdump \
  --input=/data/source_index_mapping.json \
  --output=http://192.168.1.141:9200/source_index \
  --type=mapping
elasticdump \
  --input=/data/source_index.json \
  --output=http://192.168.1.141:9200/source_index \
  --type=data \
  --limit=2000
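
Between export and import, a quick sanity check of the dump file can catch an empty or truncated export. The sketch assumes the data dump holds one JSON document per line, which is how elasticdump normally writes files; `dump_doc_count` is a hypothetical helper name.

```shell
# dump_doc_count is a hypothetical helper: counts the non-empty lines of a dump file.
dump_doc_count() {
  grep -c . "$1"
}

# Report the size of the export from this section, if it exists on this machine.
if [ -f /data/source_index.json ]; then
  echo "documents in dump: $(dump_doc_count /data/source_index.json)"
fi
```

Comparing this number with the document count of the source index gives a rough indication that the export completed.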

All indices

# Export
multielasticdump \
  --direction=dump \
  --match='^.*$' \
  --input=http://192.168.1.140:9200 \
  --output=/tmp/es_backup \
  --includeType='data,mapping' \
  --limit=2000
# Import
multielasticdump \
  --direction=load \
  --match='^.*$' \
  --input=/tmp/es_backup \
  --output=http://192.168.1.141:9200 \
  --includeType='data,mapping' \
  --limit=2000
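
When the full export is repeated regularly, writing each run into its own dated directory keeps earlier dumps from being overwritten. This is one possible layout, not something multielasticdump requires; `BACKUP_BASE` and `backup_dir` are hypothetical names, and the dump command itself is commented out because it needs a live cluster.

```shell
# Hypothetical layout: one sub-directory per day under a base directory.
# BACKUP_BASE defaults to the path used in this section.
backup_dir="${BACKUP_BASE:-/tmp/es_backup}/$(date +%Y-%m-%d)"
mkdir -p "$backup_dir"
echo "dumping into $backup_dir"

# multielasticdump \
#   --direction=dump \
#   --match='^.*$' \
#   --input=http://192.168.1.140:9200 \
#   --output="$backup_dir" \
#   --includeType='data,mapping' \
#   --limit=2000
```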

3.2 Backup

3.2.1 Single index

Back up the ES index as a gz file to save storage space.

elasticdump \
  --input=http://192.168.1.140:9200/source_index \
  --output=$ \
  --limit=2000 \
  | gzip > /data/source_index.json.gz
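
A compressed backup is only useful if it can be read back. The sketch below checks the archive with `gzip -t` and counts its lines without unpacking to disk, again assuming one document per line; `verify_gz` is a hypothetical helper name. The restore command is an assumption: it relies on elasticdump accepting `$` for stdin in the same way the backup command uses it for stdout.

```shell
# verify_gz is a hypothetical helper: tests archive integrity, then prints
# the number of lines in the compressed dump.
verify_gz() {
  gzip -t "$1" || return 1
  gunzip -c "$1" | wc -l | tr -d ' '
}

# Example: verify_gz /data/source_index.json.gz

# Restore (assumes --input=$ reads from stdin; needs a live target cluster):
# gunzip -c /data/source_index.json.gz | elasticdump \
#   --input=$ \
#   --output=http://192.168.1.141:9200/target_index \
#   --limit=2000
```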

4. Scripts

Single-index online migration

#!/bin/bash
echo -n "Source ES address: "
read source_es
echo -n "Target ES address: "
read target_es
echo -n "Source index name: "
read source_index
echo -n "Target index name: "
read target_index
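
The script as shown stops after collecting the four values. A plausible continuation, sketched here on the assumption that it follows the online migration commands from section 3.1.1, copies the mapping first and then the data. `migrate_index` and the `ELASTICDUMP` override are hypothetical names introduced for illustration; the override allows the command to be swapped for `echo` in a dry run.

```shell
# migrate_index <source ES> <target ES> <source index> <target index>
# Copies the mapping, then the data, and stops if the mapping step fails.
migrate_index() {
  src="$1/$3"
  dst="$2/$4"
  "${ELASTICDUMP:-elasticdump}" --input="$src" --output="$dst" --type=mapping || return 1
  "${ELASTICDUMP:-elasticdump}" --input="$src" --output="$dst" --type=data --limit=2000
}

# After the read prompts, the script would call:
# migrate_index "$source_es" "$target_es" "$source_index" "$target_index"
```

Setting `ELASTICDUMP=echo` before the call prints the two commands instead of running them, which is a cheap way to check the URLs before touching either cluster.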

