es Snapshot and Restore

Overview

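These notes walk through the Elasticsearch snapshot feature in two parts: one using local disk as the storage backend, the other using remote HDFS. Contents: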

 

0. Overview
1. Version
2. Install plugin
3. Disk
   - create repo
   - create snapshot
   - restore
   - steps
4. HDFS
   - create hdfs repo
   - insert data
   - create hdfs snapshot
   - restore from hdfs
5. Restoring to a different cluster
   - registering repository
   - list snapshot
   - starting restore from a snapshot 
6. benchmark
   - snapshotting speed
   - restoring speed
7. plugin auto route
8. other
9. Reference

Version

  • elasticsearch-5.4.3.zip
  • repository-hdfs-5.4.3.zip

Install plugin

 

# the plugin zip must be given as an absolute file:// path
bin/elasticsearch-plugin install file:///data/mapleleaf/es_snapshot/repository-hdfs-5.4.3.zip

# check the HDFS NameNode host and port via WebHDFS
curl -i "http://localhost:8081/webhdfs/v1/?op=LISTSTATUS"

# start es as a daemon
sh bin/elasticsearch -d
# stop es (force-kills every elasticsearch process)
ps aux | grep elasticsearch | grep -v "grep" | awk '{print $2}' | xargs kill -9
# restart es in one line, then follow the log
ps aux | grep elasticsearch | grep -v "grep" | awk '{print $2}' | xargs kill -9 ; sleep 3 && sh bin/elasticsearch -d && ps aux | grep elasticsearch | grep -v "grep" && tail -f logs/es_snap.log

Disk

create repo

 

# add the line below to elasticsearch.yml, then restart the node
path.repo: ["/data/mapleleaf/es_snapshot/my_backup"]

# create a repo named my_backup
curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -H 'Content-Type: application/json' -d '{
    "type": "fs",
    "settings": {
        "location": "/data/mapleleaf/es_snapshot/my_backup",
        "compress": true
    }
}'

# inspect the repo
curl -X GET "localhost:9200/_snapshot/my_backup?pretty"
# delete the repo
curl -X DELETE "localhost:9200/_snapshot/my_backup"

create snapshot

 

# create a snapshot and wait until it completes
curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true&pretty"
# list all snapshots in the repo
curl -X GET "localhost:9200/_snapshot/my_backup/*?pretty"
# show the detailed status of one snapshot
curl -X GET "localhost:9200/_snapshot/my_backup/snapshot_1/_status?pretty"
# delete a snapshot
curl -X DELETE "localhost:9200/_snapshot/my_backup/snapshot_1?pretty"

restore

 

# restore
curl -X POST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore?pretty"

steps

  1. check index

 

curl -X PUT "localhost:9200/customer" -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "index" : {
            "number_of_shards" : 5, 
            "number_of_replicas" : 0 
        }
    }
}
'

curl -X GET "localhost:9200/_cat/indices?v"
curl -X DELETE "localhost:9200/customer?pretty"
  2. insert data

 

for i in {1..10000};
do
    curl -s -X POST "localhost:9200/customer/external/?pretty" -H 'Content-Type: application/json' -d"
    {
      \"id\": ${i},
      \"num\": ${i},
      \"name\": \"John Doe\"
    }" > /dev/null
done
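
The loop above sends one HTTP request per document. As a rough alternative sketch, the same 10,000 documents could be sent in a single request through the `_bulk` API. The function name `gen_bulk` is hypothetical; the field names are copied from the loop above.

```shell
# Emit N index actions in _bulk NDJSON format: one action line followed by
# one source line per document. gen_bulk is a hypothetical helper name.
gen_bulk() {
  n="$1"
  i=1
  while [ "$i" -le "$n" ]; do
    printf '{"index":{}}\n'
    printf '{"id":%d,"num":%d,"name":"John Doe"}\n' "$i" "$i"
    i=$((i + 1))
  done
}

# Possible usage (assumes the same host, index, and type as the loop above):
# gen_bulk 10000 | curl -s -X POST "localhost:9200/customer/external/_bulk" \
#   -H 'Content-Type: application/x-ndjson' --data-binary @- > /dev/null
```

`--data-binary` is used in the usage line because `-d` strips newlines, and the bulk format depends on them.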

[Screenshot: insert docs]

  3. close index

 

curl -X POST "localhost:9200/customer/_close?pretty"
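  4. restore
    I had already taken one backup earlier, when the index held only 1 doc. After inserting 10,000 docs, closing the index and then restoring brings it back to the state captured in that earlier snapshot.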
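
The close and restore steps can be chained into one helper. This is only a sketch: the names `run`, `close_and_restore`, and the `DRY_RUN` switch are hypothetical, and the request body (`indices`, `include_global_state`) narrows the restore to a single index, which the plain restore call in these notes does not do.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Close an existing index, then restore only that index from a snapshot.
# Arguments: host, repository, snapshot name, index name.
close_and_restore() {
  host="$1"; repo="$2"; snap="$3"; index="$4"
  run curl -s -X POST "$host/$index/_close" &&
    run curl -s -X POST "$host/_snapshot/$repo/$snap/_restore?wait_for_completion=true" \
      -H 'Content-Type: application/json' \
      -d "{\"indices\":\"$index\",\"include_global_state\":false}"
}

# Possible usage, with the names used in these notes:
# close_and_restore localhost:9200 my_backup snapshot_1 customer
```

Setting `DRY_RUN=1` before the call prints the two curl commands instead of sending them, which is handy for checking the URLs first.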

[Screenshot: after restore]

  5. reinsert

 

curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_all": {}
    }
}'

[Screenshot: reinsert]

  6. create snapshot_2

[Screenshot: before]

[Screenshot: after]

  7. close & restore


HDFS

create hdfs repo

 

curl -X PUT "localhost:9200/_snapshot/my_hdfs_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://xxxxx:xxxx",
    "path": "elasticsearch/respositories/my_hdfs_repository",
    "compress": true
  }
}'

If an exception occurs at this step, see here.

[Screenshot: repository created successfully]

insert data

[Screenshot: doc count 10000]

create hdfs snapshot

 

curl -X PUT "localhost:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_1?wait_for_completion=true&pretty"

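[Screenshot: access_control_exception]

Fix: add the plugin's security configuration to jvm.options.

[Screenshot: fix for access_control_exception]

[Screenshot: snapshot created successfully]

[Screenshot: hdfs ls of the snapshot files]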

restore from hdfs

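  1. Add some arbitrary docs so that the index differs from its state at snapshot time, which makes the effect of the restore easy to see.

[Screenshot: doc count 10000+]

  2. close index

[Screenshot: index closed]

  3. restore
    curl -X POST "localhost:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_1/_restore?pretty"

[Screenshot: restore succeeded]

[Screenshot: doc count 10000]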


Restoring to a different cluster

All that is required is registering the repository containing the snapshot in the new cluster and starting the restore process.

 

curl -X GET "localhost:9201/_cat/indices?v"

[Screenshot: cluster B, initial state]

registering repository

 

curl -X PUT "localhost:9201/_snapshot/my_hdfs_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://xxxxx:xxxx",
    "path": "elasticsearch/respositories/my_hdfs_repository",
    "compress": true
  }
}'

[Screenshot: registering with the same HDFS path as cluster A]
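
Since both clusters must point at exactly the same HDFS location, one way to avoid a mismatch is to build the settings body once and reuse it. A minimal sketch follows; `hdfs_repo_body` is a hypothetical helper, and the uri and path are passed in by the caller, not filled in here.

```shell
# Build the hdfs repository settings once so every cluster receives
# identical JSON. Arguments: HDFS uri, repository path.
hdfs_repo_body() {
  printf '{"type":"hdfs","settings":{"uri":"%s","path":"%s","compress":true}}' "$1" "$2"
}

# Possible usage (ports 9200 and 9201 are the two clusters in these notes):
# body="$(hdfs_repo_body "hdfs://xxxxx:xxxx" "elasticsearch/respositories/my_hdfs_repository")"
# for port in 9200 9201; do
#   curl -X PUT "localhost:${port}/_snapshot/my_hdfs_repository?pretty" \
#     -H 'Content-Type: application/json' -d "$body"
# done
```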

list snapshot

 

curl -X GET "localhost:9201/_snapshot/my_hdfs_repository/*?pretty"

[Screenshot: list of available snapshots]

starting restore

 

curl -X POST "localhost:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_1/_restore?pretty"

[Screenshot: restore succeeded]


benchmark

esrally is used to load the data.

[Screenshot: before]

snapshotting speed

[Screenshot: HDFS before snapshot]

 

# runs in the background (returns immediately, no wait_for_completion)
curl -X PUT "XXX:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1" -H 'Content-Type: application/json' -d'
{
  "indices": "591_etl_fuhaochen_test_2018062500",
  "ignore_unavailable": true,
  "include_global_state": false
}'

# check running status
curl -X GET "XXX:9200/_snapshot/my_hdfs_repository/*?pretty"

[Screenshot: in_progress]

[Screenshot: success]

[Screenshot: HDFS after snapshot]
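
Instead of re-running the status check by hand, the state can be polled until it leaves IN_PROGRESS. The sketch below is an assumption-laden illustration: `snapshot_state` and `wait_for_snapshot` are hypothetical helpers, the state is extracted with grep and sed with no JSON parser, and the 10-second interval is arbitrary.

```shell
# Read a snapshot info response on stdin and print the first "state" value
# (for example IN_PROGRESS or SUCCESS).
snapshot_state() {
  grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Poll a single snapshot URL until its state is no longer IN_PROGRESS.
# Argument: full URL such as
#   XXX:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1
wait_for_snapshot() {
  url="$1"
  while :; do
    state="$(curl -s "$url" | snapshot_state)"
    echo "$(date '+%H:%M:%S') state=${state:-unknown}"
    [ "$state" = "IN_PROGRESS" ] || break
    sleep 10
  done
}
```

The timestamps printed on each poll also give a rough measure of how long the snapshot took.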

restoring speed

 

date
curl -X POST "XXX:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1/_restore?wait_for_completion=true&pretty"
date
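
Rather than subtracting the two `date` outputs by hand, a small wrapper can report the elapsed time directly. This is a sketch only; `timed` is a hypothetical helper and it measures whole seconds of wall-clock time.

```shell
# Run a command, print the elapsed wall-clock seconds to stderr,
# and pass through the command's exit status.
timed() {
  start="$(date +%s)"
  "$@"
  status=$?
  end="$(date +%s)"
  echo "elapsed: $((end - start))s" >&2
  return "$status"
}

# Possible usage with the restore call above:
# timed curl -X POST "XXX:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1/_restore?wait_for_completion=true&pretty"
```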

[Screenshot: after]

Snapshotting takes far longer than restoring.


plugin auto route

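This tests whether the plugin is routed automatically. Does it have to be installed on every node (data nodes, master nodes, and so on), or is it enough to install it on one node of the cluster, which then distributes the plugin to the other nodes on its own?

[Screenshot: health]

[Screenshot: nodes]

[Screenshot: plugins]

Automatic routing is not available: the plugin has to be installed on every node.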
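
To spot nodes that still lack the plugin, the output of `_cat/nodes` and `_cat/plugins` can be compared. The helper below is a sketch under assumptions: `nodes_missing_plugin` is a hypothetical name, and it assumes node names contain no spaces.

```shell
# Print every node that does not list the given plugin.
# $1: node names, one per line (e.g. from _cat/nodes?h=name)
# $2: "node-name plugin-name" lines (e.g. from _cat/plugins?h=name,component)
# $3: plugin name to look for
nodes_missing_plugin() {
  nodes="$1"; plugins="$2"; plugin="$3"
  printf '%s\n' "$nodes" | while read -r node; do
    [ -n "$node" ] || continue
    printf '%s\n' "$plugins" | awk -v n="$node" -v p="$plugin" \
      '$1 == n && $2 == p { found = 1 } END { exit !found }' || echo "$node"
  done
}

# Possible usage:
# nodes="$(curl -s 'localhost:9200/_cat/nodes?h=name')"
# plugins="$(curl -s 'localhost:9200/_cat/plugins?h=name,component')"
# nodes_missing_plugin "$nodes" "$plugins" repository-hdfs
```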


other

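  • Tried to snapshot a larger index, but it failed with an error. The configuration should be fine, since the small index was snapshotted successfully.

[Screenshot: large index snapshot failed]

[Screenshot: small index snapshot succeeded]

The "Self-suppression not permitted" error is probably caused by insufficient free space on the Hadoop DataNodes.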


Reference



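Author: chenfh5
Link: https://www.jianshu.com/p/b96070781ecb
Source: Jianshu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.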
