A Study of the Ceph RGW Remote Sync (Multisite) Mechanism

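This article examines the remote sync (multisite) mechanism of RGW (RADOS Gateway) in Ceph 10.2.3. It covers bucket instances, object storage, the data log pool, and the different kinds of shards. When an object is written, the Ceph classes ensure that the logs are recorded in order, which is the key to syncing data across sites. Sync relies on processing the logs in that order to keep the data consistent.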

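Notes:

   This article is based on Ceph 10.2.3;

   For the remote sync configuration, see http://blog.csdn.net/for_tech/article/details/68927956

   This article treats the secondary site as the local site, so "local" and "secondary" mean the same thing, and "remote" and "source" mean the same thing;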


+=======================================================================+
|                     Remote Site (Source Site)                         |
+=======================================================================+



0.1 Pool: {source-zone}.rgw.data.root

    Bucket instances;

0.2 Pool: {source-zone}.rgw.buckets.data

    Objects;

0.3 Pool: {source-zone}.rgw.buckets.index
    For example, with 4 buckets and 3 shards per bucket (rgw_override_bucket_index_max_shards=3):
    bucket-0 shard0     .dir.{key-of-bucket-0}.0
    bucket-0 shard1     .dir.{key-of-bucket-0}.1
    bucket-0 shard2     .dir.{key-of-bucket-0}.2
    bucket-1 shard0     .dir.{key-of-bucket-1}.0
    bucket-1 shard1     .dir.{key-of-bucket-1}.1
    bucket-1 shard2     .dir.{key-of-bucket-1}.2
    bucket-2 shard0     .dir.{key-of-bucket-2}.0
    bucket-2 shard1     .dir.{key-of-bucket-2}.1
    bucket-2 shard2     .dir.{key-of-bucket-2}.2
    bucket-3 shard0     .dir.{key-of-bucket-3}.0
    bucket-3 shard1     .dir.{key-of-bucket-3}.1
    bucket-3 shard2     .dir.{key-of-bucket-3}.2
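
The object names in the listing above follow a fixed pattern. Here is a minimal Python sketch that reproduces them, assuming only the `.dir.{key}.{shard}` naming shown in this section. The helper name `bucket_index_shard_oids` and the placeholder bucket keys are hypothetical and exist only for illustration.

```python
def bucket_index_shard_oids(bucket_key, num_shards):
    """Return the bucket index object names for one bucket instance.

    Follows the pattern from the listing above: .dir.{bucket_key}.{shard_id}.
    Only the sharded case (num_shards > 0) is covered by this sketch.
    """
    if num_shards <= 0:
        raise ValueError("this sketch only covers sharded bucket indexes")
    return [".dir.%s.%d" % (bucket_key, shard) for shard in range(num_shards)]


if __name__ == "__main__":
    # 4 buckets x 3 shards, as in the example
    # (rgw_override_bucket_index_max_shards=3).
    for b in range(4):
        for oid in bucket_index_shard_oids("{key-of-bucket-%d}" % b, 3):
            print(oid)
```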

0.4 Pool: {source-zone}.rgw.log

    For example, with rgw_data_log_num_shards=4:

    data_log.0
    data_log.1
    data_log.2
    data_log.3

0.5 Mapping from bucket shards to data log shards

    There are

         {bucket-number}*rgw_override_bucket_index_max_shards (12 in our example)
    bucket shards, and
         rgw_data_log_num_shards (4 in our example)
    data log shards.

    Each bucket shard is mapped to one of the data log shards; see the function choose_oid().

        bucket-0 shard0 -------- \
        bucket-0 shard1            \
        bucket-0 shard2              \
        bucket-1 shard0                \
        bucket-1 shard1                   data_log.0
        bucket-1 shard2     map to        data_log.1
        bucket-2 shard0                   data_log.2
        bucket-2 shard1                   data_log.3
        bucket-2 shard2                /
        bucket-3 shard0              /
        bucket-3 shard1            /
        bucket-3 shard2 -------- /

      Important: there are 2 kinds of shards:
            data-log-shard:  [0, rgw_data_log_num_shards)
            bucket-shard  :  [0, rgw_override_bucket_index_max_shards)
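
To make the many-to-few mapping concrete, here is a minimal Python sketch of how a bucket shard could be assigned to a data log shard. It rests on an assumption: that the mapping hashes the bucket name, adds the bucket shard id, and takes the result modulo rgw_data_log_num_shards. Check choose_oid() in the Ceph source for the real logic. `zlib.crc32` is only a stand-in for RGW's own string hash, so the shard numbers printed here will not match a real cluster. The function names are hypothetical.

```python
import zlib


def choose_data_log_shard(bucket_name, bucket_shard_id, num_data_log_shards):
    """Map one bucket shard to a data log shard in [0, num_data_log_shards)."""
    # Stand-in hash (assumption): RGW uses its own string hash, not CRC32.
    name_hash = zlib.crc32(bucket_name.encode("utf-8"))
    # Assumption: a negative shard id (unsharded bucket index) counts as 0.
    shift = bucket_shard_id if bucket_shard_id > 0 else 0
    return (name_hash + shift) % num_data_log_shards


def data_log_oid(shard):
    """Name of the data log object for a data log shard, e.g. data_log.2."""
    return "data_log.%d" % shard


if __name__ == "__main__":
    # 4 buckets x 3 bucket shards mapped onto 4 data log shards.
    for b in range(4):
        for s in range(3):
            x = choose_data_log_shard("bucket-%d" % b, s, 4)
            print("bucket-%d shard%d -> %s" % (b, s, data_log_oid(x)))
```

Under this assumption, consecutive shards of one bucket land on consecutive data log shards, so the changes of a single busy bucket are spread over several data log objects.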

0.6 Writing an object 

    When an object (OBJ_444) is put into bucket-B, shardS:

     A. the object is written into pool  {source-zone}.rgw.buckets.data
     B. write log into pool={source-zone}.rgw.buckets.index, obj=.dir.{key-of-bucket-B}.S as KV pairs:
            OBJ_444             ==> info of OBJ_444, such as owner, content-type ...
            .0_00000000001.4.2  ==> write OBJ_444 state=CLS_RGW_STATE_PENDING_MODIFY
            .0_00000000002.5.3  ==> write OBJ_444 state=CLS_RGW_STATE_COMPLETE
        update omap header to

            .0_00000000002.5.3

        Important: these logs are written by the "Ceph classes", so the markers (the omap keys, such as
           .0_00000000001.4.2 and .0_00000000002.5.3) are guaranteed to be ordered. This order is the key to
           syncing data between sites (the secondary site needs to process the logs in the same order).

           "Ceph Classes": Ceph loads .so classes stored in the osd class dir directory dynamically
           (i.e., $libdir/rados-classes by default). When you implement a class, you can create new
           object methods that have the ability to call the native methods in the Ceph Object Store,
           or other class methods you incorporate via libraries or create yourself. On writes, Ceph
           Classes can call native or class methods, perform any series of operations on the inbound
           data and generate a resulting write transaction that Ceph will apply atomically. On reads,
           Ceph Classes can call native or class methods, perform any series of operations on the
           outbound data and return the data to the client.

     C. If bucket-B shardS is mapped to data_log.X, then
        write log into  pool={source-zone}.rgw.log,  obj=data_log.X as KV pair
            1_1489979397.374156_23.1 ==> info such as: bucket-B shardS was modified at timestampT
        update omap header to

            1_1489979397.374156_23.1

        Important: these logs are written by the "
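
The log writes in steps B and C can be sketched with a small in-memory model. This is not RGW code: `OmapObject` and `put_object` are hypothetical names, the markers are passed in as the literal strings from this article rather than generated, and the atomicity that the Ceph classes provide on the OSD is not modelled. The sketch only shows why ordered markers plus a header that records the newest marker let a reader replay the log in write order.

```python
BI_LOG_PREFIX = ".0_"     # bucket index log keys, as printed in this article
DATA_LOG_PREFIX = "1_"    # data log keys, as printed in this article


class OmapObject:
    """In-memory stand-in for a RADOS object with an omap and an omap header."""

    def __init__(self, log_prefix):
        self.log_prefix = log_prefix
        self.omap = {}       # key -> value
        self.header = None   # marker of the newest log entry

    def set_entry(self, key, value):
        """Store a plain (non-log) key, e.g. the per-object index entry."""
        self.omap[key] = value

    def append_log(self, marker, value):
        """Add a log entry and advance the header; markers must strictly increase."""
        if not marker.startswith(self.log_prefix):
            raise ValueError("not a log marker: %r" % marker)
        if self.header is not None and marker <= self.header:
            raise ValueError("marker %r is not newer than %r" % (marker, self.header))
        self.omap[marker] = value
        self.header = marker

    def list_log(self, after=""):
        """Return log entries with marker > after, in marker order."""
        markers = sorted(k for k in self.omap
                         if k.startswith(self.log_prefix) and k > after)
        return [(m, self.omap[m]) for m in markers]


def put_object(index_shard, data_log, name, bi_markers, data_log_marker, bucket_shard):
    """Steps B and C for one object write (step A, storing the data, is omitted)."""
    pending, complete = bi_markers
    index_shard.append_log(pending, "write %s state=CLS_RGW_STATE_PENDING_MODIFY" % name)
    index_shard.set_entry(name, "info of %s" % name)
    index_shard.append_log(complete, "write %s state=CLS_RGW_STATE_COMPLETE" % name)
    data_log.append_log(data_log_marker, "%s has modification" % bucket_shard)


if __name__ == "__main__":
    index_shard = OmapObject(BI_LOG_PREFIX)   # stands for .dir.{key-of-bucket-B}.S
    data_log = OmapObject(DATA_LOG_PREFIX)    # stands for data_log.X
    put_object(index_shard, data_log, "OBJ_444",
               (".0_00000000001.4.2", ".0_00000000002.5.3"),
               "1_1489979397.374156_23.1", "bucket-B shardS")
    for marker, entry in index_shard.list_log():
        print(marker, "==>", entry)
    print("index header:", index_shard.header)
    print("data log header:", data_log.header)
```

A reader that remembers the last marker it processed can call `list_log(after=marker)` to pick up only the newer entries, in the same order in which they were written.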
