RGW Data Model Design


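Ceph is an open-source, unified distributed storage system. RADOS provides the underlying object storage service and consists of monitors (mon) and OSDs (osd). The main entities RADOS operates on are pools, objects, and each object's xattrs and omap.
rados gateway is an object storage service built on RADOS that exposes S3-compatible and Swift-compatible RESTful APIs.
Buckets and objects (keys) are the two main data models that rados gateway builds; this article describes how the gateway designs them.
bucket: a container for keys. It can be thought of as a directory, except that buckets cannot be nested.
key: also called an object; it represents one complete piece of data uploaded to the storage service.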

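The following hands-on walkthrough shows how buckets and keys are designed.
rados gateway also defines data structures such as account, zone, and region, but they are outside the scope of this article and are not covered in detail.
To create a bucket and upload data through the gateway, first create a user to obtain a pair of authentication keys (access_key and secret_key).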

gateway user

Create a user:

# radosgw-admin user create --uid=yankun --display-name=yankun
{
    "user_id": "yankun",
    "display_name": "yankun",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [],
    "keys": [
        {
            "user": "yankun",
            "access_key": "FLNOEBKYFT7R0VA2ZH03",
            "secret_key": "2a3O5epEHpnRw26Rb6tukdYJz6nQes6hCoO5fIM3"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "max_size_kb": -1,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "max_size_kb": -1,
        "max_objects": -1
    },
    "temp_url_keys": []
}

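Creating the user yields an access_key and a secret_key. The s3cmd client is then used to create buckets and upload data.
In the s3cmd configuration file, set the access_key, the secret_key, and the service address.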
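As a rough illustration of that configuration step, the sketch below generates the text of an s3cmd configuration file. The option names (access_key, secret_key, host_base, host_bucket, use_https) are standard s3cmd settings; the endpoint rgw.example.com:7480 and the helper name build_s3cfg are assumptions made for this example, not values from the article.

```python
import configparser
import io


def build_s3cfg(access_key, secret_key, endpoint):
    """Return s3cmd configuration text that points at a gateway endpoint."""
    # interpolation=None keeps any '%' characters in values literal.
    config = configparser.ConfigParser(interpolation=None)
    config["default"] = {
        "access_key": access_key,
        "secret_key": secret_key,
        "host_base": endpoint,    # service address (hypothetical here)
        "host_bucket": endpoint,  # same host, i.e. path-style bucket access
        "use_https": "False",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Keys are the ones printed by `radosgw-admin user create` above;
# the endpoint is a placeholder.
S3CFG_TEXT = build_s3cfg(
    "FLNOEBKYFT7R0VA2ZH03",
    "2a3O5epEHpnRw26Rb6tukdYJz6nQes6hCoO5fIM3",
    "rgw.example.com:7480",
)
print(S3CFG_TEXT)
```

The printed text can be saved as ~/.s3cfg, which is where s3cmd looks for its configuration by default.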

Buckets in RGW

Create buckets

# s3cmd mb s3://where_is_my_bucket
# s3cmd mb s3://where_is_my_bucket1

View bucket information

# radosgw-admin bucket stats --bucket=where_is_my_bucket
{
    "bucket": "where_is_my_bucket",
    "pool": ".rgw.buckets",
    "index_pool": ".rgw.buckets.index",
    "id": "default.5762326.25",
    "marker": "default.5762326.25",
    "owner": "yankun",
    "ver": "0#9",
    "master_ver": "0#0",
    "mtime": "2017-09-12 10:16:47.000000",
    "max_marker": "0#",
    "usage": {
        "rgw.main": {
            "size_kb": 4105961,
            "size_kb_actual": 4105964,
            "num_objects": 3
        }
    },
    "bucket_quota": {
        "enabled": false,
        "max_size_kb": -1,
        "max_objects": -1
    }
}

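The bucket object
Every bucket a user creates is recorded in the omap of the object yankun.buckets in the .users.uid pool: the key is the bucket name and the value is the bucket's information. The .users.uid pool holds two objects per user, {username} and {username}.buckets.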

# rados -p .users.uid listomapkeys yankun.buckets
where_is_my_bucket
where_is_my_bucket1
# rados -p .users.uid getomapval yankun.buckets  where_is_my_bucket  binary_where_is_my_bucket                  
Writing to binary_where_is_my_bucket

# ceph-dencoder type RGWBucketEnt import binary_where_is_my_bucket decode dump_json
{
    "bucket": {
        "name": "where_is_my_bucket",
        "pool": ".rgw.buckets",
        "data_extra_pool": ".rgw.buckets.extra",
        "index_pool": ".rgw.buckets.index",
        "marker": "default.5762326.25",
        "bucket_id": "default.5762326.25"
    },
    "size": 4204504056,
    "size_rounded": 4204507136,
    "mtime": 1505182607,
    "count": 3
}
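The byte counts in this RGWBucketEnt dump can be cross-checked against the size_kb and size_kb_actual values printed earlier by radosgw-admin bucket stats. The arithmetic below is only a consistency check on the numbers shown; the interpretation that size_rounded is rounded to 4 KiB units is inferred from the figures, not stated by the article.

```python
# Values copied from the two outputs above.
size_bytes = 4204504056          # "size" in the RGWBucketEnt dump
size_rounded_bytes = 4204507136  # "size_rounded" in the RGWBucketEnt dump

# Round up to whole KiB using integer arithmetic.
size_kb = (size_bytes + 1023) // 1024
size_kb_actual = size_rounded_bytes // 1024

# Inferred, not documented here: the rounded size is a multiple of 4096.
rounded_is_4k_multiple = size_rounded_bytes % 4096 == 0

print(size_kb, size_kb_actual, rounded_is_4k_multiple)
```

Both results match the bucket stats output (size_kb 4105961, size_kb_actual 4105964).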

The bucket's object in RADOS
For each bucket, an index object is created in the .rgw.buckets.index pool, named .dir.{bucket_id}

# rados -p .rgw.buckets.index ls > .rgw.buckets.index                                                                                                                   
# grep default.5762326.25 .rgw.buckets.index 
.dir.default.5762326.25

Bucket metadata
The bucket's metadata is stored as a separate RADOS object in the .rgw pool, named .bucket.meta.{bucket_name}:{marker}.

# rados -p .rgw ls
where_is_my_bucket1
.bucket.meta.where_is_my_bucket1:default.5762326.26
where_is_my_bucket
.bucket.meta.where_is_my_bucket:default.5762326.25
# rados -p .rgw get .bucket.meta.where_is_my_bucket:default.5762326.25 binary.bucket.meta.where_is_my_bucket:default.5762326.25
# ceph-dencoder type RGWBucketInfo  import binary.bucket.meta.where_is_my_bucket:default.5762326.25  decode dump_json
{
    "bucket": {
        "name": "where_is_my_bucket",
        "pool": ".rgw.buckets",
        "data_extra_pool": ".rgw.buckets.extra",
        "index_pool": ".rgw.buckets.index",
        "marker": "default.5762326.25",
        "bucket_id": "default.5762326.25"
    },
    "creation_time": 1505182607,
    "owner": "yankun",
    "flags": 0,
    "region": "default",
    "placement_rule": "default-placement",
    "has_instance_obj": "true",
    "quota": {
        "enabled": false,
        "max_size_kb": -1,
        "max_objects": -1
    },
    "num_shards": 0,
    "bi_shard_hash_type": 0
}
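The bucket-related naming conventions seen so far can be collected into a few small helpers. This is a minimal sketch: the function names are made up for illustration, and it assumes an unsharded bucket index (num_shards is 0 in the dump above) and the pool layout shown in this walkthrough.

```python
def user_buckets_object(uid):
    """Object in the .users.uid pool whose omap lists the user's buckets."""
    return f"{uid}.buckets"


def bucket_index_object(bucket_id):
    """Index object in the .rgw.buckets.index pool (unsharded index assumed)."""
    return f".dir.{bucket_id}"


def bucket_meta_object(bucket_name, marker):
    """Metadata object in the .rgw pool."""
    return f".bucket.meta.{bucket_name}:{marker}"


print(user_buckets_object("yankun"))
print(bucket_index_object("default.5762326.25"))
print(bucket_meta_object("where_is_my_bucket", "default.5762326.25"))
```

Given a uid, a bucket name, and the id reported by radosgw-admin bucket stats, these names are what the rados commands in this section operate on.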

The bucket's ACL is stored in an xattr of the .bucket.meta.{bucket_name}:{marker} object.

#  rados -p .rgw getxattr .bucket.meta.where_is_my_bucket:default.5762326.25  user.rgw.acl > binary.bucket.acl
# ceph-dencoder type RGWAccessControlPolicy  import binary.bucket.acl  decode dump_json
{
    "acl": {
        "acl_user_map": [
            {
                "user": "yankun",
                "acl": 15
            }
        ],
        "acl_group_map": [],
        "grant_map": [
            {
                "id": "yankun",
                "grant": {
                    "type": {
                        "type": 0
                    },
                    "id": "yankun",
                    "email": "",
                    "permission": {
                        "flags": 15
                    },
                    "name": "yankun",
                    "group": 0
                }
            }
        ]
    },
    "owner": {
        "id": "yankun",
        "display_name": "yankun"
    }
}
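The value 15 in "acl" and "flags" is a bitmask. The sketch below decodes it under the assumption that RGW uses the bit values READ=1, WRITE=2, READ_ACP=4, WRITE_ACP=8, so that 15 corresponds to full control; the class and function names are illustrative and are not part of any Ceph API.

```python
from enum import IntFlag


class RgwPerm(IntFlag):
    # Assumed bit values for RGW ACL permissions.
    READ = 0x01
    WRITE = 0x02
    READ_ACP = 0x04
    WRITE_ACP = 0x08


def decode_acl_flags(flags):
    """Return the names of the permission bits set in an ACL flags value."""
    return [perm.name for perm in RgwPerm if flags & perm]


print(decode_acl_flags(15))
```

With these assumptions, the owner yankun holds all four permissions on the bucket.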

Objects in RGW

An object can only be stored inside a bucket. A large file, where_is_my_object.txt, is created here to upload to a bucket.
Create the large file

# dd if=/dev/zero of=./where_is_my_object.txt bs=2M count=1000
# du where_is_my_object.txt -h
2.0G    where_is_my_object.txt

Upload the large file to the bucket

# s3cmd put where_is_my_object.txt s3://where_is_my_bucket
upload: 'where_is_my_object.txt' -> 's3://where_is_my_bucket/where_is_my_object.txt'  [1 of 1]
 2097152000 of 2097152000   100% in  123s    16.24 MB/s  done

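Mapping between objects and buckets
The file was uploaded to the bucket where_is_my_bucket, whose id is default.5762326.25. The relationship between the object and the bucket is maintained in the omap of the .dir.{bucket_id} object.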

# rados -p .rgw.buckets.index listomapkeys .dir.default.5762326.25
where_is_my_object.txt

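Object naming format
An uploaded object is stored in RADOS as either a single object or multiple objects, depending on its size.
Object data is stored in the .rgw.buckets pool. If the uploaded data is larger than 512 KB, it is split into multiple RADOS objects: one head object (512 KB) and one or more tail objects (4 MB each by default). The head object is named {bucket_id}_{object_name}; for example, the object where_is_my_object.txt in the bucket where_is_my_bucket is stored in .rgw.buckets as:
default.5762326.25_where_is_my_object.txt. Tail objects are named {bucket_id}__shadow_{prefix}{n}, where {prefix} is the prefix recorded in the head object's manifest and {n} is a sequence number starting from 1.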

# du default.5762326.25_where_is_my_object.txt 
512     default.5762326.25_where_is_my_object.txt
# du default.5762326.25__shadow_.h_oQhOgqDTmDZx2FUSm8zMTOlbhDQsq_99
4096    default.5762326.25__shadow_.h_oQhOgqDTmDZx2FUSm8zMTOlbhDQsq_99

Object metadata
An object's metadata is stored in the xattrs of its head object

# rados -p .rgw.buckets listxattr default.5762326.25_where_is_my_object.txt
user.rgw.acl
user.rgw.content_type
user.rgw.etag
user.rgw.idtag
user.rgw.manifest
user.rgw.x-amz-date
user.rgw.x-amz-meta-s3cmd-attrs
user.rgw.x-amz-storage-class

The object's user.rgw.manifest attribute

# rados -p .rgw.buckets getxattr default.5762326.25_where_is_my_object.txt user.rgw.manifest > ./binary.default.5762326.25_where_is_my_object.txt.user.rgw.manifest
# ceph-dencoder type  RGWObjManifest import binary.default.5762326.25_where_is_my_object.txt.user.rgw.manifest  decode dump_json
{
    "objs": [],
    "obj_size": 2097152000,
    "explicit_objs": "false",
    "head_obj": {
        "bucket": {
            "name": "where_is_my_bucket",
            "pool": ".rgw.buckets",
            "data_extra_pool": ".rgw.buckets.extra",
            "index_pool": ".rgw.buckets.index",
            "marker": "default.5762326.25",
            "bucket_id": "default.5762326.25"
        },
        "key": "",
        "ns": "",
        "object": "where_is_my_object.txt",
        "instance": ""
    },
    "head_size": 524288,
    "max_head_size": 524288,
    "prefix": ".h_oQhOgqDTmDZx2FUSm8zMTOlbhDQsq_",
    "tail_bucket": {
        "name": "where_is_my_bucket",
        "pool": ".rgw.buckets",
        "data_extra_pool": ".rgw.buckets.extra",
        "index_pool": ".rgw.buckets.index",
        "marker": "default.5762326.25",
        "bucket_id": "default.5762326.25"
    },
    "rules": [
        {
            "key": 0,
            "val": {
                "start_part_num": 0,
                "start_ofs": 524288,
                "part_size": 0,
                "stripe_max_size": 4194304,
                "override_prefix": ""
            }
        }
    ]
}
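The manifest fields are enough to enumerate the RADOS objects that make up this upload. The sketch below is a simplified model, assuming a non-multipart upload with a single rule, as in the manifest above; rados_object_layout is a hypothetical helper written for this example and does not reproduce RGW's actual manifest iterator.

```python
def rados_object_layout(bucket_id, object_name, prefix, obj_size,
                        head_size, stripe_max_size):
    """Return (rados_object_name, byte_count) pairs for head and tail objects."""
    head_bytes = min(obj_size, head_size)
    layout = [(f"{bucket_id}_{object_name}", head_bytes)]
    remaining = obj_size - head_bytes
    part = 1  # tail objects are numbered from 1
    while remaining > 0:
        chunk = min(remaining, stripe_max_size)
        layout.append((f"{bucket_id}__shadow_{prefix}{part}", chunk))
        remaining -= chunk
        part += 1
    return layout


# Values copied from the manifest dump above.
layout = rados_object_layout(
    bucket_id="default.5762326.25",
    object_name="where_is_my_object.txt",
    prefix=".h_oQhOgqDTmDZx2FUSm8zMTOlbhDQsq_",
    obj_size=2097152000,
    head_size=524288,
    stripe_max_size=4194304,
)
print(len(layout), layout[0], layout[-1])
```

Under these assumptions the 2097152000-byte upload maps to one 512 KB head object plus 500 tail objects, and the 99th tail carries the same name as the shadow object inspected with du earlier.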

Object ACL:

# rados -p .rgw.buckets getxattr default.5762326.25_where_is_my_object.txt  user.rgw.acl > binary.object.acl
# ceph-dencoder type RGWAccessControlPolicy  import binary.object.acl  decode dump_json
{
    "acl": {
        "acl_user_map": [
            {
                "user": "yankun",
                "acl": 15
            }
        ],
        "acl_group_map": [],
        "grant_map": [
            {
                "id": "yankun",
                "grant": {
                    "type": {
                        "type": 0
                    },
                    "id": "yankun",
                    "email": "",
                    "permission": {
                        "flags": 15
                    },
                    "name": "yankun",
                    "group": 0
                }
            }
        ]
    },
    "owner": {
        "id": "yankun",
        "display_name": "yankun"
    }
}

Restoring data manually

Based on the object model described above, a complete object can be retrieved without going through rados gateway.
Construct an object

location_object
# du -h location_object
9.8M    location_object

md5 value of the local object

# md5sum location_object 
24796d54d73d694168170135091f7eba  location_object

Upload the object to where_is_my_bucket

# s3cmd put location_object s3://where_is_my_bucket
upload: 'location_object' -> 's3://where_is_my_bucket/location_object'  [1 of 1]
 10200056 of 10200056   100% in    0s    77.72 MB/s
 10200056 of 10200056   100% in    4s     2.18 MB/s  done

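Object splitting
By design, this object is stored in RADOS as 4 objects: one head object and 3 tail objects.
Head object: default.5762326.25_location_object
Tail objects: default.5762326.25__shadow_{prefix}{1,2,3}, where {prefix} is the prefix value in the head object's manifest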
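The count of 4 follows from the upload size and the head and stripe sizes mentioned earlier (512 KB and 4 MB). A short worked calculation, assuming those sizes apply to this upload:

```python
obj_size = 10200056        # bytes reported by s3cmd put
head_size = 524288         # 512 KB head object
stripe_max_size = 4194304  # 4 MB per tail object

tail_bytes = max(obj_size - head_size, 0)
# Ceiling division: the last tail object may be only partially filled.
tail_count = -(-tail_bytes // stripe_max_size)
total_objects = 1 + tail_count

print(tail_bytes, tail_count, total_objects)
```

The remaining 9675768 bytes fill two full 4 MB tail objects and part of a third, which gives one head object plus three tail objects.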

Head object

# rados -p .rgw.buckets ls | grep location
default.5762326.25_location_object

The object's prefix

# rados -p .rgw.buckets getxattr default.5762326.25_location_object user.rgw.manifest > ./binary.default.5762326.25_location_object.user.rgw.manifest
# ceph-dencoder type  RGWObjManifest import binary.default.5762326.25_location_object.user.rgw.manifest  decode dump_json
{
    "objs": [],
    "obj_size": 10200056,
    "explicit_objs": "false",
    "head_obj": {
        "bucket": {
            "name": "where_is_my_bucket",
            "pool": ".rgw.buckets",
            "data_extra_pool": ".rgw.buckets.extra",
            "index_pool": ".rgw.buckets.index",
            "marker": "default.5762326.25",
            "bucket_id": "default.5762326.25"
        },
        "key": "",
        "ns": "",
        "object": "location_object",
        "instance": ""
    },
    "head_size": 524288,
    "max_head_size": 524288,
    "prefix": ".Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_",
    "tail_bucket": {
        "name": "where_is_my_bucket",
        "pool": ".rgw.buckets",
        "data_extra_pool": ".rgw.buckets.extra",
        "index_pool": ".rgw.buckets.index",
        "marker": "default.5762326.25",
        "bucket_id": "default.5762326.25"
    },
    "rules": [
        {
            "key": 0,
            "val": {
                "start_part_num": 0,
                "start_ofs": 524288,
                "part_size": 0,
                "stripe_max_size": 4194304,
                "override_prefix": ""
            }
        }
    ]
}

The tail objects are: default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_{1,2,3}
Fetching the split objects
Use rados to fetch the pieces:

# rados -p .rgw.buckets get  default.5762326.25_location_object ./location_head
# rados -p .rgw.buckets get  default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_1 ./default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_1
# rados -p .rgw.buckets get  default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_2 ./default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_2
# rados -p .rgw.buckets get  default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_3 ./default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_3

Reassemble the object

# cat location_head >  new_location_object
# cat default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_1  >>  new_location_object
# cat default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_2  >>  new_location_object
# cat default.5762326.25__shadow_.Ux77sSsCN2UdioL5XxO0Hx8Ph9oXb35_3  >>  new_location_object

md5 value of new_location_object

# md5sum new_location_object 
24796d54d73d694168170135091f7eba  new_location_object

Note: the reassembled object has the same md5 value as the original, so the content is unchanged.
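The same split-and-reassemble check can be modelled in a few lines. This sketch works on in-memory bytes rather than on data fetched from a cluster, and the sizes are scaled down (a 512-byte head and 4096-byte stripes are assumptions chosen so that the example also yields one head and three tails); it only illustrates why concatenating the head and the tails in order reproduces the original md5.

```python
import hashlib


def split_object(data, head_size, stripe_size):
    """Split data into a head chunk followed by fixed-size tail chunks."""
    parts = [data[:head_size]]
    for offset in range(head_size, len(data), stripe_size):
        parts.append(data[offset:offset + stripe_size])
    return parts


def md5_hex(data):
    """Hex md5 digest, comparable to the output of md5sum."""
    return hashlib.md5(data).hexdigest()


# Synthetic payload standing in for location_object.
original = bytes(i % 251 for i in range(10_000))
parts = split_object(original, head_size=512, stripe_size=4096)

# Equivalent of `cat head tail_1 tail_2 tail_3 > new_location_object`.
restored = b"".join(parts)

print(len(parts), md5_hex(original) == md5_hex(restored))
```

The order of concatenation matters: the head comes first, then the tails in ascending sequence number.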
