Ceph Series, Part 2: Data Layout

1. The three data types in RGW

  • data
  • metadata
  • bucket.index

2. metadata

  • user: stores user information

  • bucket: maintains the mapping from bucket name to bucket instance id

  • bucket.instance: stores bucket instance information

    1. List the metadata types
    # radosgw-admin metadata list
    [
        "bucket",
        "bucket.instance",
        "user"
    ]
    
    2. List the metadata entries of a given type (user)
    # radosgw-admin metadata list user
    [
        "Testuser004",
        "Testuser010"
    ]
    
    List the users directly from the pool
    # rados ls -p default.rgw.users.uid
    Testuser010.buckets				# holds the buckets owned by this uid
    Testuser010
    
    Get one specific metadata entry of a given type (user)
    # radosgw-admin metadata get user:Testuser010
    {
        "key": "user:Testuser010",
        "ver": {
            "tag": "_6hQh8UuA-jHMgkrjVZLPCqJ",
            "ver": 1
        },
        "mtime": "2019-03-22 07:15:24.918482Z",
        ....
        ....
    }
    
    Get the user info (similar to the command above)
    # radosgw-admin user info --uid Testuser010
    {
        "user_id": "Testuser010",
        "display_name": "Testuser010",
        "email": "",
        "suspended": 0,
        "max_buckets": 50,
        "auid": 0,
        "subusers": [],
        "keys": [
            {
                "user": "Testuser010",
                "access_key": "DZYG1N5L96QZ2I1T7AAA",
                "secret_key": "5zrxnewNvNJLkdxAusLJvmijx0RMJvN11XNZOaRx"
            }
        ],
        ...
    }
    
    3. List the metadata entries of a given type (bucket)
    # radosgw-admin metadata list bucket
    [
        "test_1_bucket",
        "test_2_bucket"
    ]
    
    List the buckets (similar to the command above)
    # radosgw-admin bucket list
    [
        "test_1_bucket",
        "test_2_bucket"
    ]
    
    Get one specific metadata entry of a given type (bucket)
    # radosgw-admin metadata get bucket:test_1_bucket
    {
        "key": "bucket:test_1_bucket",
        "ver": {
            "tag": "_XdBv9-vGPimZJup6s9yszc0",
            "ver": 1
        },
        ...
    }
    
    Get the bucket metadata (similar to the command above)
    # radosgw-admin bucket stats --bucket test_1_bucket
    {
        "bucket": "test_1_bucket",
        "pool": "default.rgw.buckets.data",
        "index_pool": "default.rgw.buckets.index",
        "usage": {
            "rgw.main": {
                "size_kb": 0,
                "size_kb_actual": 0,
                "num_objects": 0
            },
            "rgw.multimeta": {
                "size_kb": 0,
                "size_kb_actual": 0,
                "num_objects": 0
            }
        },
        ...
    }
    
    4. List the metadata entries of a given type (bucket.instance, keyed as bucket:markerid)
    # radosgw-admin metadata list bucket.instance
    [
        "test_1_bucket:28570f88-31e7-4166-b6d7-d22903cead75.41856.22",
        "test_2_bucket:28570f88-31e7-4166-b6d7-d22903cead75.54954.38"
    ]
    
    # radosgw-admin metadata get bucket.instance:test_1_bucket:28570f88-31e7-4166-b6d7-d22903cead75.41856.22
    {
        "key": "bucket.instance:test_1_bucket:28570f88-31e7-4166-b6d7-d22903cead75.41856.22",
        "ver": {
            "tag": "_YE08tNomA4YnZsO7rhytSGm",
            "ver": 1
        }
    }
    
    Note: the output of metadata get bucket.instance contains almost everything returned by metadata get bucket, and almost everything returned by bucket stats, except the bucket usage statistics.
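The keys printed by the metadata commands above follow a type:name pattern, and bucket.instance keys additionally carry the marker id. The following is a minimal Python sketch of splitting such a key. The function name parse_metadata_key is hypothetical, and the key format is inferred only from the outputs shown above.

```python
def parse_metadata_key(key):
    """Split a key such as 'bucket.instance:<bucket>:<marker>' into its parts."""
    mtype, _, rest = key.partition(":")
    if mtype == "bucket.instance":
        # Assumption: the bucket name itself contains no ':' character.
        bucket, _, marker = rest.partition(":")
        return {"type": mtype, "bucket": bucket, "marker": marker}
    return {"type": mtype, "name": rest}


print(parse_metadata_key(
    "bucket.instance:test_1_bucket:28570f88-31e7-4166-b6d7-d22903cead75.41856.22"
))
```

The marker extracted this way is the same id that prefixes the bucket's index and data objects in the later sections.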
    

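3. bucket.index

The bucket.index objects are stored in the .rgw.buckets.index pool (default.rgw.buckets.index in the listings here) as rados objects whose names start with ".dir.". If sharding is enabled, one bucket maps to multiple rados objects. The shards can be seen in the "max_marker": "0#,1#,2#,3#,4#" field returned by bucket stats.

bucket.index maintains a k-v map, i.e. an omap: the key is the object name and the value is the object's basic metadata.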

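1. List the rados objects in the index pool. Each name is composed of .dir. + markerid + shardid (the buckets listed below are not sharded, so their names carry no shardid suffix)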
# rados ls -p default.rgw.buckets.index
.dir.28570f88-31e7-4166-b6d7-d22903cead75.207816.5
.dir.28570f88-31e7-4166-b6d7-d22903cead75.94720.6
.dir.28570f88-31e7-4166-b6d7-d22903cead75.94567.4
.dir.28570f88-31e7-4166-b6d7-d22903cead75.54360.9
.dir.28570f88-31e7-4166-b6d7-d22903cead75.165570.3
.dir.28570f88-31e7-4166-b6d7-d22903cead75.54885.16
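
As an illustration of the naming rule, here is a small Python sketch that builds the index object names for a bucket and counts shards from a max_marker string. Both function names are hypothetical, and the ".dir.<marker>.<shard>" form for sharded buckets is an assumption that the listing above (unsharded buckets only) does not confirm.

```python
def index_object_names(marker, num_shards=0):
    """Return the expected index object names for one bucket."""
    if num_shards <= 0:
        # Unsharded bucket: a single object, as in the listing above.
        return [".dir." + marker]
    # Assumption: sharded buckets append ".<shard id>" to the name.
    return [".dir.%s.%d" % (marker, shard) for shard in range(num_shards)]


def shard_count_from_max_marker(max_marker):
    """Count the '<shard>#...' entries in a bucket stats max_marker string."""
    return len([entry for entry in max_marker.split(",") if "#" in entry])


print(index_object_names("28570f88-31e7-4166-b6d7-d22903cead75.207816.5"))
print(shard_count_from_max_marker("0#,1#,2#,3#,4#"))
```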

2. List the objects in a bucket
# rados listomapkeys .dir.28570f88-31e7-4166-b6d7-d22903cead75.14416.4 -p default.rgw.buckets.index 
2M.log
_multipart_aaa.txt.2~3ROegUxcXcW7m_ON_xgp6eJa5VwHP_I.meta
_multipart_aaa.txt.2~s6Cx2uosNOtLuCMVrhpGL2-A-tlNKMh.meta
_multipart_aaa.txt.2~uNPesd6EYO-UHwpoItzgU46NVpf-7IT.meta
aaaa5.txt
helloworld_10
upload-file1

3. Get the objects' metadata (this shows only the fixed metadata kept in the index, not the object itself)
# rados listomapvals .dir.28570f88-31e7-4166-b6d7-d22903cead75.14416.4 -p default.rgw.buckets.index
2M.log
value (269 bytes) :
00000000  08 03 07 01 00 00 06 00  00 00 32 4d 2e 6c 6f 67  |..........2M.log|
00000010  39 00 00 00 00 00 00 00  01 05 03 9b 00 00 00 01  |9...............|
00000020  58 00 20 00 00 00 00 00  32 c9 09 5e 2e b3 f6 0c  |X. .....2..^....|
00000030  20 00 00 00 30 32 61 63  37 30 64 38 64 31 34 31  | ...02ac70d8d141|
00000040  66 61 38 65 64 66 37 63  30 33 62 62 30 31 39 61  |fa8edf7c03bb019a|
00000050  30 35 37 64 20 00 00 00  30 42 33 45 35 35 36 31  |057d ...0B3E5561|
00000060  44 39 44 31 34 36 42 39  42 41 30 37 41 37 33 44  |D9D146B9BA07A73D|
00000070  35 42 30 38 42 36 31 41  24 00 00 00 61 70 70 5f  |5B08B61A$...app_|
00000080  30 42 33 45 35 35 36 31  44 39 44 31 34 36 42 39  |0B3E5561D9D146B9|
00000090  42 41 30 37 41 37 33 44  35 42 30 38 42 36 31 41  |BA07A73D5B08B61A|
000000a0  0a 00 00 00 74 65 78 74  2f 70 6c 61 69 6e 58 00  |....text/plainX.|
000000b0  20 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  | ...............|
000000c0  00 00 01 01 02 00 00 00  17 39 81 b9 2f 00 00 00  |.........9../...|
000000d0  32 38 35 37 30 66 38 38  2d 33 31 65 37 2d 34 31  |28570f88-31e7-41|
000000e0  36 36 2d 62 36 64 37 2d  64 32 32 39 30 33 63 65  |66-b6d7-d22903ce|
000000f0  61 64 37 35 2e 32 38 36  30 39 38 2e 32 39 30 00  |ad75.286098.290.|
00000100  00 00 00 00 00 00 00 00  00 00 00 00 00           |.............|
0000010d

_multipart_aaa.txt.2~3ROegUxcXcW7m_ON_xgp6eJa5VwHP_I.meta
value (284 bytes) :
00000000  08 03 16 01 00 00 39 00  00 00 5f 6d 75 6c 74 69  |......9..._multi|
00000010  70 61 72 74 5f 61 61 61  2e 74 78 74 2e 32 7e 33  |part_aaa.txt.2~3|
00000020  52 4f 65 67 55 78 63 58  63 57 37 6d 5f 4f 4e 5f  |ROegUxcXcW7m_ON_|
00000030  78 67 70 36 65 4a 61 35  56 77 48 50 5f 49 2e 6d  |xgp6eJa5VwHP_I.m|
00000040  65 74 61 cd 03 00 00 00  00 00 00 01 05 03 84 00  |eta.............|
00000050  00 00 03 00 00 00 00 00  00 00 00 84 12 c5 5d 93  |..............].|
00000060  ac 07 3b 00 00 00 00 20  00 00 00 30 42 33 45 35  |..;.... ...0B3E5|
00000070  35 36 31 44 39 44 31 34  36 42 39 42 41 30 37 41  |561D9D146B9BA07A|
00000080  37 33 44 35 42 30 38 42  36 31 41 24 00 00 00 61  |73D5B08B61A$...a|
00000090  70 70 5f 30 42 33 45 35  35 36 31 44 39 44 31 34  |pp_0B3E5561D9D14|
000000a0  36 42 39 42 41 30 37 41  37 33 44 35 42 30 38 42  |6B9BA07A73D5B08B|
000000b0  36 31 41 13 00 00 00 62  69 6e 61 72 79 2f 6f 63  |61A....binary/oc|
000000c0  74 65 74 2d 73 74 72 65  61 6d 00 00 00 00 00 00  |tet-stream......|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 01 01  |................|
000000e0  04 00 00 00 19 82 cd 03  81 8c 20 00 00 00 5f 6e  |.......... ..._n|
000000f0  44 37 4c 79 79 4e 6c 42  32 4f 73 36 72 57 63 4c  |D7LyyNlB2Os6rWcL|
00000100  5a 69 58 49 79 7a 6d 37  57 2d 35 4f 6c 73 00 00  |ZiXIyzm7W-5Ols..|
00000110  00 00 00 00 00 00 00 00  00 00 00 00              |............|
0000011c
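
The first bytes of each omap value can be read by hand. The Python sketch below assumes the value starts with a versioned-encoding header (one byte version, one byte compat version, a little-endian 32-bit payload length) followed by a length-prefixed string holding the object name. This is an assumption about the layout, but it agrees with both dumps above: the payload lengths 0x0107 and 0x0116 plus the 6 header bytes give 269 and 284 bytes. The function name parse_entry_prefix is hypothetical, and the rest of the entry is not decoded.

```python
import struct


def parse_entry_prefix(raw):
    """Read the header and the first string of an index omap value."""
    # Header: version (u8), compat version (u8), payload length (u32, little-endian).
    struct_v, struct_compat, struct_len = struct.unpack_from("<BBI", raw, 0)
    # First field: u32 little-endian length followed by the name bytes.
    (name_len,) = struct.unpack_from("<I", raw, 6)
    name = raw[10:10 + name_len].decode("utf-8")
    return struct_v, struct_compat, struct_len, name


# First 16 bytes of the 2M.log value shown above.
first_line = bytes.fromhex("08 03 07 01 00 00 06 00 00 00 32 4d 2e 6c 6f 67")
print(parse_entry_prefix(first_line))  # (8, 3, 263, '2M.log')
```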

4. data

4.1. RGW POOL

Ceph uses dedicated pools, each holding a specific type of data. Pool names follow the pattern {zone}.rgw.{functions}

# rados lspools
.rgw.root						# stores zone, realm and period information
default.rgw.control				# provides the watch-notify mechanism that keeps caches consistent
default.rgw.data.root			
default.rgw.gc					# garbage collection
default.rgw.log					# various logs
default.rgw.intent-log
default.rgw.usage
default.rgw.users.keys
default.rgw.users.email
default.rgw.users.swift
default.rgw.users.uid			# user uids
default.rgw.buckets.index
default.rgw.buckets.data		# data pool
default.rgw.meta				# metadata pool
default.rgw.buckets.non-ec		# temporary data produced during multipart uploads is kept here
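
To make the {zone}.rgw.{functions} pattern concrete, here is a minimal Python sketch that splits a pool name into its zone and function parts. The helper name split_pool_name is hypothetical. Note that .rgw.root has an empty zone part.

```python
def split_pool_name(pool):
    """Split '<zone>.rgw.<function>' into (zone, function)."""
    zone, sep, function = pool.partition(".rgw.")
    if not sep:
        raise ValueError("not an RGW pool name: %r" % pool)
    return zone, function


for pool in (".rgw.root", "default.rgw.buckets.data", "default.rgw.users.uid"):
    print(pool, "->", split_pool_name(pool))
```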

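4.2. RGW Object

An RGW object is made up of one or more rados objects. Logically it is divided into a head and a tail.

The composition of the head is shown in the figure below.

[Figure: composition of the head]

Notes on the tail

1. An rgw object smaller than rgw_max_chunk_size has no tail
2. For an object uploaded via multipart, the tail consists of multipart objects (the first rados object of each part) and shadow objects
3. The tail is split into several parts, each of size rgw_obj_stripe_size by default; see the multipart upload section (4.2.2)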
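
The head/tail split can be sketched as simple arithmetic. The Python example below assumes a head of at most 512 KiB and tail stripes of at most 4 MiB, which are the sizes observed for the 5 MiB object in section 4.2.1; the actual values depend on the cluster's rgw_max_chunk_size and rgw_obj_stripe_size settings. The function name simple_upload_layout is hypothetical.

```python
def simple_upload_layout(size, head_size=512 * 1024, stripe_size=4 * 1024 * 1024):
    """Return the sizes of the head and tail rados objects for a simple upload."""
    sizes = [min(size, head_size)]          # the head comes first
    remaining = size - sizes[0]
    while remaining > 0:                    # then the tail, stripe by stripe
        chunk = min(remaining, stripe_size)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


print(simple_upload_layout(5 * 1024 * 1024))  # [524288, 4194304, 524288]
```

An object no larger than the head size yields a single-element list, which corresponds to note 1: no tail.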
4.2.1. Simple upload
(1) Upload the file in one piece
Create a bucket
# s3cmd mb s3://steve

Create the file (bs=1m is the BSD/macOS dd spelling; GNU dd expects bs=1M)
# dd if=/dev/zero of=./5MB bs=1m count=5

Upload the file
# s3cmd put 5MB s3://steve
(2) Locate the head
Find the bucket marker id (the object info also contains the prefix, manifest, rules, etc.)
# radosgw-admin object stat --bucket=steve --object=5MB | grep marker
                "marker": "28570f88-31e7-4166-b6d7-d22903cead75.286503.1",
            "marker": "28570f88-31e7-4166-b6d7-d22903cead75.286503.1",

Determine the head object's name: assemble it as markerid + _ + object name, or look it up in the data pool
For example: 28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB
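
The naming rule above is easy to script. This Python sketch builds the head object name and the matching rados stat argument list; the helper names are hypothetical, and the rule is only the one stated above for plain object names (names that need escaping are not handled).

```python
def head_object_name(marker, object_name):
    """Head rados object name: <marker>_<object name>."""
    return marker + "_" + object_name


def rados_stat_args(pool, marker, object_name):
    """Argument list for 'rados -p <pool> stat <head object>'."""
    return ["rados", "-p", pool, "stat", head_object_name(marker, object_name)]


print(" ".join(rados_stat_args(
    "default.rgw.buckets.data",
    "28570f88-31e7-4166-b6d7-d22903cead75.286503.1",
    "5MB",
)))
```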

Check the head chunk's info (size)
# rados -p default.rgw.buckets.data stat 28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB
default.rgw.buckets.data/28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB mtime 2020-04-04 16:49:57.000000, size 524288

Find where the head is stored (pg and osd)
# ceph osd map default.rgw.buckets.data 28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB
osdmap e1735 pool 'default.rgw.buckets.data' (23) object '28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB' -> pg 23.fa94ccae (23.ae) -> up ([0,4], p0) acting ([0,4], p0)
This gives pg 23.ae, with primary osd p0 (osd.0)
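
When this lookup has to be repeated for many objects, the pg and primary osd can be pulled out of the ceph osd map line with a regular expression. The Python sketch below is written against the exact output format shown above; other Ceph versions may print the line differently, and the function name parse_osd_map is hypothetical.

```python
import re

# Matches: -> pg <raw pg> (<pg>) -> up ([<osds>], p<primary>)
OSD_MAP_RE = re.compile(
    r"-> pg (?P<raw>\S+) \((?P<pg>\S+)\) -> up \(\[(?P<up>[\d,]+)\], p(?P<primary>\d+)\)"
)


def parse_osd_map(line):
    """Extract the pg id, the up set and the primary osd from one output line."""
    match = OSD_MAP_RE.search(line)
    if match is None:
        raise ValueError("unexpected ceph osd map output")
    return {
        "pg": match.group("pg"),
        "up": [int(osd) for osd in match.group("up").split(",")],
        "primary": int(match.group("primary")),
    }


line = ("osdmap e1735 pool 'default.rgw.buckets.data' (23) object "
        "'28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB' -> pg 23.fa94ccae "
        "(23.ae) -> up ([0,4], p0) acting ([0,4], p0)")
print(parse_osd_map(line))
```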

Find the host of the osd
# ceph osd find 0
{
    "osd": 0,
    "ip": "192.168.1.113:6802\/11331",
    "crush_location": {
        "host": "node-1",
        "rack": "rack-01",
        "root": "default"
    }
}

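Log in to that host and find the osd path according to its configuration
For example: /var/lib/ceph/osd/ceph-0/current

From current, enter the pg head directory
For example: /var/lib/ceph/osd/ceph-0/current/23.ae_head

Locate the head object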
# ll -h | grep 5MB
-rw-r--r-- 1 root root 512K Apr  4 16:49 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\u5MB__head_FA94CCAE__17
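
The on-disk file name encodes more than the object name. The Python sketch below splits it into the rados object name (undoing the \u escape for underscores), the object hash, and the pool id. It assumes the trailing field is the pool id in hexadecimal (0x17 = 23 matches the pool shown by ceph osd map) and that pg_num is 256, so the pg is the low 8 bits of the hash; both are assumptions inferred from the outputs in this section. Other escape sequences are not handled, and the function names are hypothetical.

```python
import re

# <escaped name>__<snap>_<8 hex digits of hash>__<pool id in hex>
FILE_RE = re.compile(
    r"^(?P<name>.+)__(?P<snap>[a-z0-9]+)_(?P<hash>[0-9A-F]{8})__(?P<pool>[0-9a-f]+)$"
)


def parse_filestore_name(filename):
    """Split an on-disk file name into object name, hash and pool id."""
    match = FILE_RE.match(filename)
    if match is None:
        raise ValueError("unrecognized file name: %r" % filename)
    return {
        "object": match.group("name").replace("\\u", "_"),  # \u stands for '_'
        "hash": int(match.group("hash"), 16),
        "pool": int(match.group("pool"), 16),
    }


def pg_id(info, pg_num=256):
    """Assumption: pg_num is a power of two, so the pg is the low bits of the hash."""
    return "%d.%x" % (info["pool"], info["hash"] & (pg_num - 1))


info = parse_filestore_name(
    r"28570f88-31e7-4166-b6d7-d22903cead75.286503.1\u5MB__head_FA94CCAE__17"
)
print(info["object"], pg_id(info))
```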
(3) Reassemble the object
Look up the prefix (the name prefix of the chunk objects)
# radosgw-admin object stat --bucket=steve --object=5MB | grep prefix
        "prefix": ".2NefTZEpClml_iwHAPmCgPLEH6yBUOm_",
                    "override_prefix":

Determine the chunk object names
# rados ls -p default.rgw.buckets.data | grep .2NefTZEpClml_iwHAPmCgPLEH6yBUOm_
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__shadow_.2NefTZEpClml_iwHAPmCgPLEH6yBUOm_1
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__shadow_.2NefTZEpClml_iwHAPmCgPLEH6yBUOm_2
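
The two names above follow a regular pattern: marker, then __shadow_, then the prefix, then a counter starting at 1. A minimal Python sketch of generating them, with the hypothetical helper shadow_object_names and the pattern inferred only from this listing:

```python
def shadow_object_names(marker, prefix, count):
    """Names of the tail (shadow) rados objects of a simple upload, in order."""
    return ["%s__shadow_%s%d" % (marker, prefix, i) for i in range(1, count + 1)]


for name in shadow_object_names(
    "28570f88-31e7-4166-b6d7-d22903cead75.286503.1",
    ".2NefTZEpClml_iwHAPmCgPLEH6yBUOm_",
    2,
):
    print(name)
```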

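Determine the pg, osd and head directory
For example: /var/lib/ceph/osd/ceph-8/current/23.4f_head

Locate the chunk objects (underscores are escaped as \u in the on-disk file names, so take care when grepping for the prefix)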
# ll -h | grep 2NefTZEpClml
-rw-r--r-- 1 root root 4.0M Apr  4 16:49 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\u\ushadow\u.2NefTZEpClml\uiwHAPmCgPLEH6yBUOm\u1__head_ECB7124F__17

Reassemble the object (copy the three chunks into the same directory)
# ll | grep 28570f88
-rw-r--r-- 1 root root   524288 Apr  8 11:53 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\u5MB__head_FA94CCAE__17
-rw-r--r-- 1 root root  4194304 Apr  8 12:26 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\u\ushadow\u.2NefTZEpClml\uiwHAPmCgPLEH6yBUOm\u1__head_ECB7124F__17
-rw-r--r-- 1 root root   524288 Apr  8 12:30 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\u\ushadow\u.2NefTZEpClml\uiwHAPmCgPLEH6yBUOm\u2__head_4FA65A70__17

Write the head chunk
# cat 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\\u5MB__head_FA94CCAE__17 > 5m.manual

Append the second chunk
# cat 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\\u\\ushadow\\u.2NefTZEpClml\\uiwHAPmCgPLEH6yBUOm\\u1__head_ECB7124F__17 >> 5m.manual

Append the third chunk
# cat 28570f88-31e7-4166-b6d7-d22903cead75.286503.1\\u\\ushadow\\u.2NefTZEpClml\\uiwHAPmCgPLEH6yBUOm\\u2__head_4FA65A70__17 >> 5m.manual

Compare the md5 checksums
# md5sum 5MB 
5f363e0e58a95f06cbe9bbc662c5dfb6  5MB
# md5sum 5m.manual 
5f363e0e58a95f06cbe9bbc662c5dfb6  5m.manual
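
The same check can be done without writing an intermediate file. The Python sketch below feeds the chunk files into one MD5 digest in order, which is equivalent to concatenating them and running md5sum on the result. The function name md5_of_chunks is hypothetical, and the caller is responsible for passing the paths in head, shadow_1, shadow_2 order.

```python
import hashlib


def md5_of_chunks(paths, block_size=1 << 20):
    """MD5 of the concatenation of the given files, read in the given order."""
    digest = hashlib.md5()
    for path in paths:
        with open(path, "rb") as handle:
            # Read in blocks so large chunks are not loaded into memory at once.
            for block in iter(lambda: handle.read(block_size), b""):
                digest.update(block)
    return digest.hexdigest()
```

Calling md5_of_chunks with the three copied chunk files should give the same digest as md5sum on the original 5MB file.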
(4) Get the object's attributes
List the object's attributes (they are stored in the head)
# rados listxattr 28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB -p default.rgw.buckets.data
user.rgw.acl						#type RGWAccessControlPolicy 
user.rgw.content_type
user.rgw.etag
user.rgw.idtag
user.rgw.manifest					#type RGWObjManifest 
user.rgw.pg_ver
user.rgw.source_zone
user.rgw.x-amz-content-sha256
user.rgw.x-amz-date
user.rgw.x-amz-meta-s3cmd-attrs
user.rgw.x-amz-storage-class

Get a specific attribute
# rados getxattr 28570f88-31e7-4166-b6d7-d22903cead75.286503.1_5MB -p default.rgw.buckets.data user.rgw.acl > /tmp/user.rgw.acl.xattr

Decode it into JSON
# ceph-dencoder type RGWAccessControlPolicy import /tmp/user.rgw.acl.xattr decode dump_json
{
    "acl": {
        "acl_user_map": [
            {
                "user": "Testuser010",
                "acl": 15
            }
        ],
        "acl_group_map": [],
        "grant_map": [
            {
                "id": "Testuser010",
                "grant": {
                    "type": {
                        "type": 0
                    },
                    "id": "Testuser010",
                    "email": "",
                    "permission": {
                        "flags": 15
                    },
                    "name": "Testuser010",
                    "group": 0
                }
            }
        ]
    },
    "owner": {
        "id": "Testuser010",
        "display_name": "Testuser010"
    }
}
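
The "flags": 15 value in the dump is a bitmask. The Python sketch below decodes it under the assumption that the bits are READ = 1, WRITE = 2, READ_ACP = 4 and WRITE_ACP = 8, as I understand RGW's permission constants; verify against the RGW source for your version before relying on it. The function name decode_acl_flags is hypothetical.

```python
# Assumed permission bits (not taken from the output above).
RGW_PERMS = [
    (0x01, "READ"),
    (0x02, "WRITE"),
    (0x04, "READ_ACP"),
    (0x08, "WRITE_ACP"),
]


def decode_acl_flags(flags):
    """Return the names of the permission bits set in an ACL flags value."""
    return [name for bit, name in RGW_PERMS if flags & bit]


print(decode_acl_flags(15))  # ['READ', 'WRITE', 'READ_ACP', 'WRITE_ACP']
```

Under this assumption, 15 means all four bits are set, i.e. full control for Testuser010.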

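4.2.2. Multipart upload
(1) Reassemble the parts automatically
Collect the names of all the chunks into a file (the chunks must be put in the correct order)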
# cat part-obj 
28570f88-31e7-4166-b6d7-d22903cead75.286503.1_20MB
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__multipart_20MB.2~qM2NidqmbSkaAdP61A-ICWYlMndpmcy.1
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__shadow_20MB.2~qM2NidqmbSkaAdP61A-ICWYlMndpmcy.1_1
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__shadow_20MB.2~qM2NidqmbSkaAdP61A-ICWYlMndpmcy.1_2
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__shadow_20MB.2~qM2NidqmbSkaAdP61A-ICWYlMndpmcy.1_3
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__multipart_20MB.2~qM2NidqmbSkaAdP61A-ICWYlMndpmcy.2
28570f88-31e7-4166-b6d7-d22903cead75.286503.1__shadow_20MB.2~qM2NidqmbSkaAdP61A-ICWYlMndpmcy.2_1

Reassemble automatically
# while read -r i; do rados -p default.rgw.buckets.data get "$i" "$i"; cat "$i" >> 20MB.manual; done < part-obj

Compare the md5 checksums
# md5sum 20MB
8f4e33f3dc3e414ff94e5fb6905cba8c  20MB
# md5sum 20MB.manual 
8f4e33f3dc3e414ff94e5fb6905cba8c  20MB.manual
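
The manual ordering step for part-obj can be automated. The Python sketch below sorts chunk names so that the head comes first and, within each part, the __multipart_ object precedes its __shadow_ objects in stripe order. The regular expression is derived only from the names listed in part-obj above, and the function name chunk_sort_key is hypothetical.

```python
import re

# ...__multipart_<obj>.<upload id>.<part>  or  ...__shadow_<obj>.<upload id>.<part>_<stripe>
PART_RE = re.compile(r"__(multipart|shadow)_.+\.(\d+)(?:_(\d+))?$")


def chunk_sort_key(name):
    """Sort key: (part number, stripe number); the head sorts first as (0, 0)."""
    match = PART_RE.search(name)
    if match is None:
        return (0, 0)  # head object: no __multipart_ or __shadow_ marker
    part = int(match.group(2))
    stripe = int(match.group(3)) if match.group(3) else 0
    return (part, stripe)


marker = "28570f88-31e7-4166-b6d7-d22903cead75.286503.1"
upload = "20MB.2~qM2NidqmbSkaAdP61A-ICWYlMndpmcy"
unordered = [
    marker + "__shadow_" + upload + ".1_2",
    marker + "__multipart_" + upload + ".2",
    marker + "_20MB",
    marker + "__shadow_" + upload + ".1_1",
    marker + "__multipart_" + upload + ".1",
]
for name in sorted(unordered, key=chunk_sort_key):
    print(name)
```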