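Introduction
Snapshots are a key feature of Ceph RBD. For the snapshot metadata, see the article "Ceph RBD: Snapshots + Differences Between Bluestore and Filestore Snapshots".
The snapshot ID is an important attribute of a snapshot. This article explains how snapshot IDs are generated and what they mean.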
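Conclusions
- snap_seq
  - A snapshot ID is derived from the snap_seq recorded on the pool; snap_seq is a monotonically increasing number.
  - snap_seq is persisted in the osdmap.
  - When a snapshot is created, the client obtains the snap_seq recorded in the osdmap through the monitor.
  - snap_seq effectively starts at 1: it is 0 on a pool that has never had a snapshot and is set to 1 on the first self-managed snapshot allocation.
- snap_seq increments
  - snap_seq is incremented by 1 when rbd creates a snapshot.
  - snap_seq is incremented by 1 when rbd removes a snapshot.
  - When the first image is created in a pool, the pool validation step (ValidatePoolRequest) creates a snapshot and then removes it. That snapshot takes ID 2 and its removal moves snap_seq to 3, so the first snapshot a user creates has snapid 4.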
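The arithmetic behind the "first snapshot ID is 4" conclusion can be checked with a small model. This is a minimal sketch, not Ceph code: `PoolModel` and `first_user_snap_id` are hypothetical names, and the model tracks only the snap_seq counter, following the increment rules listed above.

```cpp
#include <cstdint>

// Hypothetical stand-in for pg_pool_t that tracks only snap_seq.
struct PoolModel {
  std::uint64_t snap_seq = 0;

  // Allocation rule: the first use moves snap_seq from 0 to 1,
  // then every allocation returns snap_seq + 1.
  constexpr std::uint64_t add_unmanaged_snap() {
    if (snap_seq == 0) {
      snap_seq = 1;
    }
    snap_seq = snap_seq + 1;
    return snap_seq;
  }

  // Removal rule: removing any snapshot also bumps snap_seq by 1.
  constexpr void remove_unmanaged_snap() {
    snap_seq = snap_seq + 1;
  }
};

// ID of the first user-created snapshot in a fresh pool, assuming pool
// validation has created and removed one snapshot beforehand.
constexpr std::uint64_t first_user_snap_id() {
  PoolModel pool{};
  pool.add_unmanaged_snap();         // validation snapshot gets ID 2
  pool.remove_unmanaged_snap();      // snap_seq becomes 3
  return pool.add_unmanaged_snap();  // first user snapshot gets ID 4
}
```

Under these assumptions `first_user_snap_id()` evaluates to 4, matching the behaviour described above.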
Snapshot ID code walkthrough
snap_seq
struct pg_pool_t {
snapid_t snap_seq; ///< seq for per-pool snapshot
};
class OSDMap {
mempool::osdmap::map<int64_t,pg_pool_t> pools;
};
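Snapshot creation flow
- librbd: the snapshot creation flow calls the librados interface to allocate a snapid.
- librados: allocates the snapshot ID by calling the Objecter interface.
- osdc Objecter: sends a pool op message to the monitor to obtain the snapid.
- monitor: looks up the pool in the osdmap, increments snap_seq by 1, and returns the new value as the snapid.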
void SnapshotCreateRequest<I>::send_allocate_snap_id() {
...
image_ctx.data_ctx.aio_selfmanaged_snap_create(&m_snap_id, rados_completion);
...
}
void librados::IoCtxImpl::aio_selfmanaged_snap_create(uint64_t *snapid, AioCompletionImpl *c)
{
...
int r = objecter->allocate_selfmanaged_snap(poolid, &onfinish->snapid,
onfinish);
...
}
int Objecter::allocate_selfmanaged_snap(int64_t pool, snapid_t *psnapid, Context *onfinish)
{
...
PoolOp *op = new PoolOp;
...
op->pool_op = POOL_OP_CREATE_UNMANAGED_SNAP; // op type is POOL_OP_CREATE_UNMANAGED_SNAP
pool_ops[op->tid] = op;
pool_op_submit(op);
return 0;
}
bool OSDMonitor::prepare_pool_op(MonOpRequestRef op)
{
...
case POOL_OP_CREATE_UNMANAGED_SNAP:
{
uint64_t snapid;
pp.add_unmanaged_snap(snapid);
encode(snapid, reply_data);
changed = true;
}
break;
...
}
void pg_pool_t::add_unmanaged_snap(uint64_t& snapid)
{
ceph_assert(!is_pool_snaps_mode());
if (snap_seq == 0) {
// kludge for pre-mimic tracking of pool vs selfmanaged snaps. after
// mimic this field is not decoded but our flag is set; pre-mimic, we
// have a non-empty removed_snaps to signify a non-pool-snaps pool.
removed_snaps.insert(snapid_t(1));
snap_seq = 1; // set to 1 on first use
}
flags |= FLAG_SELFMANAGED_SNAPS;
snapid = snap_seq = snap_seq + 1; // increment by 1 and return the new value
}
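Snapshot removal flow
- librbd: the snapshot removal flow calls the librados interface to release the snapid.
- librados: releases the snapshot ID by calling the Objecter interface.
- osdc Objecter: sends a POOL_OP_DELETE_UNMANAGED_SNAP pool op message to the monitor.
- monitor: looks up the pool in the osdmap, records the snapid in removed_snaps, and increments snap_seq by 1.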
template <typename I>
void SnapshotRemoveRequest<I>::release_snap_id() {
...
image_ctx.data_ctx.aio_selfmanaged_snap_remove(m_snap_id, aio_comp);
aio_comp->release();
}
int Objecter::delete_selfmanaged_snap(int64_t pool, snapid_t snap,
Context *onfinish)
{
PoolOp *op = new PoolOp;
if (!op) return -ENOMEM;
op->tid = ++last_tid;
op->pool = pool;
op->onfinish = onfinish;
op->pool_op = POOL_OP_DELETE_UNMANAGED_SNAP;
op->snapid = snap;
pool_ops[op->tid] = op;
pool_op_submit(op);
return 0;
}
bool OSDMonitor::prepare_pool_op(MonOpRequestRef op)
{
...
case POOL_OP_DELETE_UNMANAGED_SNAP:
if (!pp.is_removed_snap(m->snapid)) {
if (m->snapid > pp.get_snap_seq()) {
_pool_op_reply(op, -ENOENT, osdmap.get_epoch());
return false;
}
pp.remove_unmanaged_snap(m->snapid);
pending_inc.new_removed_snaps[m->pool].insert(m->snapid);
changed = true;
}
break;
...
}
void pg_pool_t::remove_unmanaged_snap(snapid_t s)
{
ceph_assert(is_unmanaged_snaps_mode());
removed_snaps.insert(s);
snap_seq = snap_seq + 1;
// try to add in the new seq, just to try to keep the interval_set contiguous
if (!removed_snaps.contains(get_snap_seq())) {
removed_snaps.insert(get_snap_seq());
}
}
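The monitor-side removal logic above has two properties that are easy to miss: removing an already removed snapshot is a no-op that does not bump snap_seq, and a snapid larger than snap_seq is rejected with -ENOENT. The following is a minimal sketch of that behaviour, not Ceph code. `PoolState` and `handle_delete` are hypothetical names, and `std::set` is an assumed simplification of Ceph's interval_set.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <set>

// Hypothetical stand-in for the pool fields touched by snapshot removal.
struct PoolState {
  bool selfmanaged_snaps = true;          // models is_unmanaged_snaps_mode()
  std::uint64_t snap_seq = 0;
  std::set<std::uint64_t> removed_snaps;  // simplification of interval_set
};

// Follows pg_pool_t::remove_unmanaged_snap from the excerpt above.
void remove_unmanaged_snap(PoolState& pool, std::uint64_t snapid) {
  assert(pool.selfmanaged_snaps);
  pool.removed_snaps.insert(snapid);
  pool.snap_seq = pool.snap_seq + 1;
  // The new seq is recorded as removed too; inserting into a std::set is
  // already a no-op when the value is present.
  pool.removed_snaps.insert(pool.snap_seq);
}

// Follows the POOL_OP_DELETE_UNMANAGED_SNAP branch: returns 0 or -ENOENT.
int handle_delete(PoolState& pool, std::uint64_t snapid) {
  if (pool.removed_snaps.count(snapid) == 0) {
    if (snapid > pool.snap_seq) {
      return -ENOENT;  // this ID was never allocated
    }
    remove_unmanaged_snap(pool, snapid);
  }
  return 0;  // already removed snapshots fall through without changes
}
```

In this model, with snap_seq at 14, `handle_delete(pool, 14)` returns 0 and moves snap_seq to 15. Repeating the call returns 0 and leaves snap_seq at 15, and `handle_delete(pool, 99)` returns -ENOENT without touching the pool.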
Creating the first image
template <typename I>
void ValidatePoolRequest<I>::handle_read_rbd_info(int r)
{
if (r >= 0)
{
bufferlist validated_bl;
validated_bl.append(OVERWRITE_VALIDATED);
bufferlist validate_bl;
validate_bl.append(VALIDATE);
if (m_out_bl.contents_equal(validated_bl)) // check whether an image has already been created in this pool
{
// already validated pool
finish(0);
return;
}
else if (m_out_bl.contents_equal(validate_bl))
{
// implies snapshot was already successfully created
overwrite_rbd_info();
return;
}
}
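// If no image has been created yet, create a snapshot and then remove it to
// verify that the metadata store works; metadata requires a replicated pool.
create_snapshot();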
}
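The three outcomes of handle_read_rbd_info can be summarised as a small decision function. This is a sketch under stated assumptions, not the librbd implementation. `NextStep` and `next_step` are hypothetical names, and the two marker strings are assumed values standing in for OVERWRITE_VALIDATED and VALIDATE, whose definitions are not shown in the excerpt.

```cpp
#include <string_view>

// Assumed marker contents; the excerpt above does not show the real values.
constexpr std::string_view kOverwriteValidated = "overwrite validated";
constexpr std::string_view kValidate = "validate";

enum class NextStep { Finish, OverwriteRbdInfo, CreateSnapshot };

// r is the result of reading the rbd_info object, rbd_info its contents.
constexpr NextStep next_step(int r, std::string_view rbd_info) {
  if (r >= 0) {
    if (rbd_info == kOverwriteValidated) {
      return NextStep::Finish;            // pool already validated
    }
    if (rbd_info == kValidate) {
      return NextStep::OverwriteRbdInfo;  // snapshot step already succeeded
    }
  }
  // Read failed or contents unknown: run the snapshot create/remove check,
  // which is what consumes snapshot IDs on a fresh pool.
  return NextStep::CreateSnapshot;
}
```

Only the `CreateSnapshot` path consumes snapshot IDs, which is why the ID gap appears once per pool rather than once per image.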
Hands-on verification
Creating a snapshot and removing a snapshot each increment snap_seq by 1:
[root@localhost ~]# ceph osd dump --format=json | python -m json.tool | grep snap_seq
"snap_seq": 13,
[root@localhost ~]# rbd snap create blockpool0/vol_12@vol_12_snap_1
[root@localhost ~]# ceph osd dump --format=json | python -m json.tool | grep snap_seq
"snap_seq": 14,
[root@localhost ~]# rbd snap rm blockpool0/vol_12@vol_12_snap_1
Removing snap: 100% complete...done.
[root@localhost ~]# ceph osd dump --format=json | python -m json.tool | grep snap_seq
"snap_seq": 15,