Berkeley DB Source Code Analysis (3) --- Btree Implementation (2)

 


__bam_ditem

In a btree, on-page duplicate key/data pairs are stored this way:
1. The key is put onto the page only once, since there is no point in storing
identical keys multiple times. Each of the dup key's data items is put
onto the page.

2. In the index array there are multiple index elements for this
dup key, all pointing at the same key's offset. Since the index array is sorted
by the keys its elements point at, the index elements for the same dup key are
contiguous. For example, indx[i], indx[i+1] and indx[i+2] all point at the same
key on the page when that key has 3 dup key/data pairs.

So when deleting the key at indx[i+1], we don't remove the key from the page,
since indx[i] and indx[i+2] still point at it. We simply move the elements
after indx[i+1] one slot forward; then indx[i] and indx[i+1] point at that
key, and two dup key/data pairs remain. Deleting a key/data pair from a btree
leaf page takes two steps: first delete the key, then delete the data item.
The order can't be reversed.
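
The index shifting described above can be sketched with a toy model. This is
only an illustration, not the real Berkeley DB page layout or the actual
__bam_ditem code: toy_page, toy_dup_count and toy_del_index are hypothetical
names, and the index array here holds one key offset per entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a leaf page's index array (hypothetical, not the real
 * PAGE struct): each entry holds the page offset of the key it refers to. */
typedef struct {
    uint16_t inp[16];   /* key offsets, ordered by key value */
    size_t   entries;   /* number of index entries in use */
} toy_page;

/* Count how many index entries share the key offset stored in inp[i]. */
static size_t toy_dup_count(const toy_page *p, size_t i)
{
    size_t n = 0;
    for (size_t k = 0; k < p->entries; k++)
        if (p->inp[k] == p->inp[i])
            n++;
    return n;
}

/* Drop index entry i by moving the later entries one slot forward.
 * Returns 1 if no other entry referenced the key (its bytes could be
 * reclaimed), 0 if other duplicates still point at it. */
static int toy_del_index(toy_page *p, size_t i)
{
    assert(i < p->entries);
    int last_ref = (toy_dup_count(p, i) == 1);
    memmove(&p->inp[i], &p->inp[i + 1],
            (p->entries - i - 1) * sizeof(p->inp[0]));
    p->entries--;
    return last_ref;
}
```

With entries {100, 80, 80, 80, 60}, deleting index 2 leaves two entries for
offset 80 and the key bytes stay on the page; only deleting the last
reference to an offset reports that the key itself can go.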


Deleting key/data pairs

1. In DBC->del, we only mark the key/data pair as deleted (B_DELETE) and mark
the cursor as pointing to a deleted k/d pair (C_DELETED). We don't physically
remove the k/d from the page unless the cursor is being closed while it points
to a deleted k/d; in that special case we remove the single k/d pair it points
to. After a data item is marked deleted, the search functions can still locate
it internally, but it is never returned to the user. The space it takes can be
overwritten when a k/d is inserted that belongs at exactly the same page and
location.

Thus, if we use DB->del to delete a k/d, it's immediately deleted from the db;
if we use DBC->del while iterating over the db to delete each k/d, none except
the last one is removed from the db. This avoids frequent tree structure
changes (split/rsplit), which are expensive operations, but it can also waste
a lot of space.
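
The mark-then-remove behaviour described in this section can be sketched as
follows. It is a simplified model under stated assumptions, not Berkeley DB
code: the toy_* types and functions are hypothetical, TOY_DELETED stands in
for B_DELETE, and on_deleted stands in for C_DELETED.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DELETED 0x1u   /* stands in for the B_DELETE mark */

typedef struct { int key; int data; unsigned flags; } toy_item;
typedef struct { toy_item items[8]; size_t n; } toy_leaf;
typedef struct { toy_leaf *leaf; size_t indx; int on_deleted; } toy_cursor;

/* Cursor delete: only mark the item; it stays on the page. */
static void toy_c_del(toy_cursor *c)
{
    assert(c->indx < c->leaf->n);
    c->leaf->items[c->indx].flags |= TOY_DELETED;
    c->on_deleted = 1;
}

/* Cursor movement: the cursor no longer sits on a deleted item. */
static void toy_c_next(toy_cursor *c)
{
    c->indx++;
    c->on_deleted = 0;
}

/* User-visible lookup: marked items are found internally but skipped. */
static const toy_item *toy_get(const toy_leaf *l, int key)
{
    for (size_t i = 0; i < l->n; i++)
        if (l->items[i].key == key && !(l->items[i].flags & TOY_DELETED))
            return &l->items[i];
    return NULL;
}

/* Cursor close: physically remove only the item the cursor sits on,
 * and only if that item was marked deleted through this cursor. */
static void toy_c_close(toy_cursor *c)
{
    toy_leaf *l = c->leaf;
    if (!c->on_deleted)
        return;
    for (size_t i = c->indx; i + 1 < l->n; i++)
        l->items[i] = l->items[i + 1];
    l->n--;
    c->on_deleted = 0;
}
```

Deleting two items in a row through one cursor and then closing it leaves the
first item on the page, still marked, while the second is physically removed;
that is the space cost the paragraph above refers to.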

I think a DB_FORCE flag should be added to DBC->close. When it's specified we
know no other cursor is pointing at the k/d, so when our cursor is about to
move from the current page to another page, we can delete all k/d pairs marked
B_DELETE. We don't remove on each DBC->del call because that would make the
cursor movement operations harder to implement.


__db_ret
Returns a specified key/data pair, given the page pointer (the page has
already been locked and fetched from mpool), the pgno and the index.


__bam_getboth_finddatum

Handles the DB_GET_BOTH and DB_GET_BOTH_RANGE flags of DB/DBC->get.

If DB_DUP is set but DB_DUPSORT is not, in which case dbp->dup_compare is
NULL, we do a linear search and look only for an exact match even if RANGE is
specified. That is, without DUPSORT, DB_GET_BOTH_RANGE behaves identically to
DB_GET_BOTH, which is undocumented.

Otherwise both DB_DUP and DB_DUPSORT are specified, and we do a binary search
on the leaf page. The btree search in the opd (off-page duplicate) tree is
done by __bamc_search, not by this function.
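
The two search paths can be sketched like this. It is a minimal model over
plain ints, not the real __bam_getboth_finddatum: toy_find_datum, toy_cmp and
toy_int_cmp are hypothetical names, a NULL comparator models the unsorted
(no DB_DUPSORT) case, and the range flag models DB_GET_BOTH_RANGE.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_cmp)(int a, int b);

/* Three-way integer comparison, playing the role of dup_compare. */
static int toy_int_cmp(int a, int b) { return (a > b) - (a < b); }

/* Find `want` among the n duplicate data items of one key.
 * cmp == NULL: unsorted dups, linear scan, exact match only; `range`
 *              is ignored, mirroring the undocumented behaviour.
 * cmp != NULL: sorted dups, binary search; with `range` set, the
 *              smallest item >= want is accepted.
 * Returns the index of the item found, or -1. */
static int toy_find_datum(const int *dups, size_t n, int want,
                          toy_cmp cmp, int range)
{
    assert(dups != NULL || n == 0);
    if (cmp == NULL) {
        for (size_t i = 0; i < n; i++)
            if (dups[i] == want)
                return (int)i;
        return -1;
    }
    size_t lo = 0, hi = n;          /* lower-bound binary search */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cmp(dups[mid], want) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == n)
        return -1;                  /* every item is smaller than want */
    if (cmp(dups[lo], want) == 0 || range)
        return (int)lo;
    return -1;
}
```

For sorted dups {10, 20, 30, 40}, looking up 25 fails as an exact match but
lands on 30 when range is set; for unsorted dups the range flag makes no
difference.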

__bamc_put

In a btree we can't specify where to store a k/d, because it's stored
according to k's value and d's value. The only exception is when the btree
allows dups (DB_DUP set) but not sorted dups (DB_DUPSORT not set); in this
case we can insert a data item before or after (DB_BEFORE/DB_AFTER) the
key/data pair the cursor currently points at, as a dup data item for the same
key.
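
Positional insertion into an unsorted dup set can be sketched as below. This
is a toy model only, not the __bamc_put code path: toy_dupset, toy_put_at and
the TOY_BEFORE/TOY_AFTER values are hypothetical stand-ins for the dup data
items of one key and for DB_BEFORE/DB_AFTER.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_where { TOY_BEFORE, TOY_AFTER };   /* model DB_BEFORE / DB_AFTER */

/* The unsorted duplicate data items of a single key. */
typedef struct { int data[8]; size_t n; } toy_dupset;

/* Insert `value` just before or just after the item at cursor index
 * `cur`. Returns the index of the new item, or -1 if the set is full. */
static int toy_put_at(toy_dupset *s, size_t cur, int value,
                      enum toy_where w)
{
    assert(cur < s->n);
    if (s->n == sizeof(s->data) / sizeof(s->data[0]))
        return -1;
    size_t pos = (w == TOY_BEFORE) ? cur : cur + 1;
    /* Open a slot at pos by moving the tail one element back. */
    memmove(&s->data[pos + 1], &s->data[pos],
            (s->n - pos) * sizeof(s->data[0]));
    s->data[pos] = value;
    s->n++;
    return (int)pos;
}
```

Because the set is unsorted, the caller's chosen position is the only thing
that determines order; with sorted dups the comparison function would decide
the slot and a positional insert would make no sense.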

Other flags like DB_OVERWRITE_DUP, DB_NODUPDATA and DB_NOOVERWRITE all
control how to deal with dup data items rather than control movement or pos
o
