B+ Tree

Based on “Data Structures and Algorithm Analysis, Edition 3.2 (C++ Version)” by C. A. Shaffer

Properties

A B+ tree can be viewed as a B-tree in which each node contains only keys (not key-value pairs), and to which an additional level is added at the bottom with linked leaves.
The primary value of a B+ tree is in storing data for efficient retrieval in a block-oriented storage context, in particular file systems. Unlike binary search trees, B+ trees have very high fanout (the number of pointers to child nodes in a node, typically on the order of 100 or more). High fanout keeps the tree shallow, which reduces the number of I/O operations required to find an element in the tree.
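To make the effect of fanout concrete, here is a small back-of-the-envelope sketch. It assumes an idealized tree in which every node has exactly the given number of children; real trees are only partly full, so treat the numbers as a best case. The function name min_height is made up for this illustration.

```python
def min_height(num_records: int, fanout: int) -> int:
    """Fewest levels needed so that fanout**levels >= num_records.

    Assumption: every node is completely full, which is a best case.
    """
    height = 1
    capacity = fanout
    while capacity < num_records:
        capacity *= fanout
        height += 1
    return height


# Roughly one node (block) is read per level during a lookup.
binary_levels = min_height(1_000_000, 2)     # binary branching
bplus_levels = min_height(1_000_000, 100)    # B+ tree-like fanout
print(binary_levels, bplus_levels)
```

Under these assumptions, a million entries need about 20 levels with binary branching but only 3 levels with a fanout of 100.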

Operations

Search

To look up a value k in a B+ tree, start at the root and descend to the leaf that may contain k. At each internal node, decide which child pointer to follow. An internal node has d ≤ b children, where b is the maximum number of children per node, and each child covers a different sub-interval of the key space. The keys stored in the node mark the boundaries between these sub-intervals, so searching the node's keys identifies the child to follow.
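The descent can be sketched as follows. This is a minimal illustration, not the book's implementation: the Node class, its field names, and the convention that a separator key equals the smallest key in the subtree to its right are all assumptions made for the example.

```python
from bisect import bisect_right


class Node:
    """Hypothetical node layout: leaves have children=None and carry values."""

    def __init__(self, keys, children=None, values=None):
        self.keys = keys
        self.children = children
        self.values = values


def find_leaf(node, k):
    """Follow one child pointer per level until a leaf is reached."""
    while node.children is not None:
        # Assumed convention: keys equal to a separator live in the right child.
        node = node.children[bisect_right(node.keys, k)]
    return node


def search(root, k):
    """Return the value stored under k, or None if k is absent."""
    leaf = find_leaf(root, k)
    for key, value in zip(leaf.keys, leaf.values):
        if key == k:
            return value
    return None


leaves = [
    Node([1, 3], values=["v1", "v3"]),
    Node([5, 7], values=["v5", "v7"]),
    Node([9, 11], values=["v9", "v11"]),
]
root = Node([5, 9], children=leaves)
print(search(root, 7), search(root, 4))
```

Using bisect_right means a key equal to a separator is routed to the right child, which matches the assumed convention that the separator is the smallest key of that child.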

Insert

  • Perform a search to determine what bucket the new record should go into.
  • If the bucket is not full (at most b - 1 entries after the insertion), add the record.
  • Otherwise, split the bucket.
    • Allocate a new leaf and move half of the bucket’s elements to it.
    • Insert the new leaf’s smallest key and address into the parent.
    • If the parent is full, split it too.
    • Repeat until a parent is found that need not split.
  • If the root splits, create a new root which has one key and two pointers. (When an internal node splits, the key pushed up to the parent is removed from the node that split. When a leaf splits, the key is copied up and also stays in the leaf.)
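The insertion steps above can be sketched as a recursive routine. This is an illustrative sketch under several assumptions: records are reduced to their keys, keys are distinct, the branching factor B is fixed at 4, and all names (Node, insert, _insert, leaf_keys) are invented for the example.

```python
from bisect import bisect_right, insort

B = 4  # assumed branching factor: at most B children, at most B - 1 keys


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []  # used by internal nodes only
        self.next = None    # used by leaves only: link to the right sibling


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    sep, right = split
    new_root = Node(leaf=False)  # root split: one key, two pointers
    new_root.keys = [sep]
    new_root.children = [root, right]
    return new_root


def _insert(node, key):
    """Return None, or (separator, new_right_node) if node had to split."""
    if node.leaf:
        insort(node.keys, key)
        if len(node.keys) <= B - 1:
            return None
        mid = len(node.keys) // 2
        right = Node(leaf=True)
        right.keys = node.keys[mid:]
        node.keys = node.keys[:mid]
        right.next, node.next = node.next, right
        return right.keys[0], right  # leaf split: separator is copied up
    i = bisect_right(node.keys, key)
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.keys) <= B - 1:
        return None
    mid = len(node.keys) // 2
    up = node.keys[mid]
    new_right = Node(leaf=False)
    new_right.keys = node.keys[mid + 1:]
    new_right.children = node.children[mid + 1:]
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return up, new_right  # internal split: separator is moved up


def leaf_keys(root):
    """Walk the linked leaves from left to right and collect their keys."""
    node = root
    while not node.leaf:
        node = node.children[0]
    out = []
    while node is not None:
        out.extend(node.keys)
        node = node.next
    return out


root = Node(leaf=True)
for k in range(1, 11):
    root = insert(root, k)
print(root.keys, leaf_keys(root))
```

Returning the separator and the new right node from the recursive call is one way to express "insert into the parent, and split it too if it is full" without storing parent pointers.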

Delete

  • Start at the root and find the leaf L where the entry belongs.
  • Remove the entry.
    • If L is at least half-full, done!
    • If L has fewer entries than it should,
      1. Try to re-distribute, borrowing from sibling (adjacent node with same parent as L).
      2. If re-distribution fails, merge L and sibling.
  • If a merge occurred, delete the entry (pointing to L or its sibling) from the parent of L.
  • Merges can propagate up to the root, decreasing the height of the tree.
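The leaf-level part of deletion can be sketched as below. The sketch is deliberately limited to one parent and its leaves; propagating a merge further up would repeat the same borrow-or-merge logic on internal nodes and is omitted. The class names, the branching factor B = 4, and the underflow threshold MIN_KEYS = B // 2 are assumptions for this illustration.

```python
from bisect import bisect_right

B = 4              # assumed: a leaf holds at most B - 1 keys
MIN_KEYS = B // 2  # assumed "half-full" threshold for this sketch


class Leaf:
    def __init__(self, keys):
        self.keys = keys


class Parent:
    def __init__(self, keys, children):
        self.keys = keys          # keys[i] separates children[i] and children[i + 1]
        self.children = children


def delete(parent, key):
    """Delete key from a leaf below parent; return True if a merge occurred."""
    i = bisect_right(parent.keys, key)
    leaf = parent.children[i]
    if key not in leaf.keys:
        return False
    leaf.keys.remove(key)
    if len(leaf.keys) >= MIN_KEYS:
        return False  # still at least half-full: done
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None
    # 1. Try to redistribute by borrowing one key from a sibling.
    if left is not None and len(left.keys) > MIN_KEYS:
        leaf.keys.insert(0, left.keys.pop())
        parent.keys[i - 1] = leaf.keys[0]
        return False
    if right is not None and len(right.keys) > MIN_KEYS:
        leaf.keys.append(right.keys.pop(0))
        parent.keys[i] = right.keys[0]
        return False
    # 2. Redistribution failed: merge with a sibling and drop the parent entry.
    if left is not None:
        left.keys.extend(leaf.keys)
        del parent.children[i]
        del parent.keys[i - 1]
        return True
    if right is not None:
        leaf.keys.extend(right.keys)
        del parent.children[i + 1]
        del parent.keys[i]
        return True
    return False  # no sibling at all (leaf is the only child): nothing to do


parent = Parent([3, 5], [Leaf([1, 2]), Leaf([3, 4]), Leaf([5, 6, 7])])
delete(parent, 4)           # borrows 5 from the right sibling
merged = delete(parent, 3)  # no sibling can lend: merge with the left sibling
print(parent.keys, [leaf.keys for leaf in parent.children], merged)
```

With this threshold, an underflowing leaf has MIN_KEYS - 1 keys and a sibling that cannot lend has exactly MIN_KEYS, so a merged leaf never exceeds B - 1 keys.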

Bulk-loading

  • Given a collection of data records, we want to create a B+ tree index on some key field. One approach is to insert each record into an empty tree, one at a time. However, this is quite expensive, because each entry requires us to start from the root and go down to the appropriate leaf page. An efficient alternative is to use bulk-loading.
  • The first step is to sort the data entries according to a search key.
  • We allocate an empty page to serve as the root, and insert a pointer to the first page of entries into it.
  • When the root is full, we split the root, and create a new root page.
  • Keep inserting entries into the right-most index page just above the leaf level, until all entries are indexed.
  • Note:
    1. when the right-most index page above the leaf level fills up, it is split;
    2. this action may, in turn, cause a split of the right-most index page one step closer to the root;
    3. splits only occur on the right-most path from the root to the leaf level.
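The bulk-loading steps above can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: entries are plain keys, leaf pages hold leaf_size keys, index pages hold at most fanout children, and the names Node, bulk_load and leaf_keys are invented here. The list called path tracks the right-most index page on each level, since those are the only pages that ever split.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children  # None marks a leaf page


def bulk_load(entries, leaf_size=2, fanout=3):
    """Build an index over entries by appending along the right-most path."""
    entries = sorted(entries)  # step 1: sort by search key
    if not entries:
        return Node()
    leaves = [Node(keys=entries[i:i + leaf_size])
              for i in range(0, len(entries), leaf_size)]
    root = Node(children=[leaves[0]])  # root points at the first leaf page
    path = [root]                      # right-most index pages, root first
    for leaf in leaves[1:]:
        key, child = leaf.keys[0], leaf
        level = len(path) - 1          # index page just above the leaves
        while True:
            node = path[level]
            node.keys.append(key)
            node.children.append(child)
            if len(node.children) <= fanout:
                break
            # Page is over-full: split it and push the middle key upward.
            mid = len(node.keys) // 2
            key = node.keys[mid]
            right = Node(keys=node.keys[mid + 1:],
                         children=node.children[mid + 1:])
            node.keys = node.keys[:mid]
            node.children = node.children[:mid + 1]
            path[level] = right
            child = right
            if level == 0:             # the root split: create a new root
                root = Node(keys=[key], children=[node, right])
                path.insert(0, root)
                break
            level -= 1
    return root


def leaf_keys(node):
    """Collect all keys stored in leaf pages, left to right."""
    if node.children is None:
        return list(node.keys)
    out = []
    for child in node.children:
        out.extend(leaf_keys(child))
    return out


tree = bulk_load(range(1, 11))
print(tree.keys, leaf_keys(tree))
```

Because the input is sorted, every new leaf belongs at the far right, so no search from the root is needed and only pages on the right-most path are ever modified.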
