B-tree, AVL tree, RB tree

最新推荐文章于 2020-08-15 17:00:50 发布

lihenair

最新推荐文章于 2020-08-15 17:00:50 发布

阅读量1.7k

点赞数

分类专栏： Algorithm

Algorithm 专栏收录该内容

37 篇文章 0 订阅

订阅专栏

参考：

http://stackoverflow.com/questions/1589556/when-to-choose-rb-tree-b-tree-or-avl-tree/1589587#1589587

http://stackoverflow.com/questions/647537/b-tree-faster-than-avl-or-redblack-tree/1195288#1195288

CRLS 3rd

When to choose RB tree, B-Tree or AVL tree?

Take this with a pinch of salt:

B-tree when you're managing more than thousands of items and you're paging them from a disk or some slow storage medium.

RB tree when you're doing fairly frequent inserts, deletes and retrievals on the tree.

AVL tree when your inserts and deletes are infrequent relative to your retrievals.

Take this with a pinch of salt:

B-tree when you're managing more than thousands of items and you're paging them from a disk or some slow storage medium.

RB tree when you're doing fairly frequent inserts, deletes and retrievals on the tree.

AVL tree when your inserts and deletes are infrequent relative to your retrievals.

B-tree faster than AVL or RedBlack-Tree?

Sean's post (the currently accepted one) is full of nonsense. Sorry Sean, I don't mean to be rude; I hope I can convince you that my statement is based in fact.

They're totally different in their use cases, so it's not possible to make a comparison.

They're both used for maintaining a set of totally ordered items with fast lookup, insertion and deletion. They have the same interface and the same intention.

RB trees are typically in-memory structures used to provide fast access (ideally O(logN)) to data. [...]

always O(log n)

B-trees are typically disk-based structures, and so are inherently slower than in-memory data.

Nonsense. When you store search trees on disk, you typically use B-trees. That much is true. When you store data on disk, it's slower to access than data in memory. But a red-black tree stored on disk is also slower than a red-black tree stored in memory.

You're comparing apples and oranges here. What is really interesting is a comparison of in-memory B-trees and in-memory red-black trees.

[As an aside: B-trees, as opposed to red-black trees, are theoretically efficient in the I/O-model. I have experimentally tested (and validated) the I/O-model for sorting; I'd expect it to work for B-trees as well.]

B-trees are rarely binary trees, the number of children a node can have is typically a large number.

To be clear, the size range of B-tree nodes is a parameter of the tree (in C++, you may want to use an integer value as a template parameter).

The management of the B-tree structure can be quite complicated when the data changes.

I remember them to be much simpler to understand (and implement) than red-black trees.

B-tree try to minimize the number of disk accesses so that data retrieval is reasonably deterministic.

That much is true.

It's not uncommon to see something like 4 B-tree access necessary to lookup a bit of data in a very database.

Got data?

In most cases I'd say that in-memory RB trees are faster.

Got data?

Because the lookup is binary it's very easy to find something. B-tree can have multiple children per node, so on each node you have to scan the node to look for the appropriate child. This is an O(N) operation.

The size of each node is a fixed parameter, so even if you do a linear scan, it's O(1). If we big-oh over the size of each node, note that you typically keep the array sorted so it's O(log n).

On a RB-tree it'd be O(logN) since you're doing one comparison and then branching.

You're comparing apples and oranges. The O(log n) is because the height of the tree is at most O(log n), just as it is for a B-tree.

Also, unless you play nasty allocation tricks with the red-black trees, it seems reasonable to conjecture that B-trees have better caching behavior (it accesses an array, not pointers strewn about all over the place, and has less allocation overhead increasing memory locality even more), which might help it in the speed race.

I can point to experimental evidence that B-trees (with size parameters 32 and 64, specifically) are very competitive with red-black trees for small sizes, and outperforms it hands down for even moderately large values of n. See http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.html

B-trees are faster. Why? I conjecture that it's due to memory locality, better caching behavior and less pointer chasing (which are, if not the same things, overlapping to some degree).