B-tree, AVL tree, RB tree

参考:

http://stackoverflow.com/questions/1589556/when-to-choose-rb-tree-b-tree-or-avl-tree/1589587#1589587

http://stackoverflow.com/questions/647537/b-tree-faster-than-avl-or-redblack-tree/1195288#1195288

CRLS 3rd


When to choose RB tree, B-Tree or AVL tree?

Take this with a pinch of salt:

B-tree when you're managing more than thousands of items and you're paging them from a disk or some slow storage medium.

RB tree when you're doing fairly frequent inserts, deletes and retrievals on the tree.

AVL tree when your inserts and deletes are infrequent relative to your retrievals.

Take this with a pinch of salt:

B-tree when you're managing more than thousands of items and you're paging them from a disk or some slow storage medium.

RB tree when you're doing fairly frequent inserts, deletes and retrievals on the tree.

AVL tree when your inserts and deletes are infrequent relative to your retrievals.


B-tree faster than AVL or RedBlack-Tree? 

Sean's post (the currently accepted one) is full of nonsense. Sorry Sean, I don't mean to be rude; I hope I can convince you that my statement is based in fact.

They're totally different in their use cases, so it's not possible to make a comparison.

They're both used for maintaining a set of totally ordered items with fast lookup, insertion and deletion. They have the same interface and the same intention.

RB trees are typically in-memory structures used to provide fast access (ideally O(logN)) to data. [...]

always O(log n)

B-trees are typically disk-based structures, and so are inherently slower than in-memory data.

Nonsense. When you store search trees on disk, you typically use B-trees. That much is true. When you store data on disk, it's slower to access than data in memory. But a red-black tree stored on disk is also slower than a red-black tree stored in memory.

You're comparing apples and oranges here. What is really interesting is a comparison of in-memory B-trees and in-memory red-black trees.

[As an aside: B-trees, as opposed to red-black trees, are theoretically efficient in the I/O-model. I have experimentally tested (and validated) the I/O-model for sorting; I'd expect it to work for B-trees as well.]

B-trees are rarely binary trees, the number of children a node can have is typically a large number.

To be clear, the size range of B-tree nodes is a parameter of the tree (in C++, you may want to use an integer value as a template parameter).

The management of the B-tree structure can be quite complicated when the data changes.

I remember them to be much simpler to understand (and implement) than red-black trees.

B-tree try to minimize the number of disk accesses so that data retrieval is reasonably deterministic.

That much is true.

It's not uncommon to see something like 4 B-tree access necessary to lookup a bit of data in a very database.

Got data?

In most cases I'd say that in-memory RB trees are faster.

Got data?

Because the lookup is binary it's very easy to find something. B-tree can have multiple children per node, so on each node you have to scan the node to look for the appropriate child. This is an O(N) operation.

The size of each node is a fixed parameter, so even if you do a linear scan, it's O(1). If we big-oh over the size of each node, note that you typically keep the array sorted so it's O(log n).

On a RB-tree it'd be O(logN) since you're doing one comparison and then branching.

You're comparing apples and oranges. The O(log n) is because the height of the tree is at most O(log n), just as it is for a B-tree.

Also, unless you play nasty allocation tricks with the red-black trees, it seems reasonable to conjecture that B-trees have better caching behavior (it accesses an array, not pointers strewn about all over the place, and has less allocation overhead increasing memory locality even more), which might help it in the speed race.

I can point to experimental evidence that B-trees (with size parameters 32 and 64, specifically) are very competitive with red-black trees for small sizes, and outperforms it hands down for even moderately large values of n. See http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.html

B-trees are faster. Why? I conjecture that it's due to memory locality, better caching behavior and less pointer chasing (which are, if not the same things, overlapping to some degree).


评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值