B-Trees - Balanced Search Trees for Slow Storage

By goodmath on July 6, 2008.

Another cool, but frequently overlooked, data structure in the tree family is called the B-tree. A B-tree is a search tree, very similar to a BST in concept, but optimized differently.

BSTs provide logarithmic-time operations, where the performance is fundamentally bounded by the number of comparisons. B-trees also provide logarithmic performance with a logarithmic number of comparisons - but the number of comparisons is worse by a constant factor. The difference is that B-trees are designed around a different tradeoff. The B-tree is designed to minimize the number of tree nodes that need to be examined, even if that comes at the cost of doing significantly more comparisons.

Why would you design it that way? It's a different performance tradeoff. The B-tree is a data structure designed not for use in memory, but instead for use as a structure in hard storage, like a disk drive. B-trees are the basic structure underlying most filesystems and databases: they provide an efficient way to build rapidly searchable stored structures. But retrieving nodes from disk is very expensive. In comparison to retrieving from disk, doing comparisons is very nearly free. So the design goal for performance in a B-tree is to minimize disk access; and when disk access is necessary, it tries to localize it as much as possible - to minimize the number of retrievals, and even more importantly, to minimize the number of nodes on disk that need to be updated when something is inserted.
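To put rough numbers on that tradeoff, here's a small Python sketch. The figures in it (a million keys, an order of 256) are made up for illustration, and the function names are mine. It compares how many nodes a lookup touches in a perfectly balanced BST against the worst case for a B-tree, assuming a binary search inside each B-tree node.

```python
def bst_levels(num_keys: int) -> int:
    """Levels in a perfectly balanced BST holding num_keys keys."""
    # For positive n, n.bit_length() equals ceil(log2(n + 1)).
    return num_keys.bit_length()


def btree_max_levels(num_keys: int, order: int) -> int:
    """Worst-case number of key-holding levels in a B-tree of the given order."""
    if order < 3 or num_keys < 1:
        raise ValueError("need order >= 3 and at least one key")
    min_children = (order + 1) // 2  # ceil(order / 2)
    levels = 1
    # The sparsest legal tree with L levels has a root with 2 children and
    # min_children everywhere else, so it holds 2 * min_children**(L-1) - 1 keys.
    while 2 * min_children ** levels - 1 <= num_keys:
        levels += 1
    return levels


def max_comparisons_per_node(order: int) -> int:
    """Binary search over at most order - 1 keys in a single node."""
    return (order - 1).bit_length()


n = 1_000_000   # hypothetical number of keys
order = 256     # hypothetical B-tree order

bst_nodes = bst_levels(n)                   # 20 nodes read, one comparison each
bt_nodes = btree_max_levels(n, order)       # 3 nodes read
bt_comparisons = bt_nodes * max_comparisons_per_node(order)  # at most 24

print(f"BST:    {bst_nodes} node reads, {bst_nodes} comparisons")
print(f"B-tree: {bt_nodes} node reads, up to {bt_comparisons} comparisons")
```

Under those assumptions, the B-tree does a few more comparisons, but reads 3 nodes instead of 20. When each node read is a trip to disk, that's the number that matters.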

An order-N B-tree is a search tree with the following properties:

Every node contains no more than N children.
Every node (except the root and the leaves) contains at least N/2 children (rounded up).
All leaves are empty nodes, and occur in the same level of the tree.
Every node with k children contains k-1 keys.
Given a node N₁, with a K'th child node N_K, the keys in N_K are greater than the (K-1)th key in N₁, and less than the Kth key in N₁.
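Those properties can be sketched in Python roughly as follows. This is my own illustration, not a reference implementation: the `Node` class and the `search` and `check` functions are hypothetical names, the empty leaves are left implicit (a bottom-level node just has an empty `children` list), and the values (40, 50) in the sample tree are invented.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)
    children: list = field(default_factory=list)  # empty at the bottom level


def search(node, key):
    """Return (found, nodes_visited) for a lookup starting at node."""
    visited = 0
    while node is not None:
        visited += 1
        i = bisect_left(node.keys, key)  # index of first key >= the target
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        node = node.children[i] if node.children else None
    return False, visited


def check(node, order, lo=None, hi=None, is_root=True):
    """Assert the order-N properties; return the number of key-holding levels."""
    assert node.keys == sorted(node.keys)
    assert all((lo is None or lo < k) and (hi is None or k < hi) for k in node.keys)
    assert 1 <= len(node.keys) <= order - 1      # at most N children
    if not is_root:
        assert len(node.keys) + 1 >= (order + 1) // 2  # at least ceil(N/2) children
    if not node.children:
        return 1
    assert len(node.children) == len(node.keys) + 1
    bounds = [lo, *node.keys, hi]
    depths = {check(c, order, bounds[i], bounds[i + 1], False)
              for i, c in enumerate(node.children)}
    assert len(depths) == 1                      # all leaves at the same level
    return 1 + depths.pop()


# A small order-4 tree; the rightmost child's values are made up.
root = Node([10, 18, 30],
            [Node([3, 8, 9]), Node([15]), Node([20, 21]), Node([40, 50])])

print(check(root, 4))    # 2 levels of key-holding nodes
print(search(root, 20))  # (True, 2)
print(search(root, 11))  # (False, 2)
```

Note that `search` only ever touches one node per level, which is the whole point of the structure.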

The brilliance of the B-tree is that it's a structure that minimizes access to disk while maintaining a nearly perfect balance. It's really an amazingly beautiful and elegant structure.

The key to it is the beautiful insert operation. The tree grows from the leaves - but it only adds levels at the root. That makes the balancing work - and work in a way that guarantees that you'll be able to keep the tree in balance doing nothing but modifying nodes in a single linear chain from the leaf up to the root.

To insert a value, you start by picking the correct place to insert. The set of non-empty nodes at the bottom of the tree is guaranteed to be a sorted list of values, running from the left-most node over to the right, in order.

You find its place in that list - and that gives you the insertion node. If the insertion node already has N-1 values (the maximum), then you split the node. You take the median of the node's values together with the new one, and use it as a pivot. You take the values smaller than the pivot, and turn them into a new node named "L"; and the values larger than the pivot, and make them into a new node named "R". Then you move the pivot into the parent of the insertion node, and make L its left child, and R its right. So the former leaf node is now two nodes, with one of its values promoted up into its parent.

If the parent node already had N-1 values, then you split it too, and so on. You keep bumping up the tree until you get to the root - if you need to split the root, then you take the split pivot value, and turn it into a new root with only one value. (That's why the second invariant makes an exception for the root node.)
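Here's a minimal Python sketch of that insert-and-split procedure, under a few assumptions of my own: keys are distinct, the empty leaves are implicit (bottom-level nodes have an empty `children` list), and when a node overflows with an even number of values, the split uses the upper of the two middle values as the pivot. Either middle value would be a legal choice. The `Node` class and function names are hypothetical, and the values (40, 50) in the sample tree are invented.

```python
from bisect import bisect_left, insort
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)
    children: list = field(default_factory=list)  # empty at the bottom level


def _split(node):
    """Split an overfull node in place; return (pivot, new right sibling)."""
    mid = len(node.keys) // 2            # upper median when the count is even
    pivot = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]          # the old node becomes "L"
    node.children = node.children[:mid + 1]
    return pivot, right


def _insert(node, key, order):
    """Insert below node; return (pivot, right) if node had to split."""
    if not node.children:
        insort(node.keys, key)           # bottom level: add the value here
    else:
        i = bisect_left(node.keys, key)
        split = _insert(node.children[i], key, order)
        if split is not None:            # child split: absorb its pivot
            pivot, right = split
            node.keys.insert(i, pivot)
            node.children.insert(i + 1, right)
    if len(node.keys) > order - 1:       # overflow: bump a pivot upward
        return _split(node)
    return None


def insert(root, key, order):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key, order)
    if split is None:
        return root
    pivot, right = split
    return Node([pivot], [root, right])  # the only place a level is added


# An order-4 tree; the rightmost child's values are made up.
root = Node([10, 18, 30],
            [Node([3, 8, 9]), Node([15]), Node([20, 21]), Node([40, 50])])
new_root = insert(root, 7, 4)
print(new_root.keys)                              # [18]
print([c.keys for c in new_root.children])        # [[8, 10], [30]]
```

Inserting 7 splits the leftmost child, which overflows the root, which splits in turn. The only nodes modified are the ones on the path from the insertion node up to the root.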

As usual, examples help to understand. Look at the order-4 B-tree in the figure below.

In an order-4 B-tree, every node (except the root) must have at least two children, and so at least one value. Suppose we wanted to insert the value "20". That would go in the middle child node, between 18 and 21. But that would result in a node with 4 values, and the maximum number of values per node is three. So we need to split it. We'll pick 18 as the pivot. That will give us two new nodes - one with only one value (15), and one with two (20, 21). The pivot value, 18, will be inserted into the parent, giving us a parent with the three values 10, 18, 30, and four children. The resulting B-tree is shown below.

[Figure: the B-tree after inserting 20 (btree-one-insert.png)]

Now, suppose we wanted to insert the value 7. That would split the leftmost child. Using 8 as a pivot, we'd get two new nodes: (3, 7) and (9), and we'd insert 8 into the root node. But that would result in the root node having four values. So we split the root node. The root pivot is 18, so we get two new nodes, (8, 10) and (30), and a new root, containing only (18), as shown below.

[Figure: the B-tree after inserting 7 (btree-second-insert.png)]

Deletes are very similar, except that they involve contractions. When a node shrinks to be too small because of deletions, it gets merged with a neighbor; after the contraction, if the resulting merged node has too many values (for example, if the neighboring node was full before the merge), then you split the newly merged node. So either you end up removing a node from the tree, or you redistribute the children between the two nodes in a balanced manner.

As usual for my data-structure posts, there'll be an example implementation in another post, to follow as soon as I have time to write it.


Anyone who enjoys a bit of data structure coding fun should implement a B-tree, and possibly one of its kin, such as a B+-tree or a B*-tree.

"Deletes are very similar, except that they involve contractions."

This is a piece of fiction that we tell to undergraduates.
In fact, we never tell people the full B-tree deletion algorithm. The reason is that it's nowhere near similar to the insertion algorithm, and full of special cases, and generally horrid. Try implementing it and you'll see what I mean.
Remember, the goal is to minimise disk operations and, if possible, to maintain some degree of concurrency. It's therefore important to avoid operations that change the structure of the tree (splitting or merging nodes) where possible. So, for example, on delete, it's almost always best to shuffle keys between leaves if you can get away with it.
One more thing. In a true concurrency situation, it's almost always better to occasionally violate the structure restrictions on any kind of tree, B-trees included, in the interests of each operation completing in a reasonable amount of time. It turns out that this has additional benefits, in that both insertion and deletion algorithms become much simpler to implement.

This article might be easier to understand if you started with the same example you finished with. :-)

Indeed, I agree with #3. I think something went a little wrong when selecting the initial example image...

In your first diagram the middle child node has the value sequence 11,18,21,27, but in your second you suddenly introduce "15", having inserted 20. Am I missing something?

He switched 11 to 15. Understandable, I suppose, as they both begin with 1...

In any case, pick one or the other in your mind (until he fixes it) and the example makes sense.

In case any of you noticed it, the "web design" comment posted at 3:29 is actually comment spam. It's a reproduction of a comment on Reddit, from a spammer who's hit this site before.

Mark,

Are these the same B-trees I used to use several decades ago in lieu of a "database"?

There's something about m-ary...

If I may cut and paste from a Weisstein-Wolfram page:

B-trees were introduced by Bayer (1972) and McCreight. They are a special m-ary balanced tree used in databases because their structure allows records to be inserted, deleted, and retrieved with guaranteed worst-case performance. An n-node B-tree has height O(lg n), where lg is the logarithm to base 2. The Apple® Macintosh® (Apple, Inc., Cupertino, CA) HFS filing system uses B-trees to store disk directories (Benedict 1995). A B-tree satisfies the following properties:

1. The root is either a tree leaf or has at least two children.

2. Each node (except the root and tree leaves) has between [m/2] and m children, where [x] is the ceiling function.

3. Each path from the root to a tree leaf has the same length.

Every 2-3 tree is a B-tree of order 3. The number of B-trees of order 3 with n=1, 2, ... leaves are 1, 1, 1, 1, 2, 2, 3, 4, 5, 8, 14, 23, 32, 43, 63, ... (Ruskey, Sloane's A014535). The number of order-4 B-trees with n=1, 2, ... leaves are 1, 1, 1, 2, 2, 4, 5, 9, 15, 28, 45, ... (Sloane's A037026).
