Finger Tree Update: I forgot something

By goodmath on May 28, 2009.

As an alert commenter pointed out, I left out one really important thing in
my earlier post about finger trees. That's what I get for trying to write when
I'm sick :-). Seriously, this is a point that's implied by the post as it stands, but never explicitly stated - and since it's really important, it
should be stated explicitly.

The monoid-annotated tree structure doesn't replace the original
data structure: it's superimposed on it.

So, as I said: cons-cell style lists are ubiquitous and beloved by
functional and non-functional programmers. In finger trees, you're
not getting rid of them. The point of finger trees is to let
you keep the convenient data structure, with its basic operations
and properties intact, and to augment it with a tree that lets you search it efficiently.

To illustrate: here's the number list example from yesterdays post. The
original list is at the bottom, with green pointers representing the
original cons-list next-pointers. The monoid-annotated tree is on top,
with red pointers. The combination of the original list with the
monoid annotated tree is a finger tree.

i-0bdf4c8399a6b15e13cbbc9c34bdee8e-counted-finger-tree-list.png

The point of this is that you've still got your cons list. All of the
beautiful recursive/iterative algorithms that walk the list continue to
work exactly the way that they would in the traditional cons-list: for code that walks the list, the fact that there are finger-tree pointers
sitting on top of the list is irrelevant - and, in fact, completely invisible. For algorithms that want to search the list, the tree structure is there,
and allows the searches to be performed much more quickly than they could be on
the traditional list. The superposition of those two structures is the
genius of the finger tree.

More like this

Comparing with skip lists?

But this breaks the "quick updating" property in functional languages -- you have to rebuild the entire left side of whatever update you do.

Re #2:

As in pretty much everything in data structures, there's a tradeoff. In this case, you need to add the cost of maintaining the tree
to the inserts - making inserts O(lg n) in the worst case (although there are workarounds to improve head and tail inserts in the average case). But in exchange for the increase in cost of inserts, you get a vastly decreased search cost. As always when you're programming, you need to understand your application, and figure out what set of tradeoffs will work best. There are definitely applications where using a finger-tree is absolutely the wrong thing. But for other applications, it's amazing. (For a real tour-de-force of beautiful data structure stuff, try looking at the Yi editor written in Haskell. They implement their buffers using finger-tree-ropes.

Er, this still isn't a finger tree.

The 'fingers' in a finger tree are the left and right most nodes in your 2-3 tree. You grab the left most and right most node of the tree and hold onto them, while you let the center of the tree go, so it 'dangles' down yielding a bat-wing like structure, where you've inverted the pointers that previously had pointed down to your leaves to instead point 'down' towards the center of your tree!

By holding onto the two fingers you can quickly access the leftmost and rightmost edge of your fingertree, but you pay more to access the 'body' of your 'bat', the former root node which you now have to find by descending down the 'v' shaped spine of the tree. Note that you can get there by climbing down from either the left or right finger. This is important because it enables fingertree concatenation to be done by having your two finger trees 'hold hands' and then just 'pushing down' on the hand in the middle until you get that nice v shape again.

Your previous monoidally annotated 2-3 tree that you are using lacks that property completely. It is a useful structure, and its related to a finger tree, but it has no fingers! This structure is also a different beast. You do not maintain both the tree and some list structure, a finger tree is symmetric.

The tradeoff for insertions is rather low, especially when you consider that insertion is done once, but access is normally done _atleast_ once.

In my previous description I said the shape of a finger-tree is like a bat-wing structure. The common implementation in Haskell used by the Hinze/Patterson paper stylizes this by using non-uniform recursion.

If we take a Haskell style finger-trees with monoidal result v and entries of type a

I'll fix the monoid to be the 'size' monoid which just adds up the element count, mapping each input to 1, as is used inside Data.Sequence.

The empty fingertree is just

Empty

As we add elements a and b we quickly get to where we have two 'fingers' that we can access quickly.

Single a

Deep 2 (One a) Empty (One b)

As we add more elements (c and d) they can be added to the digits on the left or the right

Deep 3 (Two a b) Empty (One c) -- or -- Deep 3 (One a) Empty (Two b c)
Deep 4 (Two a b) Empty (Two c d) -- or -- Deep 4 (One a) Empty (Three b c d) -- or -- Deep 4 (Three a b c) Empty (One d)

but eventually they fill up. Once there are enough of them you can open another node through the link in the Deep node, and the non-uniform recursion adds a 'Node' level to everything hanging off a node this far 'up' towards the original root, so once we add 'e' it could go in as:

Deep 5 (Four a b c d) Empty (One e) -- or -- Deep 5 (One a) (Single (Node3 3 (One b) (One c) (One d)) (One e) -- etc...

But once we exceed 8 elements, we HAVE to start using nested nodes, because can only have 8 'fingers' we can represent with the two 'Digit' slots available to us.

Deep 12 (Three a b c) (Single (Node3 6 (Two d e) (Two f g) (Two h i))) (Three j k l)

the monoid is cached in the Deep entries in our FingerTree, and in the Nodes we use for 2-3 trees of a known depth, which hang off of the sides of our trees, but we don't bother to do so for the digits.

The tricky thing that you get out of this structure is that the non-uniform recursion enforces that the height of both of the trees that dangle off a deep node is the same. Digits allow you the flexibility needed to efficiently push or pop entries on either end like a deque, uniformly from either side.

The top level 'deep' node here represents the two fingers, where we've 'grabbed our binary tree and let go of the root. The nested deep node is the body of our 'bat' hanging from those fingers, since the two nodes closer to the original root of our tree from the fingers were 'higher' up a balanced 2-3 tree they should have bigger trees hanging off of them, containing their other children.

I recently gave a class presentation on 2-3 finger trees, and Edward
Kmett is right.

If you want an explanation of finger tree structure that is not based on
"grabbing the ends" or algebraic data types, I still have my slides:

http://www2.hawaii.edu/~jaress/621/ass03/ft.slides.pdf

Skip to slide 13 for 2-3 finger trees; there's a bunch of background
about persistent data structures in the beginning.

I forgot to mention that I was drawing things on the chalkboard next to the slides, so if you want an easy way to draw the tree:

The top level is a horizontal line with a dot in the middle and one to four of the items (letters, we'll say) sitting on each side of the dot. So, up to eight letters total sitting on the line.

If you want a second level, bring a vertical line down from the dot to another dot in the middle of another horizontal line. Sitting on that line are not letters, but circles with two or three lines angling down from them with a letter at the end of each of these angled lines.

If you want a third level, the circles angle down to circles that angle down to letters. In other words, the third horizontal line has roots of 2-3 trees of height two sitting on it.

And so forth, though I'm sure you'll want to stop there. (Sorry, I'm bad at drawing on a computer.)

If you want to annotate the tree, the annotations can go in the circles.

Just to echo Rui Ferrara. Skip Lists also address this area in a simple and if implemented appropriately, very cache friendly manner. People I think get put off by their probabilistic nature, but really as the list grows and the time matters more the efficiency gets better.

@Markk: Actually skip lists don't address quite the same use case. Finger trees provide O(1) access to both ends. You still have a logarithmic access time to the far end of the skip list.

I recommend reading http://www.dougalstanton.net/blog/index.php/2008/12/12/a-brief-look-at-… as well. The OP's blog posts are inaccurate, as other commenters have pointed out.

Advertisment

Donate

ScienceBlogs is where scientists communicate directly with the public. We are part of Science 2.0, a science education nonprofit operating under Section 501(c)(3) of the Internal Revenue Code. Please make a tax-deductible donation if you value independent science communication, collaboration, participation, and open access.

You can also shop using Amazon Smile and though you pay nothing more we get a tiny something.

Science 2.0

Science Codex

More by this author

Moving on

August 2, 2010

Finally, at long last, I can tell you what I've been up to with finding a new home for this blog. I've created a new, community-based science blogging site, called Scientopia. With the help of many wonderful people, we're ready. We launched this morning. So to continue following GM/BM - along with…

Goodbye, Scienceblogs

July 7, 2010

So my decision is made. I'm closing up around here. I'm in the process of working out exactly where I'm going to go. With any luck, Seed will leave this blog here long enough for me to post an update with the new location. But I'm through with Seed and ScienceBlogs.

Seed, Conflicts of Interest, and Sleaze

July 6, 2010

As my friend Pal wrote about, Seed Media Group, the corporate overlords of the ScienceBlogs network that this blog belongs to, have apparently decided that blog space in these parts is now up for sale to advertisers. We've been advertiser supported since I joined up with SB. I've never minded…

Searching for Topics

June 28, 2010

As regular readers have no doubt noticed by now, posting on the blog has been slow lately. I've been trying to come back up to speed, but so far, that's been mainly in the form of bad math posts. I'd like to get back to the good stuff. Unfortunately, the chaos theory stuff that I was…

Saturday Recipe: Ginger Scallion Sauce

June 26, 2010

Today's recipe is something I made this week for the first time, and trying it was like a revelation. It's simple to make, it's got an absolutely spectacularly wonderful flavor - light and fresh - and it's incredibly versatile. It's damned near perfect. It's scallion ginger sauce, and once you try…