Random Heap Updates Are Cheap
A while ago I encountered an algorithmic challenge at work. The idea is that we have a bag of numbers, and we’d like to be able to update individual numbers, insert and remove numbers, and occasionally pop the smallest one. All of these are simple, typical heap operations. But in our use case, we’re going to be updating numbers much more frequently than popping the smallest. Recall from your data structures class that removing from a heap costs O(log n), and updating is just a removal followed by an insertion, so if we make n updates followed by one pop min, we’re going to pay O(n log n).
Consider an alternative approach, where we put all the numbers in an unordered array. To update, we just overwrite the old number, and to pop min we scan the whole array. Then, if we make n updates followed by one pop min, the total cost is just O(n). The problem is that the worst case could grow to O(n²) when we pop min a lot more than expected in production. Hence the question: is there a way to do roughly O(1) work per update, but still keep an O(log n) worst case for pop min?
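For concreteness, the unordered-array baseline might look something like this. This is a minimal sketch; the class and method names are mine, and the handle bookkeeping a real version would need after pop min is glossed over.

```python
# Minimal sketch of the unordered-array baseline: O(1) update, O(n) pop min.
class UnorderedBag:
    def __init__(self):
        self.items = []

    def insert(self, value):
        self.items.append(value)
        return len(self.items) - 1   # the index serves as a handle

    def update(self, index, value):
        self.items[index] = value    # O(1): just overwrite in place

    def pop_min(self):
        i = min(range(len(self.items)), key=self.items.__getitem__)  # O(n) scan
        return self.items.pop(i)     # note: this shifts later indices/handles
```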
In general, getting both is impossible. No comparison-based heap can support O(1) updates, because an update is at least as hard as a pop min: updating the min element to infinity achieves the same effect as popping it. Perhaps we can relax the requirements to make progress. One way to do so is to assume that the updates are “random”.
It’s not entirely clear what the definition of random updates ought to be. To start, one reasonable definition would be that for each update, (A) an existing element is chosen uniformly at random, and (B) its updated rank is also independently chosen uniformly at random.
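To make that concrete, one way to model a single random update might be the following sketch, where bag stands for any container with a length and an update(index, value) method (both hypothetical names). If all values are drawn from a continuous uniform distribution, the new element’s rank is itself uniform, matching (B).

```python
import random

# One "random update" under assumptions (A) and (B).
def random_update(bag):
    i = random.randrange(len(bag))   # (A) pick an existing element uniformly
    v = random.random()              # (B) fresh uniform value, so the new
                                     #     rank is uniform and independent
    bag.update(i, v)
```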
When I got to this point, I dived in and devised some complicated data structures which achieved the desired behaviors. But I later figured out that in fact the existing data structures I knew already satisfy the above requirements. Let’s take a look.
Binary Heap
The simplest heap in existence is the binary heap, where a binary tree is embedded in an array. The element at index i has children at indices 2i+1 and 2i+2, and we maintain the heap property that a child must be no less than its parent. To update an element, we can just overwrite it in the array and then recursively swap elements until the heap property holds again. The cost of an update is just the number of swaps we need. In the worst case, we need O(log n) swaps, e.g. when the min element is updated to become the max.
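Here’s what that update-in-place might look like on a plain Python list that already satisfies the min-heap property. The function and its swap counter are my own sketch, not any particular library’s API.

```python
def update(heap, i, new_value):
    """Overwrite heap[i], then sift up or down to restore the heap property.
    Returns the number of swaps performed."""
    heap[i] = new_value
    swaps = 0
    # Sift up while the element is smaller than its parent.
    while i > 0 and heap[i] < heap[(i - 1) // 2]:
        parent = (i - 1) // 2
        heap[i], heap[parent] = heap[parent], heap[i]
        i, swaps = parent, swaps + 1
    # Sift down while the element is larger than its smallest child.
    n = len(heap)
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i, swaps = smallest, swaps + 1
    return swaps
```

A quick experiment, heapifying a random array and hammering it with random updates, should show the average swap count staying bounded by a small constant regardless of the heap size:

```python
import heapq, random

heap = [random.random() for _ in range(1 << 16)]
heapq.heapify(heap)
updates = 100_000
total = sum(update(heap, random.randrange(len(heap)), random.random())
            for _ in range(updates))
print(total / updates)   # a small constant (around 1 or less), independent of size
```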
What about the “average” case, given our assumptions of randomness? First, consider updates that increase an element: we have to swap it downward with its smaller child, recursively, and the worst case is that it gets swapped all the way down to the bottom. Even in that case, assuming the element is picked uniformly at random, the expected number of swaps is roughly:
0 * (1/2) + 1 * (1/4) + 2 * (1/8) + 3 * (1/16) + ...
= (1/4 + 1/8 + 1/16 + ...) + (1/8 + 1/16 + ...) + (1/16 + ...) + ...
= 1/2 + 1/4 + 1/8 + ...
= 1
This is because roughly half the elements are already at the bottom, so they never need to be swapped down; half of the rest are one level up and can be swapped down at most once; and so on.
Then, for updates that decrease an element, it takes some reasoning to see that it’s symmetric with the previous case. Say in heap H1 we decrease an element from rank R1 to rank R2. After that’s done we have H2, and if we were to change the rank back to R1, we would have to do the exact same swaps to move the element back to its original position (this might not be very obvious, but you can work out an example to convince yourself). Now, we claim that H1 and H2 are equally probable configurations, since the probability distribution from which we drew H1 should be invariant under random updates. Hence, the expected number of swaps needed to decrease a rank is the same as that to increase a rank, which is 1. (By the way, I feel like there ought to be a better argument. This one relies on H1 and H2 coming from the same probability distribution, which might not hold when other heap operations are carried out.)
All in all, a random in-place update in a binary heap costs O(1) in expectation. In other words, binary heaps support O(1) expected random updates and an O(log n) worst case for everything, which is exactly what we want.
Pairing Heap
That’s great, except that I only had access to a pairing heap implementation. The pairing heap is this cool data structure where we have a tree (with no limit on the number of children per node) that lazily rebalances itself on pop min.
Here’s an extremely simplified description. We start with a tree that has (only) the heap property. To “meld” (combine) two trees, we take the tree with the smaller root and stick the other tree under that root as an immediate child. Inserting an element is melding with a tree of size 1. To pop min, we first remove the root, and then we have to merge a whole bunch of trees, namely the immediate children of the old root. The naive way of melding them all in one go results in a bad time complexity, since we might have to go through all of them again for the next pop min. The trick is to first meld the trees in pairs, then meld those results together in reverse order. This cuts down the number of immediate children for the next round by at least half. Lastly, removing any given node is just: cut it out from its parent, pop min from the detached branch, then meld what remains back into the main tree.
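In code, the core of that could look something like the sketch below. The node layout and function names are mine, and I’m leaving out the parent bookkeeping you’d need to cut out and remove arbitrary nodes.

```python
# Bare-bones min pairing heap: just meld, insert, and the two-pass pop min.
class Node:
    def __init__(self, value):
        self.value = value
        self.children = []          # immediate children, in arbitrary order

def meld(a, b):
    """Combine two trees: the tree with the larger root goes under the other."""
    if a is None:
        return b
    if b is None:
        return a
    if b.value < a.value:
        a, b = b, a
    a.children.append(b)
    return a

def insert(root, value):
    return meld(root, Node(value))  # melding with a tree of size 1

def pop_min(root):
    """Remove the root, meld its children in pairs, then fold the paired
    results together back to front."""
    kids = root.children
    paired = [meld(kids[i], kids[i + 1] if i + 1 < len(kids) else None)
              for i in range(0, len(kids), 2)]
    new_root = None
    for tree in reversed(paired):
        new_root = meld(new_root, tree)
    return root.value, new_root
```

Used as a heap sort, this pops values out in order:

```python
root = None
for x in [5, 1, 4, 2, 3]:
    root = insert(root, x)
while root is not None:
    value, root = pop_min(root)
    print(value)                    # prints 1, 2, 3, 4, 5
```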
The exact time complexity of all pairing heap operations is still an open problem, but for our purposes, let’s just say insert takes O(1) and removing any node takes O(log n) amortized. The naive way to update an element is to remove the old value and then insert the new one. To remove a randomly picked element, the expected amount of work is proportional to its expected number of children, which is less than 1. Insert is also O(1), so in total a random update takes O(1) expected time. Note that this analysis only assumes (A).
Again, we get what we want: O(1) expected for random updates, and O(log n) amortized worst case for pop min and updates.
Others?
While I was figuring this out, I learned that there are quite a variety of these data structures out there. The Fibonacci heap used to be the poster child of being theoretically great but impractical, but these days we have the rank-pairing heap, which achieves the same asymptotic bounds and claims to be competitive in practice as well. As an aside, there are also a bunch of variants of the pairing heap. I’m not sure whether all these different heaps have properties similar to the ones discussed here, but at this point I don’t care enough to find out, since most of these heaps are probably never used in real life anyway.