From the post:
Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky.
– Donald Knuth
Why is binary search so damn hard to get right? Why is it that 90% of programmers are unable to code up a binary search on the spot, even though it’s easily the most intuitive of the standard algorithms?
- Firstly, binary search has a lot of potential for off-by-one errors. Do you do inclusive bounds or exclusive bounds? What’s your break condition: lo=hi+1, lo=hi, or lo=hi-1? Is the midpoint (lo+hi)/2 or (lo+hi)/2 - 1 or (lo+hi)/2 + 1? And what about the comparison, < or ≤? Certain combinations of these work, but it’s easy to pick one that doesn’t.
- Secondly, there are actually two variants of binary search: a lower-bound search and an upper-bound search. Bugs are often caused by a careless programmer accidentally applying a lower-bound search when an upper-bound search was required, or vice versa.
- Finally, binary search is very easy to underestimate and very hard to debug. You’ll get it working on one case, but when you increase the array size by 1 it’ll stop working; you’ll then fix it for this case, but now it won’t work in the original case!
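To make those choices concrete, here is a minimal sketch of the two variants. It assumes a sorted Python list and half-open bounds `[lo, hi)`, which is just one of the workable combinations and not necessarily the one this post settles on. The names `lower_bound` and `upper_bound` are illustrative. Notice that the two functions differ only in the comparison, `<` versus `<=`:

```python
from typing import Sequence


def lower_bound(a: Sequence[int], target: int) -> int:
    """Return the first index i with a[i] >= target, or len(a) if there is none."""
    lo, hi = 0, len(a)              # half-open search range [lo, hi)
    while lo < hi:                  # the range is empty once lo == hi
        mid = lo + (hi - lo) // 2   # always satisfies lo <= mid < hi
        if a[mid] < target:         # a[mid] is too small, so discard it
            lo = mid + 1
        else:                       # a[mid] could be the answer, so keep it
            hi = mid
    return lo


def upper_bound(a: Sequence[int], target: int) -> int:
    """Return the first index i with a[i] > target, or len(a) if there is none."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if a[mid] <= target:        # the only change: <= instead of <
            lo = mid + 1
        else:
            hi = mid
    return lo
```

On a list with duplicates, such as `[1, 2, 2, 2, 3, 5]` with target `2`, the first function returns the index of the first `2` and the second returns the index just past the last `2`. Swap one for the other by accident and you get exactly the kind of bug described above. Python's standard library offers the same pair as `bisect.bisect_left` and `bisect.bisect_right`.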
I want to generalise and nail down the binary search, with the goal of introducing a shift in the way you perceive it. By the end of this post you should be able to code any variant of binary search without hesitation and with complete confidence. But first, back to the start: here is the binary search you were probably taught…
Before you can return a result to a user or pass it to some other process, you have to find it. This post may help you do just that, reliably.