• Comparing binary search against sequential search. Source: Johnson 2015.
• Logarithmic function grows a lot slower than linear function. Source: Popov 2015.
• Illustration of binary search tree. Source: Johnson 2015.

Binary Search

jeetsingh
1631 DevCoins

arvindpdmn
33 DevCoins
2 authors have contributed to this article
Last updated by arvindpdmn
on 2020-01-06 09:33:15
Created by arvindpdmn
on 2017-03-17 07:17:04
Improve this article. Show messages

Summary

Binary search is an algorithm for efficiently searching for an item within a sorted list of items. The search is performed in steps, with each step reducing the search space by half (with linear or sequential search the search space reduces by only one item). At each step, the midpoint is selected and compared against the search value. If the midpoint is lesser than the search value, only the upper half is retained for the next step.

When formulated in a more abstract way, binary search can be used to solve optimization problems with the use of binary predicates. Whenever the predicate is evaluated, the algorithm will retain only half the search space where the predicate is true. In this case, binary search can be seen as "searching" for an optimum solution rather than searching for a specific item.

Milestones

1946

John Mauchly mentions binary search in what is probably the first published reference.

1960

D.H.Lehmer publishes a method that can handle any $$N$$, not just $$N=2^{n}-1$$.

1962

H.Bottenbruch changes the algorithm so that comparisons are reduced at the expense of an extra iteration.

1971

A.K.Chandra of Stanford University suggests the uniform binary search.

Discussion

• How can we derive the complexity of binary search?

Given a list of $$n$$ items, linear search is of complexity $$O(n)$$. Binary search has a complexity of only $$O(log_{2}n)$$. For comparison, linear search on a list of 1000 items may take on average 500 steps and 1000 steps in the worst case. The same with binary search will only take about $$log_{2}1000\approx10$$ steps.

A video tutorial by GeeksQuiz explains how to derive this complexity. Briefly, since each step reduces the search space by half, and given that $$k$$ steps are required, this implies $$n=2^k$$, which translates to $$k=log_{2}n$$. Binary search is said to take the divide-and-conquer approach.

Computational complexity described above is in terms of number of steps to complete the search. An optimized implementation would use only one comparison per step. Algorithms also take up memory to manage operations. The space complexity of binary search is $$O(1)$$, which means that it is constant and independent of $$n$$.

• What are the prerequisites for applying binary search?

Binary search can be applied only under the following conditions:

• List is sorted: Since half the search space is discarded based on a midpoint value comparison, binary search works only on a sorted list.
• Random access is possible: Binary search maintains the boundaries of the search space in terms of left and right indices. Midpoint is calculated based on these indices. To access the value of the current midpoint item, the item must be accessible by its index. This works nicely with data structures that support indexing such as arrays (C, Java) and lists (Python). Using binary search on a data structures such as a linked list is wasteful since we need to traverse the list item by item to get to the midpoint.
• Size: Size of the search space must be known.
• If my array is unsorted, what search should I use?

There are two approaches: sort the array first and then apply binary search; or simply apply linear search. The question is which of these two is faster. Let's use a fast sorting algorithm such as mergesort. This has a complexity of $$O(n\,log_{2}n)$$. Linear search has a complexity of $$O(n)$$. Since $$O(n\,log_{2}n) > O(n)$$, to use linear search is better.

This also implies that if your application does frequent updates to the list, avoid binary search since keeping the list in sorted order can reduce performance.

• Can we do a 3-way or even n-ary search for faster results?

Every comparison is between two alternatives, hence binary in nature. To decide among three alternatives we would need two comparisons anyway. A binary search is therefore superior since with two comparisons it can handle $$2^{2}=4$$ alternatives. The same argument works for n-ary searches. An alternative explanation is available from Geeks For Geeks.

When the distribution of data is known and the lists are large, then interpolation search can do better than binary search. Hash tables are faster because they maintain a mapping of key-to-value. However, binary search is more versatile since they can be used to solve optimization problems as well.

• Can you give an example of binary search for solving optimization problems?

TopCoder gives an example called FairWorkload. There are $$n$$ filing cabinets each with a variable number of folders and $$k$$ available workers to process the folders. Each worker must be assigned one or more contiguous filing cabinets to process. Under the constraint of worker availability, how many workers do we need so that workload is most evenly distributed while minimizing the number of folders each worker processes? This can be solved using binary search. The search space is the number of folders per worker.

The predicate can be stated as, "Can the workload be spread so that each worker has to examine no more than x folders, with the limited number of workers available?" With each step, we can calculate the number of workers required for a certain maximum number of files per worker. If this number is less than or equal to $$k$$, it implies that we can move the upper limit to the current midpoint.

Another example is to find the minimum of a mathematical function. Another example is to find the square root of a real number.

• How is binary search different from binary search tree?

Binary search is an algorithm. Binary search tree is a data structure. Other than this fundamental difference, both share the same principle in terms of searching for an item. Search complexity is logarithmic in time. With binary search tree, each node in the tree has a left and a right branch for items of lesser and greater value respectively. This means that when the tree is searched, half the search space is discarded either to the left or right. However, a tree that's not balanced can result in worse performance. Even a balanced binary search tree is not necessarily optimal for searching compared to binary search.

Unlike binary search, binary search trees can't be used to solve optimization problems. Binary search trees also take up more space compared to sorted arrays. Binary search trees have the advantage that insertion and deletion of items are also logarithmic in time, unlike sorted arrays where such operations are of complexity $$O(n)$$.

• Should we implement binary search iteratively or recursively?

For algorithms in general, there's no general rule that one is better than the other. Sandeep Gupta shows an example where recursive implementation is easier to read.

With respect to binary search, for large arrays, recursive code can lead to stack overflows. Likewise, recursive code can be slower due to the extra function calls. However, if the code can be refactored to tail recursive, speed and stack problems can be overcome with tail-call optimization.

Which one to adopt might ultimately be a matter of personal choice for developers. Adopt one that's easier for you to understand. There's nothing worse that a recursive code that doesn't have a proper terminating condition and hence goes into infinite recursion.

• What are the variations of binary search?

Variations include the following:

• Uniform binary search
• Boundary search
• Fibonacci search
• Exponential search
• Interpolation search
• Fractional cascading
• What are the usual challenges for implementing binary search?

Typical binary search maintains the left and right indices of the search space and calculates the midpoint. For the next step, these indices and midpoints are adjusted. If one is not careful, the search item may be on the boundary and may get missed out due to "off-by-one" error. Don't overconstrain the bounds. Relax them while adhering to the predicate. Also, when calculating the midpoint, take care that overflow and truncation errors are avoided or handled correctly.

Always test for corner cases. Will the code work for odd and even number of list items? Will it work for 1-item or 2-item lists? How will it behave if there are duplicate entries of the search term? Will it exit cleanly if the search term is not found? Does it work correctly when searching for real numbers? With real numbers, match of the search term should be based on either the number of steps or the relative difference between steps. Avoid using absolute difference since precision is limited when numbers are large.

If a recursive implementation is used, be aware how many recursive calls will be possible due to stack space limits.

• Can you give examples of overflow and truncation errors?

When calculating the midpoint on a large array, indices may be large values. Thus, mid = (lo + hi)/2 may result in an overflow. Change this to mid = lo + (hi-lo)/2. On a 2-element array, with lo=0 and hi=1, midpoint will remain at 0 due to truncation. If the match is at index 1, this will result in an infinite loop. The correct code in such a case would be mid = lo + (hi-lo+1)/2.

• What support do we have from languages to use or implement binary search?

While it's possible to do one's own implementation, many languages provide libraries that simplify implementation:

• C++ Standard Template Library: lower_bound, upper_bound, binary_search, equal_range
• Java: Arrays.binary_search
• .NET Framework: Array.BinarySearch
• Python: bisect

Sample Code

• //=====================================================================
// Recursive implementation
//=====================================================================
// Source: http://quiz.geeksforgeeks.org/binary-search/
// Accessed: 2017-03-22

// A recursive binary search function. It returns location of x in
// given array arr[l..r] is present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
if (r >= l)
{
int mid = l + (r - l)/2;

// If the element is present at the middle itself
if (arr[mid] == x)  return mid;

// If element is smaller than mid, then it can only be present
// in left subarray
if (arr[mid] > x) return binarySearch(arr, l, mid-1, x);

// Else the element can only be present in right subarray
return binarySearch(arr, mid+1, r, x);
}

// We reach here when element is not present in array
return -1;
}

//=====================================================================
// Iterative implementation
//=====================================================================
// Source: http://quiz.geeksforgeeks.org/binary-search/
// Accessed: 2017-03-22

// A iterative binary search function. It returns location of x in
// given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
while (l <= r)
{
int m = l + (r-l)/2;

// Check if x is present at mid
if (arr[m] == x)
return m;

// If x greater, ignore left half
if (arr[m] < x)
l = m + 1;

// If x is smaller, ignore right half
else
r = m - 1;
}

// if we reach here, then element was not present
return -1;
}

Milestones

1946

John Mauchly mentions binary search in what is probably the first published reference.

1960

D.H.Lehmer publishes a method that can handle any $$N$$, not just $$N=2^{n}-1$$.

1962

H.Bottenbruch changes the algorithm so that comparisons are reduced at the expense of an extra iteration.

1971

A.K.Chandra of Stanford University suggests the uniform binary search.

See Also

• Uniform Binary Search
• Interpolation Search
• Divide-and-Conquer Algorithms
• Big O Notation
• Algorithmic Complexity
• Recursion

Author
No. of Edits
No. of Chats
DevCoins
7
1
1631
4
0
33
1563
Words
1
Chats
11
Edits
9
Likes
5108
Hits

Cite As

Devopedia. 2020. "Binary Search." Version 11, January 6. Accessed 2020-04-06. https://devopedia.org/binary-search
• Site Map