Skip to main content

Binary Search in Python: A Complete Guide for Efficient Searching

Learn how to implement binary search in Python using iterative and recursive approaches, and explore the built-in bisect module for efficient, pre-implemented binary search functions.
Aug 23, 2024  · 12 min read

Imagine you're playing a guessing game where you have to find a specific number between 1 and 100. You could try guessing randomly, but in the worst-case scenario, you may need 100 guesses to get to the correct answer.

A shorter method might be to pick the number in the middle, 50, and ask whether the target number is higher or lower. If your target number is higher, you can ignore everything below 50, and repeat this task with the numbers 51-100. You can continue until you find the correct number.

By halving the possibilities each time, you zero in on the target quickly. With this method, even in the worst-case scenario, it would only take at most 7 guesses to find the correct number. This strategy is the essence of binary search.

In this guide, we'll dive into what binary search is, its practical applications, and how to implement it in Python using both iterative and recursive methods. For a full-featured course, explore our Data Structures and Algorithms in Python course which explores binary search in detail, alongside other common and important searching algorithms like linear search, depth first search, and breadth first search.

What is Binary Search?

When searching for a value in a dataset, your goal is to find its index, or location, so you can easily retrieve and use that value in your code. Several search algorithms can help you locate the index of a specific value. One of the most efficient and fundamental methods is binary search.  

Build Machine Learning Skills

Elevate your machine learning skills to production level.

Start Learning for Free

To speak generally, an algorithm is a precise sequence of instructions that a computer follows to accomplish a specific task or solve a problem. Check out our blog post, What is an Algorithm, to learn about the many different kinds of algorithms in machine learning.

Concept overview

Binary search is a powerful algorithm designed to efficiently find a value in a sorted dataset. The core idea behind binary search is straightforward: Instead of checking each element in the dataset one by one, like in a linear search, binary search narrows down the search range by half with each step, making the process much faster.

Here’s how it works:

  1. Begin by comparing the target value with the middle element of the dataset. The index for the middle value is calculated using the formula: middle = (low + high) / 2, where low is the index of the first element in the current search range, and high is the index of the last element.
  2. Compare the middle value with the target. If the target value is equal to the middle element, you've found the index, and the search is complete. If the target value is smaller than the middle element, the search continues in the left half of the dataset. If the target value is larger, the search continues in the right half of the dataset.
  3. Repeat steps 1-2. The search range is continually halved at each step. Repeat the process until the target value is found or the search range becomes empty.

Example of using binary search on an array with 9 values.

The binary search process. Image by Author

Above is a simplified example demonstrating the concept behind binary search.

This halving process is what makes binary search so efficient. However, it’s important to note that the dataset must be sorted for binary search to work correctly. If the data isn’t sorted, the algorithm will not function as intended. 

Check out Data Structures: A Comprehensive Guide With Python Examples to learn more about different types of data structures you may want to search.

Key points to remember

The following points highlight the core principles of binary search.

Efficiency

Binary search is significantly faster than linear search, particularly when dealing with large datasets. While linear search has a time complexity of O(n), meaning it may have to check every single element in the worst case, binary search is more efficient. It has a time complexity of O(log n), meaning the search space is halved with each step, significantly reducing the number of comparisons needed.

For a detailed breakdown of how algorithms are evaluated, refer to our Big O Notation and Time Complexity Guide: Intuition and Math. An option is our Analyzing Complexity of Code through Python tutorial.

Requirements

For binary search to work, the dataset must be sorted in either ascending or descending order. This is a requirement because the algorithm relies on the order of elements to determine which half of the dataset to search next. If the data isn’t sorted, binary search cannot accurately locate the target value.

Flexibility

Binary search can be implemented using either an iterative or recursive approach. The iterative method uses loops to repeatedly halve the search range. The recursive method involves the function calling itself with a smaller search range. This flexibility lends itself well to different applications.

Real-World Applications

Binary search is a powerful tool. Its efficiency in narrowing down search ranges quickly makes it invaluable, particularly when dealing with large datasets where performance and speed are critical. Let's explore some specific applications and see how binary search stacks up against other algorithms.

Databases

In databases, binary search is often used to quickly locate records within sorted fields, such as finding a specific user in a database sorted by user ID. Imagine a database containing millions of records. A linear search would require scanning each record one by one, which could take considerable time. Instead, binary search can swiftly pinpoint the desired record by systematically halving the search space, significantly reducing the number of comparisons needed.

Data science

Searching through large, sorted datasets is a common task in data science. For instance, in time series analysis, binary search can be used to find specific timestamps within a sorted sequence of events. In machine learning, binary search algorithms can be used to optimize hyperparameters by finding the best value within a range.

Computer graphics

In computer graphics, binary search is used in algorithms where precision and speed are needed. One example is in ray tracing, a rendering technique used to simulate how light interacts with objects. Binary search can be used to quickly find intersection points between rays and surfaces.

Building blocks for complex algorithms

Binary search is not just useful on its own; it also serves as a building block for more complex algorithms and data structures. For example, search trees, such as binary search trees (BSTs), and balanced trees, like AVL trees, are based on the principles of binary search. These structures allow for efficient searching, insertion, and deletion operations, making them useful when data needs to be dynamically updated and searched frequently. Learn more about these trees in AVL Tree: Complete Guide With Python Implementation.

Binary Search Vs. Other Search Algorithms

Let’s explore how binary search compares to two other common search algorithms: linear search and hash lookup.

Linear search

Linear search is a straightforward algorithm that examines each element in a dataset sequentially. It is significantly less efficient than binary search, with a time complexity of O(n). However, linear search doesn’t require the data to be sorted, making it useful in some scenarios.

Hash lookup

Hash lookup is an efficient algorithm used for quickly finding values in a dataset when those values are associated with unique keys. It works by applying a hash function to the key, which computes an index where the corresponding value is stored in a hash table. This allows for near-instantaneous retrieval of values, often in O(1) time. However, while hash lookup is very fast for finding specific elements, it requires additional memory to store the hash table and is not suited for searching within a range of values, a task where binary search is more appropriate.

Implementing Binary Search in Python

Let’s explore a few ways of implementing a simple binary search in Python. But first, we need to establish a simple dataset to work with and a target value to search for. Imagine we have the following sorted array and search target:

arr = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
target = 56

Iterative method

The iterative method is perhaps the most straightforward approach. Using this method, we use a while loop to repeatedly divide the search interval in half until we find our target. This method is often favored for its clarity and efficiency.

Here’s how you can implement binary search iteratively:

def binary_search_iterative(arr, target):
    # Define the search bounds
    left, right = 0, len(arr) - 1  
    while left <= right:
        # Calculate the middle index
        mid = left + (right - left) // 2  
        # If the middle element is the target, return its index
        if arr[mid] == target:
            return mid  
        # If the target is bigger, narrow the search to the right half
        elif arr[mid] < target: 
            left = mid + 1  
        # If the target is smaller, narrow the search to the left half
        else: 
            right = mid - 1  
    # Return -1 if the target is not found
    return -1 
    
# Run the iterative function
result = binary_search_iterative(arr, target)
if result != -1:
    print(f"Iterative: Target found at index {result}")
else:
    print("Iterative: Target not found")

Let’s take a closer look at this code:

  • We start by setting ‘left’ and ‘right’ as the bounds of the search space. Initially, left is 0 (the beginning of the array), and ‘right’ is len(arr) - 1 (the end of the array).

  • In each iteration, we calculate the mid index, which represents the middle of the current search interval. This is done using the formula mid = left + (right - left) /2.

  • We then compare the element at mid with target:

    • If they match, we’ve found our target, and the function returns mid.

    • If the element at mid is less than the target, it means the target must be in the right half, so we adjust left to mid + 1.

    • If the element at mid is greater than the target, the target must be in the left half, so we adjust right to mid - 1.

  • The loop continues until the target is found or until left exceeds right, which means the target isn’t in the array.

Recursive method

The recursive method for binary search is another way to implement this algorithm. Instead of using a loop, the function calls itself, adjusting the search bounds each time until it finds the target or determines that the target isn’t present.

Here’s how you can implement binary search recursively:

def binary_search_recursive(arr, target, left, right):
    # If the search bounds cross, the target isn't in the array
    if left > right:  
        return -1
    # Calculate the middle index
    mid = left + (right - left) // 2 
    # If middle value equals the target, return the index
    if arr[mid] == target: 
        return mid  
    # If the target is bigger than the middle value, search in the right half
    elif arr[mid] < target: 
        return binary_search_recursive(arr, target, mid + 1, right)  
    # If the target is smaller than the middle value, search in the left half
    else: 
        return binary_search_recursive(arr, target, left, mid - 1)  

# Run the recursive function
result = binary_search_recursive(arr, target, 0, len(arr) - 1)

if result != -1:
    print(f"Iterative: Target found at index {result}")
else:
    print("Iterative: Target not found")

Let’s take a closer look at this code:

  • The recursive function starts with the same initial left and right bounds as the iterative function.

  • It first checks if left exceeds right. If so, the function returns -1, indicating that the target isn’t in the array.

  • Otherwise, the function calculates the mid index and compares the target with the element at mid.

    • If the target equals the element at mid, the function returns mid.

    • If the target is greater than mid, the function recursively calls itself with updated bounds to search the right half.

    • If the target is less than mid, the function searches the left half.

  • The recursion continues until it finds the target or exhausts the search space.

For more on recursive functions, check out Understanding Recursive Functions in Python.

Using Python’s built-in bisect module

Python’s standard library includes the bisect module, which provides pre-implemented binary search functions. This module is highly efficient and can often save us time over building our own function.

Here’s how we can use the bisect module to find the target in our array:

# Import the bisect module
import bisect 

# Call the module and provide the array and the target value
index = bisect.bisect_left(arr, target) 

#Print results
if index < len(arr) and arr[index] == target:
    print(f"Bisect: Target found at index {index}")
else:
    print("Bisect: Target not found")

Let’s take a closer look at this code:

  • The bisect_left function returns the index where the target should be inserted to maintain the order of the array. If the target is found at this index, it means the target exists in the array.

  • This method is particularly useful when working with sorted arrays and can also be used to insert elements while maintaining the sorted order.

  • The bisect module also provides other functions like bisect_right and insort, which can be used for finding insertion points or inserting elements directly.

Become an ML Scientist

Upskill in Python to become a machine learning scientist.

Time and Space Complexity

When discussing the efficiency of an algorithm, we often use Big O notation, represented as O(x), to describe how the runtime or space requirements grow as the size of the input increases. For binary search, the time complexity is typically O(log n). This means that as the dataset grows, the number of operations needed to find the target increases logarithmically, making binary search efficient even for large datasets.

The iterative method of binary search has a time complexity of O(log n) because the search interval is halved with each iteration. Its space complexity is O(1) since it uses a constant amount of extra space, only requiring a few variables to track the search bounds and the middle element.

The recursive method also has a time complexity of O(log n) for the same reason. However, its space complexity is O(log n) due to the space needed to maintain the call stack for each recursive call. As each recursive call adds a layer to the stack, the depth of recursion is proportional to the number of halving steps, which is logarithmic in relation to the size of the dataset.

While both methods are efficient in terms of time, the iterative approach is more space-efficient. This is why the bisect module uses an iterative approach under the hood. Personally, I prefer the iterative approach for this reason as well. However, the recursive approach can be more intuitive for some people and may require fewer lines of code.

Common Pitfalls and How to Avoid Them

When using a binary search, there are a few common pitfalls to watch out for. These can impact the correctness and efficiency of your algorithm, so it’s important to take note of them.

Firstly, binary search relies on the dataset being sorted. If the data is not sorted, binary search will not function correctly; it might return incorrect results or fail altogether. Therefore, it’s essential to sort our dataset before applying binary search. If sorting the data is not feasible, a binary search is not the appropriate tool.

A frequent issue is off-by-one errors. These errors occur when the index calculations are slightly incorrect, potentially leading to infinite loops or failing to locate the target value. This can happen if the midpoint calculation or the boundary adjustments are not handled precisely. To avoid this, ensure that the calculations for the middle index are correct and that the bounds are updated properly after each comparison. Remember, index values in Python start at 0, not 1.

Another challenge to be aware of if you use binary search recursively is recursion depth issues. In Python, there is a limit to the number of recursive calls that can be made, to prevent excessive memory use. If your dataset is very large, deep recursion might exceed this limit, causing a stack overflow. To mitigate this, consider using an iterative approach for binary search, which does not suffer from recursion depth limitations. If recursion is preferred or necessary, you may be able to increase the recursion limit using sys.setrecursionlimit(), but this should be done with caution to avoid other potential issues.

Conclusion

Binary search is a powerful and efficient algorithm that should be in every Python developer’s toolkit. Whether implemented iteratively or recursively, it offers a significant performance advantage over linear search, especially with large datasets. For more, check out DataCamp’s Python Programming skill track or Software Engineering Principles in Python interactive course.


Photo of Amberle McKee
Author
Amberle McKee
LinkedIn

I am a PhD with 13 years of experience working with data in a biological research environment. I create software in several programming languages including Python, MATLAB, and R. I am passionate about sharing my love of learning with the world.

Binary Search FAQs

What are the common real-world scenarios where binary search might be inefficient?

When data isn't sorted or frequently updated, sorting can slow things down, making binary search less efficient.

Can binary search be used on data structures other than arrays, such as linked lists?

No, because accessing the middle of a linked list takes linear time, negating binary search's speed.

How does binary search handle duplicate values in a dataset?

Binary search finds one occurrence. To find all duplicates, additional searches for the bounds are needed.

What is the advantage of using the bisect module in Python over writing a custom binary search function?

The bisect module is optimized, reliable, and has added features like insertion while keeping the dataset sorted.

How can binary search be applied in non-numeric datasets, such as searching through sorted text or strings?

Binary search works for sorted text by comparing items based on lexicographical order.

 
Topics

Learn Python with DataCamp

course

Generalized Linear Models in Python

5 hr
9.5K
Extend your regression toolbox with the logistic and Poisson models and learn to train, understand, and validate them, as well as to make predictions.
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

tutorial

How to Sort a Dictionary by Value in Python

Learn efficient methods to sort a dictionary by values in Python. Discover ascending and descending order sorting and bonus tips for key sorting.
Neetika Khandelwal's photo

Neetika Khandelwal

5 min

tutorial

Python Tutorial for Beginners

Get a step-by-step guide on how to install Python and use it for basic data science functions.
Matthew Przybyla's photo

Matthew Przybyla

12 min

tutorial

How to Write Memory-Efficient Classes in Python

Learn about memory management in Python with advanced techniques for coding memory-efficient classes. Explore practical exercises for optimal performance.
Arunn Thevapalan's photo

Arunn Thevapalan

30 min

tutorial

Understanding Recursive Functions in Python

In this tutorial, learn about the different aspects of recursive functions and implement a recursive function in Python from scratch.
Sayak Paul's photo

Sayak Paul

12 min

tutorial

AVL Tree: Complete Guide With Python Implementation

An AVL tree is a self-balancing binary search tree where the height difference between the left and right subtrees of any node is at most one, ensuring efficient operations.
François Aubry's photo

François Aubry

18 min

tutorial

Sorting in Python Tutorial

Discover what the sort method is and how to do it.
DataCamp Team's photo

DataCamp Team

1 min

See MoreSee More