
Python yield Keyword: What Is It and How to Use It?

The yield keyword in Python turns a regular function into a generator, which produces a sequence of values on demand instead of computing them all at once.
Jul 10, 2024

Python functions don't always have a return statement. Generator functions are functions that use the yield keyword to produce their values instead of handing back a single result with return.

These functions produce generator iterators, which are objects that represent a stream of data. The elements represented by an iterator are created and yielded only when required. This type of evaluation is often referred to as lazy evaluation.

When dealing with large datasets, generators offer a memory-efficient alternative to storing data in lists, tuples, and other data structures that require space in memory for each of their elements. Generator functions can also create infinite iterators, which are not possible with eagerly evaluated structures like lists and tuples.
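To make the memory argument more concrete, here is a minimal sketch that compares a list with a generator producing the same values. The names squares, squares_list, and squares_gen are illustrative and not part of the original tutorial. Note that sys.getsizeof() only reports the size of the container object itself, so the numbers are a rough indication rather than a precise measurement.

```python
import sys


def squares(n):
    # Generator function: yields one square at a time.
    for i in range(n):
        yield i * i


n = 1_000_000

squares_list = [i * i for i in range(n)]  # eager: all values stored at once
squares_gen = squares(n)                  # lazy: nothing computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print(list_size)  # several megabytes on 64-bit CPython
print(gen_size)   # a small, fixed size that doesn't depend on n

# Both produce the same values when consumed.
print(sum(squares_gen) == sum(squares_list))  # True
```

The generator's size stays the same however large n becomes, because it only keeps track of where it is in the loop.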

Before we begin, let’s recap the differences between functions and generators:

| Feature | Function | Generator |
|---|---|---|
| Value Production | Returns all values at once | Yields values one at a time, on demand |
| Execution | Executes completely before returning | Pauses after yielding, resumes when next value is requested |
| Keyword | return | yield |
| Memory Usage | Potentially high, stores entire sequence in memory | Low, stores only current value and state for next |
| Iteration | Multiple iterations possible, but requires storing the entire sequence | Designed for single-pass iteration, more efficient for large or infinite sequences |

Using Python's yield to Create Generator Functions

The term generator in Python can refer to a generator iterator or a generator function. These are different but related objects in Python. In this tutorial, the full terms are used often to avoid confusion.

Let's explore generator functions first. A generator function looks similar to a regular function but contains the yield keyword instead of return.

When a Python program calls a generator function, it creates a generator iterator. The iterator yields a value on demand and pauses its execution until another value is required. Let's look at an example to explain this concept and demonstrate the difference between regular functions and generator functions.

Using a regular function

First, let's define a regular function, which contains a return statement. This function accepts a sequence of words and a letter, and it returns a list containing the number of occurrences of the letter in each word:

def find_letter_occurrences(words, letter):
    output = []
    for word in words:
        output.append(word.count(letter))
    return output

print(
    find_letter_occurrences(["apple", "banana", "cherry"], "a")
)

[1, 3, 0]

The function outputs a list containing 1, 3, and 0 since there's one a in apple, three occurrences of a in banana, and none in cherry. The same function can be refactored to use a list comprehension instead of initializing an empty list and using .append():

def find_letter_occurrences(words, letter):
    return [word.count(letter) for word in words]

This regular function returns a list containing all the results whenever it's called. However, if the list of words is large, calling this regular function increases memory usage since the program creates and stores a new list of the same length as the original one. If this function is used repeatedly on several input arguments, or similar functions are performing other operations on the original data, the pressure on memory can increase rapidly.

Using a generator function

A generator function can be used instead:

def find_letter_occurrences(words, letter):
    for word in words:
        yield word.count(letter)

words = ["apple", "banana", "cherry"]
letter = "a"
output = find_letter_occurrences(words, letter)
print(output)

<generator object find_letter_occurrences at 0x102935e00>

The function includes the yield keyword instead of return. This generator function returns a generator object when called, which is assigned to output. This object is an iterator. It doesn't contain the data representing the number of occurrences of the letter in each word. Instead, the generator will create and yield the values when needed. Let's fetch the first value from this generator iterator:

print(next(output))

1

The built-in function next() is one way of getting the next value from an iterator. We'll look at other ways later in this tutorial.

The code in the generator function is executed until the program reaches the line with the yield keyword. In this example, the for loop starts its first iteration and fetches the first element in the list words. The .count() string method returns an integer, which is 1 in this case, as there's one occurrence of a in apple. The generator yields this value, which is returned by next(output).

The generator output pauses its execution at this point. It is partway through the first iteration of the for loop, having found the number of occurrences of the letter a in the first word in the list of words. It's now waiting until it's needed again.

If the built-in next() is called again with output as its argument, the generator will resume the execution from the point where it's paused:

print(next(output))

3

The generator continues from the line with yield in the for loop's first iteration. As the for loop doesn't have further lines of code, it returns to the top of the loop and fetches the second element from the list words. The value returned by .count() is 3 in this case, and this value is yielded. The generator pauses again at this point of execution.

The third call to next() resumes this execution:

print(next(output))

0

The second iteration reaches the end of the for loop, which moves to the third iteration. The code progresses to the line with yield again, this time yielding the integer 0 since there aren't any occurrences of a in cherry.

The generator pauses again. The program only determines the generator's fate when we call next() a fourth time:

print(next(output))

Traceback (most recent call last):
  ...
StopIteration

The execution resumes from the end of the for loop's third iteration. However, the loop has reached the end of its iteration since there are no more elements in the list words. The generator raises a StopIteration exception.

In most use cases, the generator elements are not accessed directly using next() but through another iteration process. The StopIteration exception signals the end of the iteration process. We'll explore this further in the following section of this tutorial.
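As a rough sketch of what such an iteration process does behind the scenes, the loop below calls next() repeatedly and stops once StopIteration is raised. This is a simplified model and not Python's actual implementation, and the list name collected is introduced here only for illustration.

```python
def find_letter_occurrences(words, letter):
    for word in words:
        yield word.count(letter)


output = find_letter_occurrences(["apple", "banana", "cherry"], "a")

collected = []
while True:
    try:
        value = next(output)
    except StopIteration:
        # The generator is exhausted, so the loop ends quietly.
        break
    collected.append(value)

print(collected)  # [1, 3, 0]
```

Because the exception is caught by the loop machinery, code that iterates over a generator normally never sees StopIteration directly.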

Python has another way of creating generator iterators when their operation can be represented by a single expression, as in the previous example. The generator iterator output can be created using a generator expression:

words = ["apple", "banana", "cherry"]
letter = "a"
output = (word.count(letter) for word in words)
print(next(output))
print(next(output))
print(next(output))
print(next(output))

1
3
0
Traceback (most recent call last):
  ...
StopIteration

The expression in parentheses assigned to output is a generator expression, which creates a similar generator iterator to the one produced by the generator function find_letter_occurrences().
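Generator expressions are often passed straight to a function that consumes an iterable. The short sketch below, which uses the illustrative names total and total_from_list, adds up the counts with sum() and compares the result with the list comprehension version.

```python
words = ["apple", "banana", "cherry"]
letter = "a"

# No intermediate list is built; sum() pulls one count at a time.
# The extra parentheses can be dropped when the generator expression
# is the only argument in the call.
total = sum(word.count(letter) for word in words)
print(total)  # 4

# The list comprehension builds the full list before sum() runs.
total_from_list = sum([word.count(letter) for word in words])
print(total_from_list)  # 4
```

Both calls give the same answer, but only the second one creates a temporary list.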

Let's conclude this section with another example of a generator function to highlight how execution pauses and resumes each time an element is needed:

def show_status():
    print("Start")
    yield
    print("Middle")
    yield
    print("End")
    yield

status = show_status()
next(status)

Start

This generator function doesn't have a loop. Instead, it contains three lines that have the yield keyword. The code creates a generator iterator status when it calls the generator function show_status(). The first time the program calls next(status), the generator starts execution. It prints the string "Start" and pauses after the first yield expression. The generator yields None since there's no object following the yield keyword.

The program prints the string "Middle" only when next() is called a second time:

next(status)

Middle

The generator pauses after the second yield expression. The third call to next() prints the final string, "End":

next(status)

End

The generator pauses on the final yield expression. It will raise a StopIteration exception the next time the program requests a value from this generator iterator:

next(status)

Traceback (most recent call last):
  ...
StopIteration
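One way to observe these pauses is the standard library's inspect.getgeneratorstate(), which reports whether a generator has started, is suspended at a yield, or has finished. The sketch below reuses show_status(); the list name states is an assumption made for this illustration, and next() is given a default value so the final call doesn't raise StopIteration.

```python
import inspect


def show_status():
    print("Start")
    yield
    print("Middle")
    yield
    print("End")
    yield


status = show_status()
states = [inspect.getgeneratorstate(status)]  # before the first next()

next(status)
states.append(inspect.getgeneratorstate(status))  # paused at the first yield

next(status)
next(status)
states.append(inspect.getgeneratorstate(status))  # paused at the final yield

# The default value stops next() from raising StopIteration.
next(status, None)
states.append(inspect.getgeneratorstate(status))  # the function body has finished

print(states)
# ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```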

We'll explore more ways of using generators in the following section.

Working With Generator Iterators

Generator functions create generator iterators, and iterators are iterable. Let's unpack this phrase. Every time the program calls a generator function, it creates an iterator. Since iterators are iterable, they can be used in for loops and other iterative processes.

Therefore, the next() built-in function isn't the only way to access elements in an iterator. This section explores other ways of working with generators.

Using Python's iteration protocol with generator iterators

Let's revisit a generator function from an earlier section in this tutorial:

def find_letter_occurrences(words, letter):
    for word in words:
        yield word.count(letter)

words = ["apple", "banana", "cherry"]
letter = "a"
output = find_letter_occurrences(words, letter)
for value in output:
    print(value)

1
3
0

Instead of using next() several times, this version of the code uses the generator iterator output in a for loop. Since iterators are iterable, they can be used in for loops. The loop fetches items from the generator iterator until there aren't any values left.

Unlike data structures such as lists and tuples, an iterator can only be used once. The code doesn't print out the values again if we try to run the same for loop a second time:

def find_letter_occurrences(words, letter):
    for word in words:
        yield word.count(letter)

words = ["apple", "banana", "cherry"]
letter = "a"
output = find_letter_occurrences(words, letter)
print("First attempt:")
for value in output:
    print(value)
print("Second attempt:")
for value in output:
    print(value)

First attempt:
1
3
0
Second attempt:

The iterator is exhausted by the first for loop, so it can no longer yield values. If the generator is needed again after it is exhausted, we must create another generator iterator from the generator function.
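The following sketch shows both points: an exhausted iterator produces nothing more, and calling the generator function again gives a fresh iterator. The variable names first_pass, second_pass, and fresh_output are illustrative only.

```python
def find_letter_occurrences(words, letter):
    for word in words:
        yield word.count(letter)


words = ["apple", "banana", "cherry"]

output = find_letter_occurrences(words, "a")
first_pass = list(output)   # consumes every value
second_pass = list(output)  # nothing left to yield

print(first_pass)   # [1, 3, 0]
print(second_pass)  # []

# Calling the generator function again creates a fresh iterator.
fresh_output = find_letter_occurrences(words, "a")
fresh_values = list(fresh_output)
print(fresh_values)  # [1, 3, 0]
```

If the values are needed several times and the data is small enough, storing them in a list, as first_pass does here, is a reasonable alternative.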

It's also possible to have several generator iterators existing at the same time in a program:

 
def find_letter_occurrences(words, letter):
    for word in words:
        yield word.count(letter)

words = ["apple", "banana", "cherry"]
letter = "a"
first_output = find_letter_occurrences(words, letter)
second_output = find_letter_occurrences(words, letter)
print("First value of first_output:")
print(next(first_output))
print("Values of second_output:")
for value in second_output:
    print(value)
print("Remaining values of first_output:")
for value in first_output:
    print(value)

First value of first_output:
1
Values of second_output:
1
3
0
Remaining values of first_output:
3
0

The generator function find_letter_occurrences() creates two generator iterators: first_output and second_output. Although both iterators refer to the same data in the list words, they progress independently of each other.

This example fetches the first value from first_output using next(). The generator iterator yields 1 and pauses at this point. The program loops through second_output next. Since this generator hasn't yielded any values yet, the loop goes through all the values yielded by the second iterator. Finally, there's another for loop iterating through first_output. However, this iterator has already yielded its first value earlier in the program. The loop goes through the remaining values in this iterator.

The for loop isn't the only process that can be used for iterating through generator iterators:

print(*find_letter_occurrences(words, letter))
print(sorted(find_letter_occurrences(words, letter)))

1 3 0
[0, 1, 3]

In these examples, the program calls the generator function directly to create and use the generator iterator instead of assigning it to a variable. In the first example, the iterator is unpacked using the star notation. This process relies on the same iteration protocol as the for loop.

In the second example, the generator iterator is passed to the built-in sorted(), which requires an iterable argument. Generator iterators are iterable, and therefore, they can be used wherever Python expects an iterable.

Creating infinite iterators

A generator yields a value and pauses until the next value is needed. Each time the code requests a value from an iterator, the code within the generator function will execute until the next yield expression is evaluated. In all the examples in this tutorial so far, the generator yielded a finite number of values before it was exhausted. However, it's possible to create a generator that yields an infinite number of values by using a while loop in the generator function. In the following example, the generator yields a random color from the list of colors passed to the generator function:

import random

def get_color(colors):
    while True:
        yield random.choice(colors)

output_colors = get_color(["red", "green", "blue"])
print("First two colors:")
print(next(output_colors))
print(next(output_colors))
print("Next 10 colors using a 'for' loop:")
for _ in range(10):
    print(next(output_colors))

First two colors:
green
red
Next 10 colors using a 'for' loop:
blue
green
green
green
red
red
red
blue
green
red

The generator function get_color() has a yield expression within a while loop. Therefore, the code will always encounter another yield expression when looking for the next value. The generator iterator output_colors yields an infinite number of colors chosen at random from the input list. This generator will never be exhausted.

It's not possible to create infinite data structures such as lists and tuples. Generators enable a program to create infinite iterables. Note that if the generator iterator is used directly within a for loop, the loop will run forever.
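To iterate over an infinite generator safely, the loop needs its own stopping rule. The sketch below shows two common options: itertools.islice() from the standard library, and an explicit break. The names colors, first_five, and count are assumptions for this illustration, and the printed colors will differ between runs because they are chosen at random.

```python
import itertools
import random


def get_color(colors):
    while True:
        yield random.choice(colors)


colors = ["red", "green", "blue"]

# islice() stops after five items, so the iteration is guaranteed to end.
first_five = list(itertools.islice(get_color(colors), 5))
print(first_five)

# Alternatively, break out of the loop once a condition is met.
for count, color in enumerate(get_color(colors), start=1):
    if color == "red":
        break
print(f"Found red after {count} draw(s)")
```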

Advanced Generator Concepts

Generators have more advanced use cases in Python. This section will explore some of these.

Sending an object into the generator

Generators can also accept additional data that can be used while evaluating code. The statement containing the yield keyword is an expression that evaluates to a value. This value can be assigned to a variable within the generator function. Let's start with a basic example to demonstrate this concept:

def generator_function():
    value = yield 1
    print(f"The yield expression evaluates to: {value}")
    value = yield 2
    print(f"The yield expression evaluates to: {value}")

output = generator_function()
print(next(output))
print(next(output))
print(next(output))

1
The yield expression evaluates to: None
2
The yield expression evaluates to: None
Traceback (most recent call last):
  ...
StopIteration

The Python yield keyword creates an expression that evaluates to a value. However, this expression's value within the generator function is not the same object yielded by the generator. Consider the first yield expression. The generator yields the integer 1 and pauses its execution. Therefore, the first print(next(output)) call displays 1.

However, the yield expression in the generator evaluates to an object, which the code assigns to the variable name value. In this example, the yield expression evaluates to None because next() doesn't send an object into the generator, so None is assigned to value. This process is repeated for the second occurrence of yield in the generator function. The purpose of the third next() call is to ensure all the code in the generator function is executed.

Let's replace the second and third calls to next() with .send(), which is a method in the generator class:

def generator_function():
    value = yield 1
    print(f"The yield expression evaluates to: {value}")
    value = yield 2
    print(f"The yield expression evaluates to: {value}")

output = generator_function()
print(next(output))
print(output.send("Here's a value"))
print(output.send("Here's another value"))

1
The yield expression evaluates to: Here's a value
2
The yield expression evaluates to: Here's another value
Traceback (most recent call last):
  ...
StopIteration

The generator function is unchanged. The generator is started by calling next(), and the code runs until it yields the first integer, 1. Instead of using next() the second time, the program calls output.send(). This method sends an object into the generator. In this example, the object is a string. The yield expression in the generator function evaluates to this string, which is assigned to value. Therefore, the generator can use the string within its code.

The second call to .send() sends a new object to the generator, which is assigned to the same variable value. The generator raises a StopIteration after the final print() call since there are no more yield expressions.

Let's look at another example using .send(). The following generator displays the balance in an account, but the balance can be updated:

def get_balance(start_balance):
    balance = start_balance
    while True:
        amount = yield balance
        if amount is not None:
            balance += amount

current_balance = get_balance(100)
print(next(current_balance))
print(current_balance.send(10))
print(current_balance.send(-20))
print(next(current_balance))

100
110
90
90

The generator function requires a starting balance when called. The value of balance can change as the generator executes the code. Any object sent into the generator using .send() is assigned to amount. This variable will either be None if the generator yields a value without any object sent to it, or it will contain the object sent using .send().

The generator iterator current_balance starts with a balance of $100. The generator is started by calling next(), which starts executing the code until the first value is yielded.

Once the generator has started, it's possible to resume the execution using .send() instead of next(). The generator adds the value sent to the balance. If no value is sent, such as by calling next() again, the generator yields the unchanged balance.
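The generator has to be started with next() because a value can only be sent to a generator that is paused at a yield expression. The sketch below, which uses the illustrative names account, opening_balance, and updated_balance, shows the TypeError that Python raises when a value other than None is sent to a generator that hasn't started yet.

```python
def get_balance(start_balance):
    balance = start_balance
    while True:
        amount = yield balance
        if amount is not None:
            balance += amount


account = get_balance(100)

try:
    # No yield expression is waiting to receive a value yet.
    account.send(10)
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")

# The rejected send() call didn't start the generator, so it can still be started.
opening_balance = next(account)  # same effect as account.send(None)
updated_balance = account.send(10)

print(opening_balance)  # 100
print(updated_balance)  # 110
```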

Yielding directly from another iterable

Python generators can also yield values directly from another generator or iterable using the yield from syntax. Let's look at an example of a generator function which yields values from a nested list:

def flatten(nested_list):
    for item in nested_list:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

nested_list = [1, [2, 3], [4, [5, 6]], 7]
print(list(flatten(nested_list)))

[1, 2, 3, 4, 5, 6, 7]

The generator function accepts a list, which can include nested lists within it. The for loop iterates through the items in the list. Each item in the outer list is either a value, in this case, an integer, or another list. When the item isn't a list, the generator yields the item.

However, when the item is a list, the generator recursively calls the generator function flatten() again with the inner list as an argument. This creates another generator iterator, which uses the inner list as its data source. If this line used a plain yield expression, the first generator would yield the second generator iterator itself as a single item. Instead, by using yield from, the first generator yields each of the values from the second generator.
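The difference between yield and yield from can be seen side by side in the following sketch. The function names with_plain_yield(), with_yield_from(), and chain_two() are hypothetical and exist only for this comparison.

```python
def with_plain_yield():
    # Yields the inner generator object itself as a single item.
    yield (number for number in [1, 2, 3])


def with_yield_from():
    # Yields each value produced by the inner generator in turn.
    yield from (number for number in [1, 2, 3])


plain_items = list(with_plain_yield())
delegated_items = list(with_yield_from())

print(len(plain_items))  # 1, and that single item is a generator object
print(delegated_items)   # [1, 2, 3]


# yield from accepts any iterable, not only generators.
def chain_two(first, second):
    yield from first
    yield from second


chained = list(chain_two("ab", [1, 2]))
print(chained)  # ['a', 'b', 1, 2]
```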

Summary: yield vs. return

Function definitions with return and yield look similar, but their behavior is different. Let's summarise the key differences:

 

| | Regular Function | Generator Function |
|---|---|---|
| Keyword | return (implicit if not explicitly used) | yield |
| Called | Executes code until return is reached, then returns the final value | Creates a generator iterator |
| Termination | Terminated by the return statement | Paused by yield, can be resumed later |
| Return Value | Single object (can be a data structure) | Generator iterator |
| Yield Expression | Not applicable (creates a statement) | Evaluates to None or the value sent using .send() |
| Use Cases | Ideal for returning a final result | Ideal for creating a stream of data, especially large or infinite sequences |
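One detail the comparison doesn't show is that a generator function may also contain a return statement. In that case, return ends the generator, and any value it returns is attached to the StopIteration exception instead of being yielded. The sketch below uses a hypothetical function, first_n_counts(), to illustrate this behavior.

```python
def first_n_counts(words, letter, limit):
    for index, word in enumerate(words):
        if index == limit:
            return "limit reached"  # ends the generator early
        yield word.count(letter)


counts = first_n_counts(["apple", "banana", "cherry"], "a", 2)
values = [next(counts), next(counts)]

try:
    next(counts)
except StopIteration as stop:
    # The returned object travels on the exception's value attribute.
    return_value = stop.value

print(values)        # [1, 3]
print(return_value)  # limit reached
```

A for loop discards this value, so it only matters when the generator is driven manually with next() or delegated to with yield from.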

Conclusion

Python's yield keyword is used in functions to define a generator function. When called, these functions create generator iterators. Generators are an example of lazy evaluation in Python, where each value is computed only when it is required rather than ahead of time. Therefore, the yield expression is useful to create a stream of data where the values are generated on demand without the need to store them all in memory.

Efficiency considerations are important when dealing with large datasets that require many operations. Python's generator iterators are one of the principal tools required to manipulate large amounts of data efficiently.

If you want to learn more about Python, check out this Python Developer career track.


Author
Stephen Gruppetta

I studied Physics and Mathematics at UG level at the University of Malta. Then, I moved to London and got my PhD in Physics from Imperial College. I worked on novel optical techniques to image the human retina. Now, I focus on writing about Python, communicating about Python, and teaching Python.
