Iterators are objects that can be iterated upon. They are a core feature of the Python programming language, working behind the scenes of for loops and list comprehensions. Any object from which an iterator can be obtained is known as an iterable.
There is a lot of work that goes into constructing an iterator. For instance, each iterator object must implement an __iter__() and a __next__() method. In addition, the implementation must track the object's internal state and raise a StopIteration exception once no more values can be returned. These rules are known as the iterator protocol.
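The protocol described above can be sketched with a small class. Countdown is a hypothetical name chosen only for illustration; this is a minimal sketch of the protocol, not the only way to structure an iterator.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start  # internal state: the next value to return

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that no more values are available
        value = self.current
        self.current -= 1
        return value


print(list(Countdown(3)))
"""
[3, 2, 1]
"""
```

Because Countdown follows the protocol, it works anywhere Python expects an iterator, including for loops and the list() constructor.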
Implementing your own iterator is a drawn-out process, and it is only sometimes necessary. A simpler alternative is to use a generator. Generators are a special type of function that uses the yield keyword to produce an iterator that can be iterated over one value at a time.
The ability to discern the appropriate scenarios to implement an iterator or use a generator will improve your skills as a Python programmer. In the remainder of this tutorial, we will emphasize the distinctions between the two objects, which will help you decide the best one to use for various situations.
Glossary
| Term | Definition |
| --- | --- |
| Iterable | A Python object that can be looped over, returning its members one at a time. Examples of iterables include lists, sets, tuples, dictionaries, and strings. |
| Iterator | An object that returns the values of an iterable one at a time and keeps track of its position until no values remain. |
| Generator | A special type of function that does not return a single value: it returns an iterator object that produces a sequence of values. |
| Lazy Evaluation | An evaluation strategy whereby values are only produced when required. Consequently, certain developer circles also refer to lazy evaluation as "call-by-need." |
| Iterator Protocol | The set of rules an object must follow to be an iterator in Python: implement __iter__() and __next__(), and raise StopIteration when exhausted. |
| next() | A built-in function that returns the next item from an iterator. |
| iter() | A built-in function that returns an iterator for an iterable. |
| yield | A Python keyword used inside generator functions. Like return, it hands a value back to the caller, but it suspends the function instead of ending it, so execution can resume later. |
Python Iterators & Iterables
Iterables are objects capable of returning their members one at a time; in other words, they can be iterated over. Popular built-in Python data structures such as lists, tuples, and sets qualify as iterables. Strings and dictionaries are also iterables: a string can be iterated character by character, and iterating over a dictionary yields its keys. As a rule of thumb, consider any object that can be iterated over in a for loop as an iterable.
Exploring Python iterables with examples
Given the definitions, we may conclude that all iterators are also iterable. However, not every iterable is an iterator. An iterable produces an iterator only when iter() is called on it, either explicitly or implicitly by a for loop.
To demonstrate this functionality, we will instantiate a list, which is an iterable, and produce an iterator by calling the iter() built-in function on the list.
list_instance = [1, 2, 3, 4]
print(iter(list_instance))
"""
<list_iterator object at 0x7fd946309e90>
"""
Although the list by itself is not an iterator, calling the iter() function on it returns a new iterator object over the list. The list itself is left unchanged.
To demonstrate that not all iterables are iterators, we will instantiate the same list object and attempt to call the next() function, which is used to return the next item in an iterator.
list_instance = [1, 2, 3, 4]
print(next(list_instance))
"""
--------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-0cb076ed2d65> in <module>()
3 print(iter(list_instance))
4
----> 5 print(next(list_instance))
TypeError: 'list' object is not an iterator
"""
In the code above, you can see that attempting to call the next() function on the list raised a TypeError. This happened because a list object is an iterable and not an iterator.
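The same distinction can be checked without triggering an error. The sketch below uses the Iterable and Iterator abstract base classes from the standard library's collections.abc module; the variable name list_iterator is introduced here only for illustration.

```python
from collections.abc import Iterable, Iterator

list_instance = [1, 2, 3, 4]
list_iterator = iter(list_instance)

print(isinstance(list_instance, Iterable))  # the list is an iterable
print(isinstance(list_instance, Iterator))  # but it is not an iterator
print(isinstance(list_iterator, Iterable))  # the iterator is iterable too
print(isinstance(list_iterator, Iterator))
print(iter(list_iterator) is list_iterator)  # iter() on an iterator returns itself
"""
True
False
True
True
True
"""
```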
Exploring Python iterators with examples
Thus, if the goal is to iterate on a list, then an iterator object must first be produced. Only then can we manage the iteration through the values of the list.
# instantiate a list object
list_instance = [1, 2, 3, 4]
# convert the list to an iterator
iterator = iter(list_instance)
# return items one at a time
print(next(iterator))
print(next(iterator))
print(next(iterator))
print(next(iterator))
"""
1
2
3
4
"""
Python automatically produces an iterator object whenever you attempt to loop through an iterable object.
# instantiate a list object
list_instance = [1, 2, 3, 4]
# loop through the list
for item in list_instance:
    print(item)
"""
1
2
3
4
"""
Behind the scenes, the loop calls next() on this iterator repeatedly. When the StopIteration exception is raised, the loop catches it and ends.
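As a rough illustration of what the for loop does on our behalf, the loop above can be written out by hand with a while loop. This is a simplified sketch of the behavior, not the interpreter's actual implementation.

```python
list_instance = [1, 2, 3, 4]

# Step 1: obtain an iterator from the iterable
iterator = iter(list_instance)

# Step 2: request items until the iterator signals that it is exhausted
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break  # this is the point at which a for loop would end
    print(item)
"""
1
2
3
4
"""
```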
The values obtained from an iterator can only be retrieved in one direction, from first to last. Python does not have a previous() function to enable developers to move backward through an iterator.
The lazy nature of iterators
It is possible to define multiple iterators based on the same iterable object. Each iterator will maintain its own state of progress. Thus, by defining multiple iterator instances of an iterable object, it is possible to iterate to the end of one instance while the other instance remains at the beginning.
list_instance = [1, 2, 3, 4]
iterator_a = iter(list_instance)
iterator_b = iter(list_instance)
print(f"A: {next(iterator_a)}")
print(f"A: {next(iterator_a)}")
print(f"A: {next(iterator_a)}")
print(f"A: {next(iterator_a)}")
print(f"B: {next(iterator_b)}")
"""
A: 1
A: 2
A: 3
A: 4
B: 1
"""
Notice that iterator_b returns the first element of the list, even though iterator_a has already reached the end.
Thus, we can say iterators have a lazy nature: when an iterator is created, the elements are not produced until they are requested. In other words, the elements of our list instance are only returned once we explicitly ask for them by calling next() on the iterator.
However, all of the values from an iterator may be extracted at once by passing the iterator object to a built-in container constructor (e.g., list(), set(), or tuple()), which forces the iterator to generate all its elements at once.
# instantiate iterable
list_instance = [1, 2, 3, 4]
# produce an iterator from an iterable
iterator = iter(list_instance)
print(list(iterator))
"""
[1, 2, 3, 4]
"""
It's not recommended to perform this action when the iterator produces a large number of elements, since all of them must be computed and held in memory at once.
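One further consequence is worth keeping in mind: extracting every value consumes the iterator. The short sketch below (first_pass and second_pass are illustrative names) shows that a second attempt finds nothing left.

```python
list_instance = [1, 2, 3, 4]
iterator = iter(list_instance)

first_pass = list(iterator)   # consumes every element
second_pass = list(iterator)  # the iterator is already exhausted

print(first_pass)
print(second_pass)
"""
[1, 2, 3, 4]
[]
"""
```

To iterate again, call iter() on the original list to obtain a fresh iterator.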
Whenever a large data file swamps your machine's memory, or you have a function that requires its internal state to be maintained upon each call but creating an iterator does not make sense given the circumstances, a better alternative is to use a generator object.
Python Generators
The most expedient alternative to implementing an iterator is to use a generator. Although generators may look like ordinary Python functions, they are different. For starters, a generator function does not return all of its items at once. Instead, it uses the yield keyword to generate items on the fly. Thus, we can say a generator is a special kind of function that leverages lazy evaluation.
Generators do not store their contents in memory as a typical iterable such as a list does. For example, if the goal were to find all of the factors of a positive integer, we would typically implement a traditional function as follows:
def factors(n):
    factor_list = []
    for val in range(1, n + 1):
        if n % val == 0:
            factor_list.append(val)
    return factor_list

print(factors(20))
"""
[1, 2, 4, 5, 10, 20]
"""
The code above returns the entire list of factors. However, notice the difference when a generator is used instead of a traditional Python function:
def factors(n):
    for val in range(1, n + 1):
        if n % val == 0:
            yield val

print(factors(20))
"""
<generator object factors at 0x7fd938271350>
"""
Since we used the yield keyword instead of return, calling the function does not run its body to completion. Instead, the call returns a generator object, which keeps track of its own state between values.
Consequently, it is possible to call the next() function on the lazy iterator to show the elements of the series one at a time.
def factors(n):
    for val in range(1, n + 1):
        if n % val == 0:
            yield val

factors_of_20 = factors(20)
print(next(factors_of_20))
"""
1
"""
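Because a generator object is itself an iterator, it can also be consumed by a for loop rather than by repeated calls to next(). A minimal sketch, reusing the factors generator from above:

```python
def factors(n):
    for val in range(1, n + 1):
        if n % val == 0:
            yield val

# the for loop calls next() for us and stops at StopIteration
for factor in factors(20):
    print(factor)
"""
1
2
4
5
10
20
"""
```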
Another way to create a generator is with a generator expression. Generator expressions adopt a syntax similar to that of list comprehensions, except that they use parentheses instead of square brackets.
n = 20
print((val for val in range(1, n + 1) if n % val == 0))
"""
<generator object <genexpr> at 0x7fd940c31e50>
"""
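To illustrate the memory difference, the sketch below compares a list comprehension with the equivalent generator expression using sys.getsizeof() from the standard library. The names squares_list and squares_gen are illustrative, and the exact byte counts vary between Python versions, so only the comparison is printed. Note that sys.getsizeof() measures the container itself, not the objects it refers to.

```python
import sys

# the list stores one million references; the generator stores only its state
squares_list = [x * x for x in range(1_000_000)]
squares_gen = (x * x for x in range(1_000_000))

print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

# the generator still produces the same values when consumed
print(sum(squares_gen) == sum(squares_list))
"""
True
True
"""
```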
Exploring Python's yield Keyword
The yield keyword controls the flow of a generator function. Instead of exiting the function, as return does, yield hands a value back to the caller and pauses the function while remembering the state of its local variables.
Calling a generator function returns a generator object, which can be assigned to a variable and advanced with the next() function. The first call to next() executes the function up to the first yield it encounters. Once a yield is hit, the execution of the function is suspended and its state is saved, so we can resume the function at will. On the following call to next(), the function continues from just after that yield. For example:
def yield_multiple_statements():
    yield "This is the first statement"
    yield "This is the second statement"
    yield "This is the third statement"
    yield "This is the last statement. Don't call next again!"

example = yield_multiple_statements()
print(next(example))
print(next(example))
print(next(example))
print(next(example))
print(next(example))
"""
This is the first statement
This is the second statement
This is the third statement
This is the last statement. Don't call next again!
--------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-25-4aaf9c871f91> in <module>()
11 print(next(example))
12 print(next(example))
---> 13 print(next(example))
StopIteration:
"""
In the code above, our generator has four yield statements, but we call next() on it five times, which raises a StopIteration exception. This happens because our generator is not an infinite series: the fifth call finds no more values to yield, so the generator is exhausted.
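The suspend-and-resume behavior can be made visible by adding print() calls between the yield statements. In this sketch, traced_generator is a hypothetical function written only to trace the order of execution; the last call passes a default value to next(), which is returned in place of raising StopIteration.

```python
def traced_generator():
    # hypothetical generator used only to trace the order of execution
    print("started")
    yield 1
    print("resumed after first yield")
    yield 2
    print("resumed after second yield")


gen = traced_generator()  # nothing runs yet: the body has not started
print("generator created")
print(next(gen))
print(next(gen))
print(next(gen, "exhausted"))  # default returned instead of StopIteration
"""
generator created
started
1
resumed after first yield
2
resumed after second yield
exhausted
"""
```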
Wrap-Up
To recap, iterators are objects that can be iterated on, and generators are special functions that leverage lazy evaluation. Implementing your own iterator means you must create an __iter__() and a __next__() method, whereas a generator can be created with the yield keyword in a Python function or with a generator expression.
You may prefer to use a custom iterator over a generator when you require an object with complex state-maintaining behavior or if you wish to expose other methods beyond __next__(), __iter__(), and __init__(). On the other hand, a generator may be preferable when dealing with large sets of data since they do not store their contents in memory or when it is not necessary to implement an iterator.
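As one illustration of exposing an extra method, the sketch below wraps any iterable in a custom iterator that adds a peek() method. PeekableIterator and peek() are hypothetical names invented for this example, not part of the standard library.

```python
class PeekableIterator:
    """Hypothetical wrapper that adds a peek() method to any iterable."""

    _MISSING = object()  # sentinel meaning "nothing buffered"

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._buffer = self._MISSING

    def __iter__(self):
        return self

    def __next__(self):
        # return the buffered value first if peek() fetched one earlier
        if self._buffer is not self._MISSING:
            value, self._buffer = self._buffer, self._MISSING
            return value
        return next(self._iterator)  # propagates StopIteration when exhausted

    def peek(self):
        """Return the next value without consuming it."""
        if self._buffer is self._MISSING:
            self._buffer = next(self._iterator)
        return self._buffer


numbers = PeekableIterator([1, 2, 3])
print(numbers.peek())
print(next(numbers))
print(list(numbers))
"""
1
1
[2, 3]
"""
```

A plain generator function cannot offer a method like peek() directly, which is the kind of situation where a class-based iterator may be the better fit.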