Lists: N-Sized Chunks

In this tutorial, you shall work with lists and learn an efficient way to divide arbitrarily sized lists into chunks of a given size.
Sep 2018  · 5 min read

Lists are built-in data structures in Python that store heterogeneous items and give efficient access to them. Dividing a list into N-sized chunks is a common task when there is a limit on the number of items your program can handle in a single request.

Lists

Lists are data structures that can hold mixed values, or items. Examples of items are integers, floats, and strings. Lists are mutable, which means you can change the contents of a list without changing its identity. They are written with square brackets [ ], and the items within them are separated by commas (,). Let's create a list of numbers that we can work with...

# Creating a list of 95 numbers
list_numbers = list(range(1, 96))
print(list_numbers)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95]

The range() function generates a sequence of numbers. In Python 3 it returns a lazy range object rather than a list, which is why the code above wraps it in list(). It takes the form range([start], stop[, step]), where 'start' and 'step' are optional parameters. start is the first number in the sequence, stop is the upper bound (the sequence goes up to, but does not include, this number), and step is the difference between consecutive numbers. When start is not specified, the sequence begins at 0, not 1; when step is not specified, it defaults to 1.
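A few calls make the three parameters concrete. This is a small illustrative sketch; the variable names and numbers are chosen for the example and are not part of the tutorial's data.

```python
# range() returns a lazy range object; list() turns it into a list.
only_stop = list(range(5))         # start defaults to 0, step defaults to 1
start_stop = list(range(1, 6))     # stop (6) is not included
with_step = list(range(0, 20, 5))  # step is the gap between numbers

print(only_stop)   # [0, 1, 2, 3, 4]
print(start_stop)  # [1, 2, 3, 4, 5]
print(with_step)   # [0, 5, 10, 15]
```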

Now, let's say you need to break the list down into smaller lists of 10 elements each.

One way to do this is by defining a generator. A generator is an elegant way to define an iterator. What is an iterator, you ask? Put simply, an iterator is an object that knows how to compute and return the next item in the sequence you are iterating through. You can read more about iterators and generators in DataCamp's Python Iterator Tutorial.
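To see what "return the next item" means in practice, here is a minimal sketch of a generator being stepped through by hand. The function count_up_to is a hypothetical example, not part of the tutorial's solution.

```python
# Hypothetical generator, used only to illustrate how iteration works.
def count_up_to(limit):
    current = 1
    while current <= limit:
        yield current  # hand back one value, then pause here
        current += 1

counter = count_up_to(3)   # calling the function returns an iterator
first = next(counter)      # runs until the first yield
second = next(counter)     # resumes and runs until the next yield
remaining = list(counter)  # consumes whatever is left

print(first, second, remaining)  # 1 2 [3]
```

Each call to next() resumes the function where it left off, so values are computed only when they are asked for.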

We define a generator function that does the work for us.

# Yields successive 'n' sized chunks from list 'list_name'
def create_chunks(list_name, n):
    for i in range(0, len(list_name), n):
        yield list_name[i:i + n]

# Call the 'create_chunks' function to divide the list further into sub-lists of 10 items each
print(list(create_chunks(list_numbers, 10)))

[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40], [41, 42, 43, 44, 45, 46, 47, 48, 49, 50], [51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70], [71, 72, 73, 74, 75, 76, 77, 78, 79, 80], [81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95]]
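Because create_chunks is a generator, you do not have to build the full list of chunks up front. You can also loop over it and handle one chunk at a time, which fits the request-limit scenario mentioned at the start. In the sketch below, send_batch is a hypothetical stand-in for whatever your program does with each chunk; create_chunks and list_numbers are repeated so the snippet runs on its own.

```python
def create_chunks(list_name, n):
    for i in range(0, len(list_name), n):
        yield list_name[i:i + n]

list_numbers = list(range(1, 96))

def send_batch(batch):
    # Hypothetical placeholder for a real request; it only reports the batch size.
    return len(batch)

batch_sizes = []
for chunk in create_chunks(list_numbers, 10):
    # Only one chunk is produced and handled per loop iteration.
    batch_sizes.append(send_batch(chunk))

print(batch_sizes)  # [10, 10, 10, 10, 10, 10, 10, 10, 10, 5]
```

The last chunk holds the 5 leftover items, since 95 is not a multiple of 10.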

Another way to split the list into chunks is to use a list comprehension. You can read more about it in DataCamp's Python List Comprehension Tutorial.

print([list_numbers[i:i + 10] for i in range(0, len(list_numbers), 10)])

[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40], [41, 42, 43, 44, 45, 46, 47, 48, 49, 50], [51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70], [71, 72, 73, 74, 75, 76, 77, 78, 79, 80], [81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95]]

As the problem gets more complicated, list comprehensions can become harder to understand and debug. A clean function, such as the generator shown earlier, can then be more useful and easier to keep track of.
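One thing a function makes easy to add is input checking. The sketch below is a hypothetical variant of the generator, named create_chunks_checked here for illustration, that rejects a chunk size that is not a positive integer. Without such a check, a negative size would silently produce no chunks at all.

```python
def create_chunks_checked(list_name, n):
    """Hypothetical variant of create_chunks that validates the chunk size."""
    if not isinstance(n, int) or n <= 0:
        raise ValueError("n must be a positive integer")
    for i in range(0, len(list_name), n):
        yield list_name[i:i + n]

print(list(create_chunks_checked([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

Note that because this is a generator function, the check runs when iteration starts, not at the moment the function is called.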

In this tutorial, you have learned two ways to solve a rather frequent problem when dealing with lists. To learn more, check out DataCamp's Data Types for Data Science course.
